Raku / old-issue-tracker

Tickets from RT
https://github.com/Raku/old-issue-tracker/issues

methods for accessing binary data in Buf objects #3561

Open p6rt opened 10 years ago

p6rt commented 10 years ago

Migrated from rt.perl.org#123015 (status was 'open')

Searchable as RT123015$

p6rt commented 10 years ago

From alex.hartmaier@gmail.com

As discussed on IRC, mainly with moritz, I need a way to get the number of bytes, not elements, of a Buf object so that it can be looped over and each byte accessed with $buf.[$idx]. I need this in DBDish::Oracle to pass UTF-16 encoded values and their byte length to the OCI C library using NativeCall.

p6rt commented 10 years ago

From @moritz

Another use case: cryptography, which typically works at the byte level, even for multi-byte encodings.

p6rt commented 7 years ago

From @skids

A lot of time has passed, and we have had the first part for quite some time:

$ perl6 -e '.bytes.say for buf8.new(1,2,3), buf16.new(1,2,3), buf32.new(1,2,3), buf64.new(1,2,3)'
3
6
12
24

...which is both specced and already tested. Also, as a container type, you get .of:

$ perl6 -e '.of.say for Buf[int8].new(1,2,3), Buf[uint16].new(1,2,3), Buf[int32], Buf[uint64]'
(int8)
(uint16)
(int32)
(uint64)

...and the native types can be nativesizeof'd:

$ perl6 -e 'use NativeCall; nativesizeof(.of).say for Buf[int8].new(1,2,3), Buf[uint16].new(1,2,3), Buf[int32], Buf[uint64]'
1
2
4
8
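Putting the pieces above together, here is a minimal sketch of how the byte length relates to the element count and the native element size, including the UTF-16 case from the original request. The helper name byte-length and the sample string are hypothetical, chosen only for illustration; in practice .bytes already gives the answer directly.

```raku
use NativeCall;

# Hypothetical helper (not from the ticket): derive the byte length from
# the element count and the size of the native element type.
sub byte-length(Blob:D $b) {
    $b.elems * nativesizeof($b.of)
}

for buf8.new(1, 2, 3), buf16.new(1, 2, 3), buf32.new(1, 2, 3), buf64.new(1, 2, 3) -> $b {
    # The built-in .bytes and the computed value should agree.
    say "{$b.^name}: {$b.bytes} bytes, computed {byte-length($b)}";
}

# UTF-16 use case: .elems counts 16-bit code units, .bytes counts octets.
# 'Raku' is an arbitrary sample value.
my $utf16 = 'Raku'.encode('utf-16');
say "{$utf16.elems} code units, {$utf16.bytes} bytes";
```

The value of .bytes is what a C function expecting a length in octets would need, while .elems is the number of 16-bit units.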

Also, there are ways to finagle the second need with NativeCall:

$ perl6 -e 'use NativeCall; my $b16 = buf16.new(1,2,3); my $b8 = nativecast(CArray[uint8], $b16); $b8[^6].say'
(1 0 2 0 3 0)

...though internally this relies on some GC manipulation, and exactly if and when it becomes fragile isn't documented.
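If the cast feels too fragile, one alternative is to copy the data into a buf8 with the byte order spelled out. This is only a sketch: the helper name to-bytes-le is hypothetical, it handles 16-bit elements only, and it trades an extra copy for not depending on the cast or on the host's byte order.

```raku
# Hypothetical helper (not from the ticket): copy a 16-bit buffer into a
# buf8, low byte first (little-endian), regardless of the host byte order.
sub to-bytes-le(Blob:D $b16) {
    buf8.new($b16.list.map({ slip($_ +& 0xFF, ($_ +> 8) +& 0xFF) }))
}

my $b8 = to-bytes-le(buf16.new(1, 2, 3));
say $b8.list;    # (1 0 2 0 3 0)
say $b8.elems;   # 6, one element per byte
```

The result matches the nativecast output shown above, which reflects a little-endian host. Each byte can then be reached with ordinary indexing such as $b8[$idx].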

Granted, a lot more could be done to make this, and endianness conversions, easier. For example, safe and optimized indexing adverbs like buf32.new(1,2)[^8]:swab8 might be nice, as would types that carry their endianness information around with them. But I think that falls into the 6.d-and-later design discussion category rather than RT.
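Until something like that exists, an endianness conversion can be written by hand. The following is a sketch under stated assumptions, not a proposal for the adverb: byteswap32 is a hypothetical name, and it simply reverses the byte order of every 32-bit element using shifts and masks.

```raku
# Hypothetical helper (not from the ticket): reverse the byte order of
# each 32-bit element and return the result as a new buf32.
sub byteswap32(Blob:D $in) {
    buf32.new($in.list.map(-> $v {
           (($v +& 0x000000FF) +< 24)
        +| (($v +& 0x0000FF00) +<  8)
        +| (($v +& 0x00FF0000) +>  8)
        +| (($v +& 0xFF000000) +> 24)
    }))
}

my $swapped = byteswap32(buf32.new(1, 2));
say $swapped.list;                # (16777216 33554432)
say byteswap32($swapped).list;    # (1 2), swapping twice restores the input
```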

So I'd vote to 'resolve' this ticket, and maybe if the next person who picks it up agrees, they should do so.

p6rt commented 7 years ago

The RT System itself - Status changed from 'new' to 'open'