Open aheirman opened 3 years ago
that one's my fault... i did this deliberately because i thought the vim xxd behavior was too C-centric and i wanted to be able to use it for other languages. but i should probably have just matched the vim xxd and made a start on hexdump (which is what most people seem to use to solve this particular "binary to hex array" problem, if the Android codebase is anything to go by).
i'll send a patch tomorrow if rob doesn't say he's already done it...
On Wed, Jun 23, 2021 at 5:11 AM aheirman @.***> wrote:
As far as I can tell the upstream for the default xxd is vim, with an example of the expected xxd -i behavior from there ( https://vim.fandom.com/wiki/Hex_dump#Generating_C_source_for_a_binary_file )
xxd -i in toybox just gives the bytes converted to hex with , in between without the required c. In the example provided toybox xxd -i would give
0x00, 0x01, 0x02, 0x03, 0x30, 0x31, 0x32, 0x33, 0x04, 0x05, 0x06, 0x07, 0x44, 0x45, 0x46, 0x47, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f
and not the expected:
unsigned char sample_bin[] = { 0x00, 0x01, 0x02, 0x03, 0x30, 0x31, 0x32, 0x33, 0x04, 0x05, 0x06, 0x07, 0x44, 0x45, 0x46, 0x47, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f }; unsigned int sample_bin_len = 32;
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/landley/toybox/issues/286, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMVLEWAZST3KFVFS53AMOBLTUHFPBANCNFSM47FVJRLA .
On 6/23/21 7:08 PM, enh-google wrote:
that one's my fault... i did this deliberately because i thought the vim xxd behavior was too C-centric and i wanted to be able to use it for other languages. but i should probably have just matched the vim xxd and made a start on hexdump (which is what most people seem to use to solve this particular "binary to hex array" problem, if the Android codebase is anything to go by).
i'll send a patch tomorrow if rob doesn't say he's already done it...
Have -C produce the prefix/suffix and -i by itself do the current behavior?
It will make Denys sad, but isn't less conformant than what we have now (and thus not a regression).
Rob
Have -C produce the prefix/suffix
Do you mean xxd -C
or xxd -C -i
because xxd -C -i
would be 'more' compliant.
Although I would prefer xxd -i
to behave as expected, since I've used it in build scripts and I'm not alone(https://github.com/caramelli/yagears, don't want to look for more examples).
I don't know if many people/projects use xxd -i
in the toybox way.
annoyingly, the default vim xxd output isn't ideal for C either --- the use of int rather than size_t causes warnings, and there's no easy way to mark both variables static or const or visibility hidden or...
looking at the callers in Android that seem to rely on the current behavior, it doesn't look like many (4?), and since it's clearly my fault i'm happy to own cleaning up the mess.
how does making xxd -i
behave like vim xxd, but adding a flag something
like --bare to not include any boilerplate? (since although we could add
flags for golang or whatever alongside one for C, like i said: there's no
One True Answer for C anyway, so i'd assume other languages have equivalent
problems too. and even if they don't have those problems, anecdotally i'd
say most callers don't like the way vim xxd chooses a name for the
variables, and sed their own in after the fact anyway![1])
if we do go this way, i'll probably try to get in a no-op --bare before i change -i, because that'll make the transition easier for me. only two of them seem to be used as part of the build anyway --- people seem to be writing scripts but checking in the [hand-modified] output :-(
-include
longopt.On Thu, Jun 24, 2021 at 7:17 AM aheirman @.***> wrote:
Have -C produce the prefix/suffix
Do you mean xxd -C or xxd -C -i because xxd -C -i would be 'more' compliant.
Although I would prefer xxd -i to behave as expected, since I've used it in build scripts and I'm not alone(https://github.com/caramelli/yagears, don't want to look for more examples).
I don't know if many people/projects use xxd -i in the toybox way.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/landley/toybox/issues/286#issuecomment-867674147, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMVLEWHGJA4QWFL6VICAYDTTUM47VANCNFSM47FVJRLA .
On 6/24/21 9:17 AM, aheirman wrote:
Have -C produce the prefix/suffix
Do you mean |xxd -C| or |xxd -C -i| because |xxd -C -i| would be 'more' compliant.
I'd meant -C would override -i if both were provided, but I'm not a user of this tool...
Although I would prefer |xxd -i| to behave as expected, since I've used it in build scripts and I'm not alone(https://github.com/caramelli/yagears https://github.com/caramelli/yagears, don't want to look for more examples).
I don't know if many people/projects use |xxd -i| in the toybox way.
That's a queston for Elliott, for builds internally within android.
Another option is to make -i behave conventionally and -I do the unwrapped version, but I dunno what (if any) android build scripts would need to be adjusted for this? Hmmm...
system/bt/gd/dumpsys/internal/test_data/mkfiles:xxd -i string.bfbs > string_bfbs.h system/bt/gd/dumpsys/internal/test_data/mkfiles:xxd -i integer.bfbs > integer_bfbs.h system/bt/gd/dumpsys/internal/test_data/mkfiles:xxd -i float.bfbs > float_bfbs.h system/bt/gd/dumpsys/internal/test_data/mkfiles:xxd -i struct.bfbs > struct_bfbs.h device/generic/vulkan-cereal/third-party/angle/src/libANGLE/renderer/metal/shaders/gen_mtl_internal_shaders.py: os.system('xxd -i compiled/default.ios_sim.metallib >> compiled/mtl_default_shaders.inc')
rs/gen_mtl_internal_shaders.py: os.system('xxd -i compiled/default.ios.metallib >> compiled/mtl_default_shaders.inc')
looks like grep -r is finding quite a few. (Denys Vlasenko's recent admonition comes to mind...)
Rob
On Thu, Jun 24, 2021 at 9:56 AM Rob Landley @.***> wrote:
On 6/24/21 9:17 AM, aheirman wrote:
Have -C produce the prefix/suffix
Do you mean |xxd -C| or |xxd -C -i| because |xxd -C -i| would be 'more' compliant.
I'd meant -C would override -i if both were provided, but I'm not a user of this tool...
Although I would prefer |xxd -i| to behave as expected, since I've used it in build scripts and I'm not alone(https://github.com/caramelli/yagears https://github.com/caramelli/yagears, don't want to look for more examples).
I don't know if many people/projects use |xxd -i| in the toybox way.
That's a queston for Elliott, for builds internally within android.
Another option is to make -i behave conventionally and -I do the unwrapped version, but I dunno what (if any) android build scripts would need to be adjusted for this? Hmmm...
system/bt/gd/dumpsys/internal/test_data/mkfiles:xxd -i string.bfbs > string_bfbs.h system/bt/gd/dumpsys/internal/test_data/mkfiles:xxd -i integer.bfbs > integer_bfbs.h system/bt/gd/dumpsys/internal/test_data/mkfiles:xxd -i float.bfbs > float_bfbs.h system/bt/gd/dumpsys/internal/test_data/mkfiles:xxd -i struct.bfbs > struct_bfbs.h
yeah, but if you look at those files, it's one of the examples i mentioned where they've run this once manually and checked in the results (and were clearly using the vim xxd). so this particular example would be "fixed" by making toybox xxd -i output the boilerplate.
device/generic/vulkan-cereal/third-party/angle/src/libANGLE/renderer/metal/shaders/gen_mtl_internal_shaders.py: os.system('xxd -i compiled/default.ios_sim.metallib >> compiled/mtl_default_shaders.inc')
likewise. the constexpr
on the line above is to mark the array const (but
there's nothing they could [easily] do about the size). but, again, they
just ran this once and checked in the output.
looks like grep -r is finding quite a few. (Denys Vlasenko's recent admonition comes to mind...)
yeah. i think i made this mistake in xxd before i started switching the Android build over. didn't really plan ahead there.
but like i said (a) i made this mess so it's only fair if i have to clean it up [though obviously i can't clean up non Android stuff, but presumably there isn't much of that?] and (b) afaict there are only actually 3-4 places where the toybox xxd is actually in "live" use anyway, if you ignore places like those above where someone ran a script manually and checked in the (sometimes hand-edited!) result.
Rob
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/landley/toybox/issues/286#issuecomment-867800663, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMVLEWE5UTLHWX74QKHFMT3TUNPSLANCNFSM47FVJRLA .
cc-ing the toybox mailing list on this reply. And I also pinged http://lists.busybox.net/pipermail/busybox/2021-June/088931.html because it's just too on the nose not too. (When you're right, you're right...)
On 6/24/21 11:21 AM, enh-google wrote:
annoyingly, the default vim xxd output isn't ideal for C either --- the use of int rather than size_t causes warnings, and there's no easy way to mark both variables static or const or visibility hidden or...
The existence of xxd -C and -u to micromanage the -i output seems kind of awkward as well. (And xxd -E is just inexcusable.)
What you've implemented is compatible with python and java and so on. Heck, probably fortran. CSV is a lingua franca. Heck, bash can grok it:
$ toybox xxd -i README | while read -d, i; do printf \${i:1}; done
It's definitely the way they SHOULD have done it in the first place, but unfortunately they didn't. The problem is compatibility with existing tools that do extra stupid stuff by default...
looking at the callers in Android that seem to rely on the current behavior, it doesn't look like many (4?), and since it's clearly my fault i'm happy to own cleaning up the mess.
Yay cleaning.
I lean towards adding a capital version of the existing option (-I) that skips the header/trailer stuff, and we can even poke the vim guys about it to make Denys happy. :)
The "compatible" alternative is to add --gratuitous-longopt-with-no-short which is less of a contested namespace. Of course it still breaks EXACTLY the same way if you run AOSP with the vim tool instead of the toybox one, and by that logic nobody would ever add another short option again.
how does making
xxd -i
behave like vim xxd, but adding a flag something like --bare to not include any boilerplate? (since although we could add flags for golang or whatever alongside one for C,
Oh please no. It's trivial to wrap the xxd output with header/footer lines, and that's the "unix" way of doing things. Simple tools that connect together with pipelines. As much of a fan of lua as I am, I'm not adding explicit support for it to xxd.
The c output is grandfathered in, hysterical raisins etc. We can poke vim (and again, busybox) to add the new "bare" output type.
like i said: there's no One True Answer for C anyway,
It wasn't xxd's job to get C output syntax right in the first place, trying to get it right for MORE languages is not going to decrease the entropic attack surface.
so i'd assume other languages have equivalent problems too. and even if they don't have those problems, anecdotally i'd say most callers don't like the way vim xxd chooses a name for the variables, and sed their own in after the fact anyway![1])
Indeed. I vote for "xxd -I" doing the unwrapped version but also recuse myself from actual decision making here because I'm not a regular user of this tool and Elliott is (and he has an existing userbase of scripts to migrate, the only regression tests I have are artificial ones in tests/xxd.test which prove nothing in this sort of circumstance).
if we do go this way, i'll probably try to get in a no-op --bare before i change -i, because that'll make the transition easier for me.
How about we add -I as a synonym for -i and then switch toybox over to have -i produce the conventional C output in a couple weeks?
I expect Denys is going to make us contact vim before busybox about the new option this time, but I'm personally ok with that because vim isn't a gnu project nor is it GPLv3. (Heck, its developer says on his web page he can be outright bought: https://www.vim.org/sponsor/index.php .)
Although I expect "sending him a patch" is likely to work. :)
only two of them seem to be used as part of the build anyway --- people seem to be writing scripts but checking in the [hand-modified] output :-(
Do I have a https://landley.net/toybox/cleanup.html#advice link about not checking in generated files? No? I'm sure I've ranted about it before. Possibly in my blog.
I was in favor of it as the lesser evil when the linux kernel used it to avoid having kconfig depend on lex, and toybox has kconfig/*_shipped files inherited from that. But that's ALSO another reason to reimplement the kconfig directory from scratch.
Rob
On 6/24/21 12:50 PM, Rob Landley wrote:
I lean towards adding a capital version of the existing option (-I) that skips the header/trailer stuff, and we can even poke the vim guys about it to make Denys happy. :)
boggle
I started to patch the vim one and... it already supports this? It won't produce header/footer output for stdin:
$ echo "look ma no header/footer" | xxd -i 0x6c, 0x6f, 0x6f, 0x6b, 0x20, 0x6d, 0x61, 0x20, 0x6e, 0x6f, 0x20, 0x68, 0x65, 0x61, 0x64, 0x65, 0x72, 0x2f, 0x66, 0x6f, 0x6f, 0x74, 0x65, 0x72, 0x0a
And both the busybox and toybox ones already get this right?
We can still add -I if you like, but... seems less vital now? More like a documentation update.
Rob
As far as I can tell the upstream for the default xxd is vim, with an example of the expected
xxd -i
behavior from there ( https://vim.fandom.com/wiki/Hex_dump#Generating_C_source_for_a_binary_file )xxd -i
in toybox just gives the bytes converted to hex with,
in between without the required c. In the example provided toyboxxxd -i
would giveand not the expected: