Open clin1234 opened 2 months ago
You probably need the unicode variant of busybox-w32 for that, but w64dk distributes a non-unicode build, because the unicode build doesn't run on windows XP.
You probably can't use the unicode variant (from https://frippery.org/busybox/) as a drop-in replacement for your existing busybox.exe, because w64dk uses a custom build which disables some utilities in busybox (like make
).
But with some effort, you can replace busybox.exe with the unicode busybox variant, and exclude those utilities using the env var:
BB_OVERRIDE_APPLETS=";ar make strings xxd"
I'm not entirely sure which applets w64dk disables in busybox, but whatever they are, you should be able to add them to BB_OVERRIDE_APPLETS and then the unicode busybox.exe should work in w64dk.
@avih Has anyone experimented with an external manifest for Win XP compatibility?
I believe @avih is correct. Alternatively extract w64devkit to an ASCII-only real path. Even with Unicode-aware busybox-w32, other tools in the kit (e.g. Binutils) aren't wide-path capable and still may not work correctly on non-ASCII paths.
@ale5000-git, see the discussion on 4f201b21f7d66eb328f027cd104e6a05fcf35f6f. An external manifest works (with the usual SxS caching caveats), as does injecting it into the binary later, but I currently don't enable Unicode editing in the build, so you'll still have trouble editing non-ASCII commands. My comment on the MR that "interactive UTF-8 editing is still too broken" was incorrect, and addressed in the discussion. I just needed to enable an additional option to get the correct behavior.
@skeeto maybe you should build a binary which is closer to upstream, at least as far as disabling applets go, and then exclude them using env vars as suggested above, so that it's easier for user to use different or newer busybox.exe binary?
For instance if some critical issue is detected and upstream releases a new binary which fixes it - users can't use it as drop-in replacement for their current busybox.exe in w64dk, but it would be nice if they could.
Why not provide an external UTF-8 manifest for a separate manual download? Even if it doesn't cover all cases, only users that know will use it.
Why not provide an external UTF-8 manifest for a separate manual download?
skeeto already replied to that, and additionally currently the busybox-w32 code needs to enable unicode at build time. I.e it doesn't support adding/removing the manifest with the same binary (though it should be possible, but currently the code doesn't do that).
However, it should still be possible to distribute an additional unicode build. True that it's not fun, but it would be the simplest. Or just ensure that upstream unicode busybox.exe binary can be used as a drop-in replacement, and instruct users to do that if they need.
skeeto already replied to that, and additionally currently the busybox-w32 code needs to enable unicode at build time. I.e it doesn't support adding/removing the manifest with the same binary
I have only talked about an external manifest file, not to be merged but just placed near the exe. In most cases the user doesn't need to input UTF-8 in the interactive shell so it will fix most issues (together with chcp 65001).
@skeeto maybe you should build a binary which is closer to upstream, at least as far as disabling applets go, and then exclude them using env vars as suggested above, so that it's easier for user to use different or newer busybox.exe binary?
Having w64devkit define a suitable BB_OVERRIDE_APPLETS
so an alternative binary could be a drop-in replacement would be neat, but it wouldn't require any change to the w64devkit build of busybox-w32.
it wouldn't require any change to the w64devkit build of busybox-w32.
Sure, and I mentioned how to do that with existing setup. But it would need less patches in w64dk, which I consider to be better.
I'm not entirely sure which applets w64dk disables in busybox
Running something like this from an upstream busybox.exe
will show the excluded applets:
comm -2 -3 <(busybox --list) <(w64devkit/bin/busybox --list)
Using that information we can make w64devkit/bin/etc/profile
like so:
export BB_OVERRIDE_APPLETS='[[ ar ascii dpkg dpkg-deb ftpget ftpput link make ma
n pdpmake rpm rpm2cpio strings tsort unlink vi xxd'
. "$W64DEVKIT_HOME/src/profile"
Then replacing the w64devkit busybox.exe
with an upstream one (PRE-5404 or above) should work. The new profile script will only be run by an upstream busybox.exe
(hence the requirement for PRE-5404).
The overridden applets are unconditionally excluded to avoid unexpected surprises. They remain present, so if we want to use pdpmake
it can be done either as busybox pdpmake
or by copying the w64devkit BusyBox alias executable to pdpmake.exe
.
If @skeeto could be prevailed upon to set BB_OVERRIDE_APPLETS
in w64devkit/src/profile
and I could be prevailed upon to support w64devkit/src/profile
in the upstream shell, the additional profile script would be unnecessary and replacing busybox.exe
would be about as seamless as possible.
I'd certainly find that useful for testing purposes.
I could be prevailed upon to support
w64devkit/src/profile
in the upstream shell
What upstream support would be needed? to support a build-time embedded profile file?
If yes, how would such upstream support help to use an upstream binary as a drop-in replacement?
If no, do you mean w64dk-specific things in upstream busybox.exe? FWIW, I don't think that's appropriate.
But mainly, w64dk already sets some env vars without any help of busybox-w32 that I know of (via the launcher), which I think would also work with upstream binary, so I don't think I'm seeing the need for any upstream suppport to add one more env var.
What upstream support would be needed?
To run w64devkit/src/profile
.
If yes, how would such upstream support help to use an upstream binary as a drop-in replacement?
It wouldn't be necessary to provide a w64devkit/bin/etc/profile
.
But mainly, w64dk already sets some env vars without any help of busybox-w32 that I know of (via the launcher)
Yes, and it sets some more via w64devkit/src/profile
. Having a profile script that's relative to the binary allows the user to apply customisations without having to use an absolute path, thus making the installation self-contained.
What upstream support would be needed?
To run
w64devkit/src/profile
.
You mean to add an additional default relative path like this?
$(dirname `which busybox.exe`)/../src/profile
Having a profile script that's relative to the binary allows the user to apply customisations without having to use an absolute path, thus making the installation self-contained.
Yes, but this already works upstream, no?
If relative etc/profile
already works in upstream, then why can't that be used instead? I.e. why would upstream need to add a w64dk-specific path rather than w64dk using a supported generic relative path?
I'd think both are equally simple to implement, but one of them keeps bb-w32 more generic than the other approach.
As things stand w64devkit and busybox-w32 use different paths for their binary-relative profile scripts: ../src/profile
and ./etc/profile
respectively.
I'd think both are equally simple to implement, but one of them keeps bb-w32 more generic than the other approach.
And the other requires w64devkit to put its script in a different place than it does now.
Since I've already volunteered Chris to add the BB_OVERRIDE_APPLETS
environment variable, I'm volunteering to make the change to profile support in busybox-w32.
And the other requires w64devkit to put its script in a different place than it does now.
I don't think ../src/
is a good generic place to put init scripts, and even more so to have upstream bb-w32 look there by default only because w64dk looks there.
To me it looks like w64devkit/src/
is abused to hold the runtime profile file only because this dir happens to exist during build time. All other files there are the source files of w64dk itself, and unrelated to runtime usage.
I'd think that If anything, you two should agree on a good generic relative path which w64dk can be comfortable to place init scripts, e.g. maybe ../etc/profile
(which is more in-line with the unix convention that bin
and etc
are siblings) in addition to the currently-supported ./etc/profile
, and then everyone can be happy.
I would prefer to set the profile
variables in w64devkit.exe
, but the names are case sensitive (ac_executable_extensions
, build_alias
). The busybox-w32 shell capitalizes all incoming variable names on start so these particular variables cannot be set external to the shell. The new "system" profile
is merely a workaround for this behavior, and I don't plan to use it generally to configure the shell. PATH_SEPARATOR
could be set in w64devkit.exe
since it's capitalized, but I kept it with the others for organizational purposes, and because it's useless by itself.
If I switch to BB_OVERRIDE_APPLETS
I'd set it in w64devkit.exe
, not in profile
. I'm considering doing so, except that it wouldn't cover all the cases I care about. If someone uses the shell, e.g. to run a shell script, without passing through w64devkit.exe
— or sh -l
in the profile
case — they won't get the variable, and the shell will override tools like ar
and make
. (It's acceptable that such shells cannot run Autoconf builds out of the box — a special feature of the w64devkit.exe
entrypoint, not the shell itself.)
Regarding style and convention, I agree that ../etc/profile
makes the most sense for a system-level profile, especially upstream, more so than etc/profile
and ../src/profile
. If busybox-w32 already did this when I wrote my patch, I probably would have just used it instead. However, the new w64dk profile
is just a workaround for otherwise setting variables in src/w64devkit.c
, and putting it in src/
saves a Docker layer (I only get 127 of them! And each additional layer slows the others.).
I would prefer to set the
profile
variables inw64devkit.exe
, but the names are case sensitive (ac_executable_extensions
,build_alias
). The busybox-w32 shell capitalizes all incoming variable names on start so these particular variables cannot be set external to the shell. The new "system"profile
is merely a workaround for this behavior, and I don't plan to use it generally to configure the shell.PATH_SEPARATOR
could be set inw64devkit.exe
since it's capitalized, but I kept it with the others for organizational purposes, and because it's useless by itself.
Just an idea but it is also possible to create a shell file that set variables and load it from both w64devkit.exe and etc/profile, so it will work in every case.
Bit late of an update: the shell can run commands when this checkbox is ticked
Bit late of an update: the shell can run commands when this checkbox is ticked
This will maybe fix one problem and generate 100 more, since it is global and the UTF-8 codepage doesn't work properly in many cases.
But you can get a similar thing locally in a shell by running chcp 65001
.
Today I realized a one-liner busybox-w32 patch would improve Unicode support without downsides: 3cacd962. On x64, GCC has had a UTF-8 manifest since w64dk 1.19 (first release with GCC 13), and this hasn't been a problem for Windows 7, the oldest target the x64 build supports. Unlike XP, Windows 7 (and earlier versions of Windows 10, and presumably Windows 8) loads the EXE but ignores the manifest. That's fine because ignoring the manifest means it gracefully degrades to the prior behavior (i.e. nothing lost). I can get the same from busybox-w32 by disabling the UTF-8 code page check.
(However, I must again emphasize that parts of w64dk still do not understand Unicode paths and arguments: Binutils, GDB, Make, and Ctags.)
Today I realized a one-liner busybox-w32 patch would improve Unicode support without downsides
IIRC not quite.
While such mode is considered (utf8 manifest which falls back to ansi where the manifest is no-op, e.g. win7/8), the current code needs few changes before this can work, beyond the test to explicitly exit which you patched, because IIRC there's some code which hardcodes dependency on build time manifest. IIRC related to utf8 input/output (i.e. KB input, and unicode console output).
So I expect this would not behave entirely correctly in ANSI mode on win7/8, mainly related to KB input and console output, but without recalling the specifics currently.
To be more specific, I think pure ascii-7 input and output would work, but chars outside ascii-7 (in the OEM/ANSI codepage and console input/output codepage) which work-ish in non-unicode build might be broken in this auto-unicode build.
Thanks for the heads up, @avih! I'll keep an eye out for issues. I didn't catch anything in the source, and the original commit (https://github.com/rmyorston/busybox-w32/commit/9e2c3594) indicates the check is to "avoid raising false hopes." For me personally, mostly working beats flat refusal, and quiet is better than nagging, e.g. a hypothetical "warning: UTF8 is disabled!" on every start. (I've already accumulated four anti-nagging patches, and Cppcheck used to have the fifth.)
mostly working beats flat refusal, and quiet is better than nagging, e.g. a hypothetical "warning: UTF8 is disabled!"
I don't disagree, but as I mentioned, I think this will be broken for non-ascii chars I/O, which supposedly work correctly in the non-unicode build.
It should not be too hard to fix, and I think we should do that (upstream) to support auto-unicode mode, but for now it's not yet supported fully.
"supposedly" because IMO the non-unicode build is not perfect in this regard, but it still should be better than your patch.
Also, without looking closely at the patch context, I think the unicode support report at the output of busybox
might be misleading in some cases.
FYI current upstream discussion about "auto-unicode" starts here https://github.com/rmyorston/busybox-w32/issues/423#issuecomment-2342044684
Upstream: https://github.com/rmyorston/busybox-w32/issues/447
If it helps, I'm running Windows 11 21H2.