skeeto / w64devkit

Portable C and C++ Development Kit for x64 (and x86) Windows
The Unlicense
3.1k stars 217 forks source link

sh is borked on 2.0.0 #162

Open clin1234 opened 2 months ago

clin1234 commented 2 months ago

Upstream: https://github.com/rmyorston/busybox-w32/issues/447

~/scoop/apps/w64devkit/current $ which ls
sh: unable to spawn shell
~/scoop/apps/w64devkit/current $ bash
sh: unable to spawn shell
~/scoop/apps/w64devkit/current $ cd ..\..
sh: cd: can't cd to ....: Invalid argument
~/scoop/apps/w64devkit/current $ cd ..\
> /
sh: cd: can't cd to ../: Invalid argument
~/scoop/apps/w64devkit/current $ cd ..
sh: cd: can't cd to ..: Invalid argument
~/scoop/apps/w64devkit/current $ cd ../..
sh: cd: can't cd to ../..: Invalid argument
~/scoop/apps/w64devkit/current $ env
sh: unable to spawn shell
~/scoop/apps/w64devkit/current $  cd
sh: cd: can't cd to C:/Users/????: Invalid argument
~/scoop/apps/w64devkit/current $
~/scoop/apps/w64devkit/current $ cd
sh: cd: can't cd to C:/Users/????: Invalid argument

If it helps, I'm running Windows 11 21H2.

avih commented 2 months ago

You probably need the unicode variant of busybox-w32 for that, but w64dk distributes a non-unicode build, because the unicode build doesn't run on windows XP.

You probably can't use the unicode variant (from https://frippery.org/busybox/) as a drop-in replacement for your existing busybox.exe, because w64dk uses a custom build which disables some utilities in busybox (like make).

But with some effort, you can replace busybox.exe with the unicode busybox variant, and exclude those utilities using the env var:

BB_OVERRIDE_APPLETS=";ar make strings xxd"

I'm not entirely sure which applets w64dk disables in busybox, but whatever they are, you should be able to add them to BB_OVERRIDE_APPLETS and then the unicode busybox.exe should work in w64dk.

ale5000-git commented 2 months ago

@avih Has anyone experimented with an external manifest for Win XP compatibility?

skeeto commented 2 months ago

I believe @avih is correct. Alternatively extract w64devkit to an ASCII-only real path. Even with Unicode-aware busybox-w32, other tools in the kit (e.g. Binutils) aren't wide-path capable and still may not work correctly on non-ASCII paths.

skeeto commented 2 months ago

@ale5000-git, see the discussion on 4f201b21f7d66eb328f027cd104e6a05fcf35f6f. An external manifest works (with the usual SxS caching caveats), as does injecting it into the binary later, but I currently don't enable Unicode editing in the build, so you'll still have trouble editing non-ASCII commands. My comment on the MR that "interactive UTF-8 editing is still too broken" was incorrect, and addressed in the discussion. I just needed to enable an additional option to get the correct behavior.

avih commented 2 months ago

@skeeto maybe you should build a binary which is closer to upstream, at least as far as disabling applets go, and then exclude them using env vars as suggested above, so that it's easier for user to use different or newer busybox.exe binary?

For instance if some critical issue is detected and upstream releases a new binary which fixes it - users can't use it as drop-in replacement for their current busybox.exe in w64dk, but it would be nice if they could.

ale5000-git commented 2 months ago

Why not provide an external UTF-8 manifest for a separate manual download? Even if it doesn't cover all cases, only users that know will use it.

avih commented 2 months ago

Why not provide an external UTF-8 manifest for a separate manual download?

skeeto already replied to that, and additionally currently the busybox-w32 code needs to enable unicode at build time. I.e it doesn't support adding/removing the manifest with the same binary (though it should be possible, but currently the code doesn't do that).

However, it should still be possible to distribute an additional unicode build. True that it's not fun, but it would be the simplest. Or just ensure that upstream unicode busybox.exe binary can be used as a drop-in replacement, and instruct users to do that if they need.

ale5000-git commented 2 months ago

skeeto already replied to that, and additionally currently the busybox-w32 code needs to enable unicode at build time. I.e it doesn't support adding/removing the manifest with the same binary

I have only talked about an external manifest file, not to be merged but just placed near the exe. In most cases the user doesn't need to input UTF-8 in the interactive shell so it will fix most issues (together with chcp 65001).

rmyorston commented 2 months ago

@skeeto maybe you should build a binary which is closer to upstream, at least as far as disabling applets go, and then exclude them using env vars as suggested above, so that it's easier for user to use different or newer busybox.exe binary?

Having w64devkit define a suitable BB_OVERRIDE_APPLETS so an alternative binary could be a drop-in replacement would be neat, but it wouldn't require any change to the w64devkit build of busybox-w32.

avih commented 2 months ago

it wouldn't require any change to the w64devkit build of busybox-w32.

Sure, and I mentioned how to do that with existing setup. But it would need less patches in w64dk, which I consider to be better.

rmyorston commented 2 months ago

I'm not entirely sure which applets w64dk disables in busybox

Running something like this from an upstream busybox.exe will show the excluded applets:

comm -2 -3 <(busybox --list) <(w64devkit/bin/busybox --list)

Using that information we can make w64devkit/bin/etc/profile like so:

export BB_OVERRIDE_APPLETS='[[ ar ascii dpkg dpkg-deb ftpget ftpput link make ma
n pdpmake rpm rpm2cpio strings tsort unlink vi xxd'

. "$W64DEVKIT_HOME/src/profile"

Then replacing the w64devkit busybox.exe with an upstream one (PRE-5404 or above) should work. The new profile script will only be run by an upstream busybox.exe (hence the requirement for PRE-5404).

The overridden applets are unconditionally excluded to avoid unexpected surprises. They remain present, so if we want to use pdpmake it can be done either as busybox pdpmake or by copying the w64devkit BusyBox alias executable to pdpmake.exe.

If @skeeto could be prevailed upon to set BB_OVERRIDE_APPLETS in w64devkit/src/profile and I could be prevailed upon to support w64devkit/src/profile in the upstream shell, the additional profile script would be unnecessary and replacing busybox.exe would be about as seamless as possible.

I'd certainly find that useful for testing purposes.

avih commented 2 months ago

I could be prevailed upon to support w64devkit/src/profile in the upstream shell

What upstream support would be needed? to support a build-time embedded profile file?

If yes, how would such upstream support help to use an upstream binary as a drop-in replacement?

If no, do you mean w64dk-specific things in upstream busybox.exe? FWIW, I don't think that's appropriate.

But mainly, w64dk already sets some env vars without any help of busybox-w32 that I know of (via the launcher), which I think would also work with upstream binary, so I don't think I'm seeing the need for any upstream suppport to add one more env var.

rmyorston commented 2 months ago

What upstream support would be needed?

To run w64devkit/src/profile.

If yes, how would such upstream support help to use an upstream binary as a drop-in replacement?

It wouldn't be necessary to provide a w64devkit/bin/etc/profile.

But mainly, w64dk already sets some env vars without any help of busybox-w32 that I know of (via the launcher)

Yes, and it sets some more via w64devkit/src/profile. Having a profile script that's relative to the binary allows the user to apply customisations without having to use an absolute path, thus making the installation self-contained.

avih commented 2 months ago

What upstream support would be needed?

To run w64devkit/src/profile.

You mean to add an additional default relative path like this?

$(dirname `which busybox.exe`)/../src/profile

Having a profile script that's relative to the binary allows the user to apply customisations without having to use an absolute path, thus making the installation self-contained.

Yes, but this already works upstream, no?

If relative etc/profile already works in upstream, then why can't that be used instead? I.e. why would upstream need to add a w64dk-specific path rather than w64dk using a supported generic relative path?

I'd think both are equally simple to implement, but one of them keeps bb-w32 more generic than the other approach.

rmyorston commented 2 months ago

As things stand w64devkit and busybox-w32 use different paths for their binary-relative profile scripts: ../src/profile and ./etc/profile respectively.

I'd think both are equally simple to implement, but one of them keeps bb-w32 more generic than the other approach.

And the other requires w64devkit to put its script in a different place than it does now.

Since I've already volunteered Chris to add the BB_OVERRIDE_APPLETS environment variable, I'm volunteering to make the change to profile support in busybox-w32.

avih commented 2 months ago

And the other requires w64devkit to put its script in a different place than it does now.

I don't think ../src/ is a good generic place to put init scripts, and even more so to have upstream bb-w32 look there by default only because w64dk looks there.

To me it looks like w64devkit/src/ is abused to hold the runtime profile file only because this dir happens to exist during build time. All other files there are the source files of w64dk itself, and unrelated to runtime usage.

I'd think that If anything, you two should agree on a good generic relative path which w64dk can be comfortable to place init scripts, e.g. maybe ../etc/profile (which is more in-line with the unix convention that bin and etc are siblings) in addition to the currently-supported ./etc/profile, and then everyone can be happy.

skeeto commented 2 months ago

I would prefer to set the profile variables in w64devkit.exe, but the names are case sensitive (ac_executable_extensions, build_alias). The busybox-w32 shell capitalizes all incoming variable names on start so these particular variables cannot be set external to the shell. The new "system" profile is merely a workaround for this behavior, and I don't plan to use it generally to configure the shell. PATH_SEPARATOR could be set in w64devkit.exe since it's capitalized, but I kept it with the others for organizational purposes, and because it's useless by itself.

If I switch to BB_OVERRIDE_APPLETS I'd set it in w64devkit.exe, not in profile. I'm considering doing so, except that it wouldn't cover all the cases I care about. If someone uses the shell, e.g. to run a shell script, without passing through w64devkit.exe — or sh -l in the profile case — they won't get the variable, and the shell will override tools like ar and make. (It's acceptable that such shells cannot run Autoconf builds out of the box — a special feature of the w64devkit.exe entrypoint, not the shell itself.)

Regarding style and convention, I agree that ../etc/profile makes the most sense for a system-level profile, especially upstream, more so than etc/profile and ../src/profile. If busybox-w32 already did this when I wrote my patch, I probably would have just used it instead. However, the new w64dk profile is just a workaround for otherwise setting variables in src/w64devkit.c, and putting it in src/ saves a Docker layer (I only get 127 of them! And each additional layer slows the others.).

ale5000-git commented 2 months ago

I would prefer to set the profile variables in w64devkit.exe, but the names are case sensitive (ac_executable_extensions, build_alias). The busybox-w32 shell capitalizes all incoming variable names on start so these particular variables cannot be set external to the shell. The new "system" profile is merely a workaround for this behavior, and I don't plan to use it generally to configure the shell. PATH_SEPARATOR could be set in w64devkit.exe since it's capitalized, but I kept it with the others for organizational purposes, and because it's useless by itself.

Just an idea but it is also possible to create a shell file that set variables and load it from both w64devkit.exe and etc/profile, so it will work in every case.

clin1234 commented 2 months ago

Bit late of an update: the shell can run commands when this checkbox is ticked

image
ale5000-git commented 2 months ago

Bit late of an update: the shell can run commands when this checkbox is ticked

This will maybe fix one problem and generate 100 more, since it is global and the UTF-8 codepage doesn't work properly in many cases.

But you can get a similar thing locally in a shell by running chcp 65001.

skeeto commented 1 month ago

Today I realized a one-liner busybox-w32 patch would improve Unicode support without downsides: 3cacd962. On x64, GCC has had a UTF-8 manifest since w64dk 1.19 (first release with GCC 13), and this hasn't been a problem for Windows 7, the oldest target the x64 build supports. Unlike XP, Windows 7 (and earlier versions of Windows 10, and presumably Windows 8) loads the EXE but ignores the manifest. That's fine because ignoring the manifest means it gracefully degrades to the prior behavior (i.e. nothing lost). I can get the same from busybox-w32 by disabling the UTF-8 code page check.

(However, I must again emphasize that parts of w64dk still do not understand Unicode paths and arguments: Binutils, GDB, Make, and Ctags.)

avih commented 1 month ago

Today I realized a one-liner busybox-w32 patch would improve Unicode support without downsides

IIRC not quite.

While such mode is considered (utf8 manifest which falls back to ansi where the manifest is no-op, e.g. win7/8), the current code needs few changes before this can work, beyond the test to explicitly exit which you patched, because IIRC there's some code which hardcodes dependency on build time manifest. IIRC related to utf8 input/output (i.e. KB input, and unicode console output).

So I expect this would not behave entirely correctly in ANSI mode on win7/8, mainly related to KB input and console output, but without recalling the specifics currently.

To be more specific, I think pure ascii-7 input and output would work, but chars outside ascii-7 (in the OEM/ANSI codepage and console input/output codepage) which work-ish in non-unicode build might be broken in this auto-unicode build.

skeeto commented 1 month ago

Thanks for the heads up, @avih! I'll keep an eye out for issues. I didn't catch anything in the source, and the original commit (https://github.com/rmyorston/busybox-w32/commit/9e2c3594) indicates the check is to "avoid raising false hopes." For me personally, mostly working beats flat refusal, and quiet is better than nagging, e.g. a hypothetical "warning: UTF8 is disabled!" on every start. (I've already accumulated four anti-nagging patches, and Cppcheck used to have the fifth.)

avih commented 1 month ago

mostly working beats flat refusal, and quiet is better than nagging, e.g. a hypothetical "warning: UTF8 is disabled!"

I don't disagree, but as I mentioned, I think this will be broken for non-ascii chars I/O, which supposedly work correctly in the non-unicode build.

It should not be too hard to fix, and I think we should do that (upstream) to support auto-unicode mode, but for now it's not yet supported fully.

"supposedly" because IMO the non-unicode build is not perfect in this regard, but it still should be better than your patch.

Also, without looking closely at the patch context, I think the unicode support report at the output of busybox might be misleading in some cases.

avih commented 1 month ago

FYI current upstream discussion about "auto-unicode" starts here https://github.com/rmyorston/busybox-w32/issues/423#issuecomment-2342044684