dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

When publishing a Blazor client application, the build intermittently fails during the EmccCompile task with cmd.exe returning an error code of 255. #94263

Open antmjones opened 10 months ago

antmjones commented 10 months ago

Is there an existing issue for this?

I have searched the existing issues

Describe the bug

When publishing a Blazor client application, the build intermittently fails during the EmccCompile task with cmd.exe returning an error code of 255.

This appears to be due to a bug in cmd.exe, triggered by redirection of stdin when launching Python using the ... < NUL syntax in emcc.bat.

Setting the EM_WORKAROUND_WIN7_BAD_ERRORLEVEL_BUG environment variable to 1 is a workaround for me (note that I am using Windows 10 not Windows 7).

Example output from msbuild during failure:

[pinvoke.c] Exit code: 255 (logged at debug level)

Followed by:

Failed to compile C:\Program Files\dotnet\packs\Microsoft.NETCore.App.Runtime.Mono.browser-wasm\7.0.12\runtimes\browser-wasm\native\src\pinvoke.c -> C:\[...]\obj\Release\net7.0\wasm\for-publish\pinvoke.o
...

Note that the emcc build itself has not failed (confirmed by adding a @echo %ERRORLEVEL line to emcc.bat); rather, cmd.exe is returning an incorrect exit code to msbuild. Using Sysinternals ProcMon, I have also confirmed that Python returns an exit code of 0 while cmd.exe returns 255.
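When triaging a build log for this problem, it can help to pull out every exit-code line and flag the non-zero ones. The following is a minimal sketch in Python, assuming the log lines follow the shape of the single sample shown above ("[file] Exit code: N"). The regular expression and the function name nonzero_exit_codes are illustrative and are not part of msbuild or WasmAppBuilder.

```python
import re

# Assumed line shape, taken from the one sample in this report:
#   [pinvoke.c] Exit code: 255
EXIT_LINE = re.compile(r"\[(?P<source>[^\]]+)\]\s+Exit code:\s+(?P<code>-?\d+)")


def nonzero_exit_codes(log_text):
    """Return (source, exit_code) pairs for every non-zero exit code line."""
    hits = []
    for line in log_text.splitlines():
        match = EXIT_LINE.search(line)
        if match is None:
            continue
        code = int(match.group("code"))
        if code != 0:
            hits.append((match.group("source"), code))
    return hits
```

Feeding it the text of a debug-level log would list each source file whose compile step reported a non-zero exit code. A result of 255 for a file that actually compiled would match the symptom described here.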

Expected Behavior

The build should not intermittently fail.

Steps To Reproduce

The following reproduces the cmd.exe issue for me from a .NET 4 Console application. Note that running the executable from Windows Terminal reproduces the bug much more consistently for me (vs. using "Command Prompt"):

https://github.com/antmjones/CmdExe255Test

The issue is very intermittent for me when running the publish from msbuild, but is nonetheless quite annoying when it occurs near the end of a publish that takes >10 minutes!

Updating the batch file to use exit 0 rather than exit /b 0 returns the correct error code to the calling process, confirming that setting EM_WORKAROUND_WIN7_BAD_ERRORLEVEL_BUG to 1 is a suitable workaround.
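The same kind of check can be sketched in Python: launch a command many times and tally the exit codes that come back. This is an illustrative sketch, not the linked reproduction. The function name tally_exit_codes is hypothetical, and whether launching cmd.exe from Python triggers the bug as readily as the .NET console application does is an assumption that has not been verified here.

```python
import subprocess
from collections import Counter


def tally_exit_codes(command, runs=100):
    """Run `command` `runs` times and count how often each exit code occurs."""
    codes = Counter()
    for _ in range(runs):
        # Standard streams are inherited from the parent process.
        result = subprocess.run(command)
        codes[result.returncode] += 1
    return codes
```

On Windows, a call such as tally_exit_codes(["cmd.exe", "/c", "Test.bat"]) would exercise a batch file that ends in exit /b 0, such as the Test.bat in the linked repository. Any count recorded under a key other than 0 would indicate the incorrect exit code described above.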

Exceptions (if any)

[pinvoke.c] Exit code: 255

The line above is the "smoking gun". It is logged by Utils.TryRunProcess() in C:\Program Files\dotnet\packs\Microsoft.NET.Runtime.WebAssembly.Sdk\7.0.12\tasks\net472\WasmAppBuilder.dll

.NET Version

7.0.403

Anything else?

I have mentioned the issue to the emscripten team here:

https://github.com/emscripten-core/emscripten/issues/20583

The effect of the EM_WORKAROUND_WIN7_BAD_ERRORLEVEL_BUG environment variable can be observed here:

https://github.com/emscripten-core/emscripten/blob/75b0a5f53726ea8671ef9f942de0ca1d1ba3cf0c/emcc.bat#L49

Note that it causes the batch file to be exited with @exit %ERRORLEVEL% rather than @exit /b %ERRORLEVEL%.

The PR that added this workaround to emcc.bat is https://github.com/emscripten-core/emscripten/pull/15146.

This environment variable could be added to EmscriptenEnvVars (used in WasmApp.Native.targets) for example.
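Until something like that is built in, the variable can also be set per invocation from a wrapper script, so that it only affects the publish and not the whole machine. The following is a minimal sketch in Python. The helper names env_with_workaround and run_with_workaround are hypothetical, and the sketch assumes the variable set on the parent process is inherited all the way down to emcc.bat.

```python
import os
import subprocess

WORKAROUND_VAR = "EM_WORKAROUND_WIN7_BAD_ERRORLEVEL_BUG"


def env_with_workaround(base_env=None):
    """Return a copy of the environment with the workaround variable set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env[WORKAROUND_VAR] = "1"
    return env


def run_with_workaround(command):
    """Run `command` with the workaround variable set; return its exit code."""
    return subprocess.run(command, env=env_with_workaround()).returncode
```

A call such as run_with_workaround(["dotnet", "publish", "-c", "Release"]) would then run the publish with the variable in place, leaving the caller's own environment unchanged.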

ghost commented 10 months ago

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details
Author: antmjones
Assignees: -
Labels: `arch-wasm`, `untriaged`, `area-Build-mono`, `needs-area-label`
Milestone: -

lewing commented 10 months ago

@antmjones thank you for the detailed report.

@radekdoulik please take a look.

ilonatommy commented 2 months ago

I have had no luck reproducing this. @antmjones does it happen in .NET 8 as well? Do you maybe have a binlog from the failed publish? (dotnet publish -bl)

antmjones commented 2 months ago

Hi, I can provide a recent .binlog (supplied by a colleague who encountered the same issue). I would note, however, that I studied the binlog very carefully at the time (trying to work out what was causing the issue), and all of the useful information is included above in the original bug report. The relevant output in MSBuild Structured Log Viewer looks like this; there is really nothing more of interest there:

[image: MSBuild Structured Log Viewer output]

Note that (as described in the original report) the real bug appears to be in cmd.exe, not in msbuild or the EmccCompile task, but there seems to be little hope of getting cmd.exe fixed. Given that a simple workaround already exists (the EM_WORKAROUND_WIN7_BAD_ERRORLEVEL_BUG environment variable), the sensible thing seems to me to be to set that environment variable by default when calling emcc.bat. It simply causes emcc.bat to use @exit %ERRORLEVEL% rather than @exit /b %ERRORLEVEL%, which seems like a low-risk change.

I can also confirm that the issue still occurs in .NET 8: my colleague was using .NET 8 with up-to-date VS and tooling (8.0.6 SDK). I have not encountered the issue myself since setting EM_WORKAROUND_WIN7_BAD_ERRORLEVEL_BUG after the original bug report.

Have you tried the reproduction I provided at https://github.com/antmjones/CmdExe255Test? This reproduces the error far more consistently for me than via msbuild. I also found that the bug occurred more frequently when running from Windows Terminal rather than the default "command prompt" terminal:

[image: console output of the sample application]

(as can be seen in the image, the sample console application is reporting that cmd.exe returned an exit code of 255, but the last line of the batch file called exit /b 0, so the exit code should have been zero)

If you do still want a copy of the .binlog file could you provide an email address or other way to share the .binlog file privately?

david-flett commented 1 month ago

As advised above, I had to add this workaround to my csproj file in order for it to compile reliably: <EmscriptenEnvVars Include="EM_WORKAROUND_WIN7_BAD_ERRORLEVEL_BUG=1" />