envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

Envoy Windows crashes on startup due to access violation (0xc0000005) #34545

Open dimo414 opened 2 months ago

dimo414 commented 2 months ago

Note: this issue was previously reported to envoy-security@googlegroups.com and approved for posting publicly

We run Envoy on Windows desktops, and a user is encountering unexpected crashes as envoy.exe starts up. It is difficult for us to reproduce locally, but it is easily repeatable on their machine(s).

We are currently on 1.27.0 and using the precompiled binary made available via the docker image, as recommended here. The issue appears to happen on 1.28 (the latest Windows release) as well.

What we have observed is that Envoy exits immediately with exit status 3221225477 (hex: 0xc0000005), an access violation, with no output to stdout or stderr.
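For anyone matching a similar exit status, here is a minimal sketch of how the decimal value maps to the NTSTATUS code. The `describe_exit_status` helper and the small lookup table are illustrative, not part of Envoy; the table is deliberately non-exhaustive. Some tools report the same status as a signed 32-bit integer, so the helper masks to 32 bits first.

```python
# Small, non-exhaustive table of well-known NTSTATUS values (illustrative).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exit_status(status: int) -> str:
    """Render a Windows process exit status as hex plus a known name, if any."""
    # Mask to 32 bits so signed values (e.g. -1073741819) map to the same code.
    unsigned = status & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return f"0x{unsigned:08X} ({name})"


print(describe_exit_status(3221225477))   # 0xC0000005 (STATUS_ACCESS_VIOLATION)
print(describe_exit_status(-1073741819))  # same status, reported as a signed int
```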

We next tried running the envoy binary by hand, and observed it crashes even with trivial arguments like --version or incorrect arguments that should have caused an argument parsing error. We found a .wer error log file, the salient details of which are:

Sig[0].Name=Application name
Sig[0].Value=envoy.exe
Sig[1].Name=Application version
Sig[1].Value=0.0.0.0
Sig[2].Name=Application time stamp
Sig[2].Value=64c15a94
Sig[3].Name=Failure Module Name
Sig[3].Value=ntdll.dll
Sig[4].Name=Failure Module Name version
Sig[4].Value=10.0.19041.3996
Sig[5].Name=Failure Module Name time stamp
Sig[5].Value=39215800
Sig[6].Name=Exception code
Sig[6].Value=c0000005
Sig[7].Name=Exception offset
Sig[7].Value=00000000000634f6
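To compare reports across machines, the `Sig[n].Name` / `Sig[n].Value` pairs above can be folded into a dictionary. This is a rough sketch that assumes the report text has already been read into a string (on disk, .wer files may need `encoding="utf-16"`); `parse_wer_signature` is a hypothetical helper name, not an existing tool.

```python
import re

# Matches lines such as "Sig[3].Name=Failure Module Name".
SIG_LINE = re.compile(r"^Sig\[(\d+)\]\.(Name|Value)=(.*)$")


def parse_wer_signature(text: str) -> dict:
    """Pair each Sig[n].Name with its Sig[n].Value from WER report text."""
    names, values = {}, {}
    for line in text.splitlines():
        match = SIG_LINE.match(line.strip())
        if not match:
            continue  # ignore anything that is not a signature line
        index, kind, payload = int(match.group(1)), match.group(2), match.group(3)
        (names if kind == "Name" else values)[index] = payload
    # Keep only indices that have both a name and a value.
    return {names[i]: values[i] for i in sorted(names) if i in values}


excerpt = """\
Sig[0].Name=Application name
Sig[0].Value=envoy.exe
Sig[3].Name=Failure Module Name
Sig[3].Value=ntdll.dll
Sig[6].Name=Exception code
Sig[6].Value=c0000005
"""

signature = parse_wer_signature(excerpt)
print(signature["Failure Module Name"], signature["Exception code"])
```

With the excerpt from this report, the faulting module is ntdll.dll rather than envoy.exe itself, which is consistent with something interfering at process startup.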

The user has CrowdStrike installed on their machine, and disabling it appears to allow Envoy to run as expected, so we think this is a compatibility issue with CrowdStrike. Some Googling found several relevant-seeming discussions about CrowdStrike causing unexpected crashes and incompatibilities.

I realize that Windows support is mothballed at the moment, so my primary motivation for filing this bug is to aid discovery should anyone else encounter a similar crash. Should Windows support be restored, it would be helpful if Envoy were able to detect this situation and fail more gracefully than the current access violation crash.
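Until then, a launcher that supervises envoy.exe could work around the silent failure from the outside. The sketch below is an assumption about how such a wrapper might look, not anything Envoy ships: it probes the binary with `--version` (which also crashes in this scenario) and turns a 0xc0000005 exit into a readable message. The `preflight` name and the injectable `run` parameter are hypothetical.

```python
import subprocess

STATUS_ACCESS_VIOLATION = 0xC0000005


def preflight(envoy_path, run=subprocess.run):
    """Run `<envoy_path> --version` and return (ok, message).

    `run` is injectable so the check can be exercised without a real binary.
    """
    result = run([envoy_path, "--version"], capture_output=True, text=True)
    # Mask to 32 bits in case the status is reported as a signed value.
    code = result.returncode & 0xFFFFFFFF
    if code == 0:
        return True, (result.stdout or "").strip()
    if code == STATUS_ACCESS_VIOLATION:
        return False, (
            "envoy exited with 0xC0000005 (access violation) before producing "
            "output; endpoint security software such as CrowdStrike may be "
            "interfering"
        )
    return False, f"envoy --version failed with exit status 0x{code:08X}"


# Simulate the crash described in this issue with a fake runner.
def fake_crash(cmd, **kwargs):
    return subprocess.CompletedProcess(cmd, 3221225477, "", "")


ok, message = preflight("envoy.exe", run=fake_crash)
print(ok, message)
```

A wrapper like this cannot fix the incompatibility; it only makes the failure visible to whoever is reading the launcher's logs.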

adisuissa commented 2 months ago

Thanks for providing this info! As you've noted, Windows is no longer officially supported. If anyone has access to a Windows environment with CrowdStrike and is willing to fix this, we can try to assist.

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

dimo414 commented 1 month ago

I do not think this should be closed as stale. Even if Windows is unsupported, the issue remains; closing it will be misleading to others encountering this issue.

github-actions[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

dimo414 commented 3 weeks ago

Arguing with stalebot is the clearest example of toil I can think of.