getsentry / sentry-unreal

Unreal Engine
https://docs.sentry.io/platforms/unreal/
MIT License
82 stars 33 forks source link

Intermittent "os error 32" ("file being used by another process") #542

Open GardensSiobhan opened 4 months ago

GardensSiobhan commented 4 months ago

Environment

How do you use Sentry? SaaS

Which version of the SDK? 0.16.0

How did you install the package? https://github.com/getsentry/sentry-unreal/releases/download/0.16.0/sentry-unreal-0.16.0-engine5.3-github.zip

Which version of Unreal? 5.3.2

Is this happening in Unreal (editor) or on a player like Android, iOS, Windows? During automatic symbol upload as part of a UAT build.

Steps to Reproduce

  1. Sentry plugin settings in the project are configured to upload debug symbols automatically.
  2. A TeamCity build agent fetches latest and launches a build via RunUAT.bat.
  3. PostBuild-2.bat runs, invoking sentry-cli.

Expected Result

Symbols should be successfully uploaded with no errors.

Actual Result

About 50% to 75% of the time, symbols are uploaded without incident. However, on random builds, we get this:


INFO    2024-04-17 20:29:01.733753300 +00:00 Issuing a command for Organization: organization-slug Project: project-slug
error: The process cannot access the file because it is being used by another process. (os error 32)
Sentry: Upload finished```

### Any logs or screenshots
tustanivsky commented 4 months ago

@GardensSiobhan Assuming that the TeamCity build agent is a Windows machine please try changing sentry-cli logging level to debug in <SentryPluginRoot>/Scripts/upload-debug-symbols-win.bat line:102 to get some extra info that might help us in troubleshooting this issue.

bitsandfoxes commented 4 months ago

changing sentry-cli logging level to debug in /Scripts/upload-debug-symbols-win.bat line:102

Can we make that work so that the level is taken from the Sentry settings in the Editor?

tustanivsky commented 4 months ago

Can we make that work so that the level is taken from the Sentry settings in the Editor?

That's a good suggestion. I think adding a corresponding setting shouldn't be too hard.

mxbi commented 1 month ago

@GardensSiobhan did you ever solve this? Seeing exactly the same issue on our end. Although I've had it happen 3 times out of 3, v0.17.1 on UE 5.3.2.

Will try with v0.18 on monday and report back with more detailed logs.

tustanivsky commented 1 month ago

@mxbi In 0.18.0 we've added a new setting allowing you to specify log details level for symbol upload (ProjectSettings -> Sentry -> DebugSymbols -> DiagnosticLevel) so maybe changing it to Trace/Debug can help us to get some extra details.

mxbi commented 1 month ago

@tustanivsky Thanks, I enabled this, here is the relevant part of the log with potential PII stripped:

Sentry: Upload started using PropertiesFile 'C:\Users\Mikel\perforce\[...]\LyraStarterGame\sentry.properties'
C:\Users\Mikel\perforce\[...]\LyraStarterGame\Binaries\Win64
C:\Users\Mikel\perforce\[...]\LyraStarterGame\Plugins\Sentry\Source\ThirdParty\Win64
  INFO    2024-06-24 10:53:48.203159 +01:00 Loaded file referenced by SENTRY_PROPERTIES (C:\Users\Mikel\perforce\[...]\LyraStarterGame\sentry.properties)
  DEBUG   2024-06-24 10:53:48.203348900 +01:00 sentry-cli version: 2.32.1, platform: "windows", architecture: "x86_64"
  INFO    2024-06-24 10:53:48.203375400 +01:00 sentry-cli was invoked with the following command line: "C:\Users\Mikel\perforce\[...]\LyraStarterGame\Plugins\Sentry\Source\ThirdParty\CLI\sentry-cli-Windows-x86_64.exe" "upload-dif" "--include-sources" "--log-level" "Trace" "C:\Users\Mikel\perforce\[...]\LyraStarterGame\Binaries\Win64" "C:\Users\Mikel\perforce\[...]\LyraStarterGame\Plugins\Sentry\Source\ThirdParty\Win64"
  INFO    2024-06-24 10:53:48.203408200 +01:00 Issuing a command for Organization: [...]
  DEBUG   2024-06-24 10:53:48.203793800 +01:00 request GET https://sentry.io/api/0/organizations/[...]/chunk-upload/
  DEBUG   2024-06-24 10:53:48.203812900 +01:00 using token authentication
  DEBUG   2024-06-24 10:53:48.203822300 +01:00 retry number 0, max retries: 0
  DEBUG   2024-06-24 10:53:48.223754 +01:00 > GET /api/0/organizations/[...]/chunk-upload/ HTTP/1.1
[long http headers that look fine]
  DEBUG   2024-06-24 10:53:48.611434300 +01:00 < x-served-by: getsentry-web-rpc-production-57bd868c67-9qslb
  DEBUG   2024-06-24 10:53:48.611464300 +01:00 < x-served-by: frontend-default-65dd9bd965-x2hf5
  DEBUG   2024-06-24 10:53:48.611470100 +01:00 < strict-transport-security: max-age=31536000; includeSubDomains; preload
  DEBUG   2024-06-24 10:53:48.611477500 +01:00 < via: 1.1 google
  DEBUG   2024-06-24 10:53:48.611483 +01:00 < Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
  DEBUG   2024-06-24 10:53:48.611497200 +01:00 < Transfer-Encoding: chunked
  DEBUG   2024-06-24 10:53:48.612411300 +01:00 response status: 200
  DEBUG   2024-06-24 10:53:48.612427100 +01:00 body: {"url":"https://de.sentry.io/api/0/organizations/[...]/chunk-upload/","chunkSize":8388608,"chunksPerRequest":64,"maxFileSize":2147483648,"maxRequestSize":33554432,"concurrency":8,"hashAlgorithm":"sha1","compression":["gzip"],"accept":["debug_files","release_files","pdbs","sources","bcsymbolmaps","il2cpp","portablepdbs","artifact_bundles","artifact_bundles_v2"]}

error: The process cannot access the file because it is being used by another process. (os error 32)
  DEBUG   2024-06-24 10:53:48.828496100 +01:00 skipping update nagger because session is not attended

Sentry: Upload finished

It wasn't obvious to me what the error was from the logs, so I also ran procmon to trace the file accesses of the CLI while this was running. At 10:53:48.81, Sentry tried to open the LyraGame-Win64-Shipping.exe file and received a SHARING VIOLATION result: image

I'm not an expert but this is the only system call that received that result and I was able to reproduce this twice, so I suspect this is somehow the culprit. Looking at all the syscalls on this file:

image

Looks like link.exe still holds the file open for writing at the point when Sentry tries to access it. Not sure why this is the case but hopefully this is helpful!

mxbi commented 1 month ago

Actually, I think I know what it is:

Determining max actions to execute in parallel (16 physical cores, 32 logical cores)
  Executing up to 16 processes, one per physical core
------ Building 111 action(s) started ------

[...]

[105/111] Link [x64] LyraGame.exe
   Creating library C:\Users\Mikel\perforce\ProjectAnt\LyraStarterGame\Binaries\Win64\LyraGame.lib and object C:\Users\Mikel\perforce\ProjectAnt\LyraStarterGame\Binaries\Win64\LyraGame.exp
[106/111] WriteMetadata LyraGame.target
[107/111] cmd Executing post build script (PostBuild-1.bat)
Sentry: Start debug symbols upload

For some reason, Sentry is run in parallel with the actual build process as one of the actions. It should be run at the very end once compilation is complete.

tustanivsky commented 1 month ago

@mxbi Thank you for sharing these details! It's odd that the Sentry post-build script somehow starts symbol upload before the actual build is finished which I have never seen happen previously within our test environment.

Is there a chance that there are any other plugins included in your project that are up to some post-build processing?

mxbi commented 1 month ago

@tustanivsky There aren't any other such plugins, it looks like Sentry is the only post-build script that gets executed.

However, this issue doesn't occur when building in the editor, just when building in CI. We use a script based on https://github.com/botman99/ue4-unreal-automation-tool/blob/main/PackageGame.bat which uses the Unreal Automation Tool for builds - so this might be the place to start looking. Have you tried building your test environment this way?

tustanivsky commented 1 month ago

@tustanivsky There aren't any other such plugins, it looks like Sentry is the only post-build script that gets executed.

However, this issue doesn't occur when building in the editor, just when building in CI. We use a script based on https://github.com/botman99/ue4-unreal-automation-tool/blob/main/PackageGame.bat which uses the Unreal Automation Tool for builds - so this might be the place to start looking. Have you tried building your test environment this way?

I'll give it another round of testing with the script you've referenced, thank you.

mxbi commented 1 month ago

Hey @tustanivsky did you get anywhere with this? We've had to stop using Sentry for the last few weeks but would love to get it re-enabled!

tustanivsky commented 1 month ago

Hey @mxbi, sorry for the late update on this issue and thank you for your patience. I've managed to reproduce it on my side by using the script you mentioned earlier and confirmed that as long as UAT is packaging the game for multiple configurations (i.e. Development+Shipping+DebugGame) our Sentry CLI tool causes the above build errors.

The workaround which seemed to fix this behavior for me was to add --force-foreground parameter to symbol upload command during the plugin's post-build processing. Basically, it prevents symbol upload process from being detached and running in a background which allows us to make sure that the build for current configuration is actually finished before we move on to the next one.

However I've encountered with another problem after applying this change - for some reason source bundles stopped getting uploaded to Sentry due to the file could not be read as UTF-8 error. Currently, the only way to avoid it is to disable 'Upload sources' option in plugin settings but in this case you won't be able to view extra source code context directly in stack trace.

I'll continue looking into that and let you know if anything else comes up. Also, it would be great if you can try out the suggested fix and confirm it's a viable solution.

NoCodeBugsFree commented 1 month ago

for me this os error 32 happens each time TC agent do a clean build or build with source code updated when new binaries up. but it only happens only if I build multiple configurations once. e.g. development+shipping. If I do singe - 100percent build will success. so for me its only about building multiple configurations once

tustanivsky commented 1 month ago

The workaround which seemed to fix this behavior for me was to add --force-foreground parameter to symbol upload command during the plugin's post-build processing. Basically, it prevents symbol upload process from being detached and running in a background which allows us to make sure that the build for current configuration is actually finished before we move on to the next one.

@NoCodeBugsFree Have you tried to add --force-foreground parameter to Sentry CLI command invocation?

NoCodeBugsFree commented 4 weeks ago

@tustanivsky --force-foreground gives the same result 08:55:33  INFO 2024-07-20 08:55:33.076082900 +03:00 sentry-cli was invoked with the following command line: "...\Plugins\Sentry\Source\ThirdParty\CLI\sentry-cli-Windows-x86_64.exe" "upload-dif" "--log-level" "info" "...\Binaries\Win64" "...\Plugins\Sentry\Source\ThirdParty\Win64" "--force-foreground" 08:55:33  INFO 2024-07-20 08:55:33.076125100 +03:00 Issuing a command for Organization: ....... 08:55:33  error: The process cannot access the file because it is being used by another process. (os error 32)

p.s. but I am not on latest release now.

tustanivsky commented 3 weeks ago

@NoCodeBugsFree I'm unsure if this behavior has something to do with plugin version as there were no recent changes related to symbol upload. --force-foreground worked for me when packaging a game with the script mentioned above. Do you use the same one?