Open lewing opened 1 month ago
It seems all reports point to the linux-x64 dev-innerloop leg from this definition: https://dev.azure.com/dnceng-public/public/_build?definitionId=133. GitHub doesn't sync the status and keeps showing the job as running for days. Opened dotnet/runtime#108581 to disable the leg.
Just before the timeout we see low memory warnings like these:
then it hangs for ~20 minutes or so before giving up. The build command has -allConfigurations,
so it builds all product and test assemblies for all platforms ({linux,win,osx,freebsd,illumos}-{x86,x64,arm,arm64,riscv64.. etc.}) in a single invocation of build (which isn't exactly efficient; we should probably group them). As it stands, this means the leg needs a decent amount of RAM.
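To make the size of that matrix concrete, here is a rough Python sketch that expands the brace pattern from the comment above and groups the targets by OS, which is one possible way to batch them. The OS and architecture lists are copied from the comment, which ends in "etc.", so they are incomplete, and not every combination is necessarily a real target. The helper names (`expand_matrix`, `group_by_os`) are hypothetical and are not part of the runtime build scripts.

```python
from itertools import product

# Copied from the comment above; the original list ends with "etc.",
# so these are illustrative and incomplete.
OSES = ["linux", "win", "osx", "freebsd", "illumos"]
ARCHES = ["x86", "x64", "arm", "arm64", "riscv64"]


def expand_matrix(oses, arches):
    """Return every os-arch pair, like shell brace expansion would."""
    return [f"{os_name}-{arch}" for os_name, arch in product(oses, arches)]


def group_by_os(targets):
    """Group targets by OS prefix (hypothetical batching strategy)."""
    groups = {}
    for target in targets:
        os_name, _, _ = target.partition("-")
        groups.setdefault(os_name, []).append(target)
    return groups


all_targets = expand_matrix(OSES, ARCHES)
batches = group_by_os(all_targets)

print(len(all_targets), "targets in a single invocation")
for os_name, targets in batches.items():
    print(os_name, len(targets), targets)
```

Grouping by OS is only one option; the point is that each batch would cover a fraction of the targets that a single -allConfigurations invocation handles at once.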
@ilyas1974 @markwilkie I don't think our RAM consumption has increased so much that this configuration can no longer be handled. Thoughts?
I took a look at a few passing builds; many have the same logs, where they approach 95% memory usage but eventually succeed.
Here are some samples:
https://dev.azure.com/dnceng-public/public/_build/results?buildId=836601&view=logs&j=e80acbf0-bc87-577c-4c46-0016b0794913&t=f0fa9d72-e49a-5249-4d28-1199014b9857
https://dev.azure.com/dnceng-public/public/_build/results?buildId=838110&view=logs&j=e80acbf0-bc87-577c-4c46-0016b0794913&t=f0fa9d72-e49a-5249-4d28-1199014b9857&l=3923
https://dev.azure.com/dnceng-public/public/_build/results?buildId=837745&view=logs&j=e80acbf0-bc87-577c-4c46-0016b0794913&t=f0fa9d72-e49a-5249-4d28-1199014b9857&l=4228
Near this point I see logs like this:
initializing ChangeMakerService with capabilities: Baseline, AddMethodToExistingType, AddStaticFieldToExistingType, AddInstanceFieldToExistingType, NewTypeDefinition, ChangeCustomAttributes, UpdateParameters, GenericAddMethodToExistingType, GenericUpdateMethod, GenericAddFieldToExistingType
baseline ready
got a change
parsing patch #1 from /__w/1/s/src/libraries/System.Runtime.Loader/tests/ApplyUpdate/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField/GenericAddInstanceField_v1.cs and creating delta
Found changes in GenericAddInstanceField.cs
change service made fa564b82-cf1c-4fb0-9d1a-f5ca4c71ff03
wrote /__w/1/s/artifacts/bin/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField/Debug/net10.0/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField.dll.1.dmeta
got a change
parsing patch #2 from /__w/1/s/src/libraries/System.Runtime.Loader/tests/ApplyUpdate/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField/GenericAddInstanceField_v2.cs and creating delta
Found changes in GenericAddInstanceField.cs
change service made fa564b82-cf1c-4fb0-9d1a-f5ca4c71ff03
wrote /__w/1/s/artifacts/bin/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField/Debug/net10.0/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField.dll.2.dmeta
done
It looks to me like this is coming from https://github.com/dotnet/hotreload-utils/blob/254ec75de6127c368827d15c3af2477095b8b1b4/src/Microsoft.DotNet.HotReload.Utils.Generator/EnC/ChangeMakerService.cs#L28
Does anyone have an idea why hotreload would be running during a build? I could imagine that if some hot reload service was running during a build, or if tests were running while the product was building, that could explain the high memory usage.
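For anyone comparing these excerpts across builds, here is a small Python sketch that pulls the patch numbers, source files, and written delta files out of log text shaped like the excerpt above. The regular expressions are inferred only from the lines quoted in this comment, so they are an assumption about the format rather than a documented contract of hotreload-utils, and `summarize` is a hypothetical helper name.

```python
import re

# Patterns inferred from the log lines quoted above (an assumption,
# not a documented format).
PATCH_RE = re.compile(r"^parsing patch #(\d+) from (\S+) and creating delta$")
WROTE_RE = re.compile(r"^wrote (\S+)$")


def summarize(log_text):
    """Return (patches, outputs) found in the log text.

    patches is a list of (patch_number, source_path) tuples;
    outputs is a list of paths that the log reports as written.
    """
    patches, outputs = [], []
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = PATCH_RE.match(line)
        if match:
            patches.append((int(match.group(1)), match.group(2)))
            continue
        match = WROTE_RE.match(line)
        if match:
            outputs.append(match.group(1))
    return patches, outputs


# Lines taken from the excerpt quoted in this comment.
SAMPLE = """\
baseline ready
got a change
parsing patch #1 from /__w/1/s/src/libraries/System.Runtime.Loader/tests/ApplyUpdate/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField/GenericAddInstanceField_v1.cs and creating delta
Found changes in GenericAddInstanceField.cs
wrote /__w/1/s/artifacts/bin/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField/Debug/net10.0/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField.dll.1.dmeta
got a change
parsing patch #2 from /__w/1/s/src/libraries/System.Runtime.Loader/tests/ApplyUpdate/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField/GenericAddInstanceField_v2.cs and creating delta
Found changes in GenericAddInstanceField.cs
wrote /__w/1/s/artifacts/bin/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField/Debug/net10.0/System.Reflection.Metadata.ApplyUpdate.Test.GenericAddInstanceField.dll.2.dmeta
done
"""

patches, outputs = summarize(SAMPLE)
for number, source in patches:
    print("patch", number, "from", source.rsplit("/", 1)[-1])
for path in outputs:
    print("wrote", path.rsplit("/", 1)[-1])
```

Running this over a full build log would show how many test projects emit these lines near the point where memory usage peaks, which might help narrow down whether they are related to the hang.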
Build
https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=784852
Build leg reported
Build / linux-x64 debug Libraries_AllConfigurations
Pull Request
https://github.com/dotnet/runtime/pull/106599
Known issue core information
@dotnet/dnceng
Release Note Category
Release Note Description
Additional information about the issue reported
No response
Known issue validation
Build: :mag_right: https://dev.azure.com/dnceng-public/public/_build/results?buildId=784852
Error message validated: [restarted. Azure DevOps can't recover from restarts.]
Result validation: :white_check_mark: Known issue matched with the provided build.
Validation performed at: 8/26/2024 7:12:18 PM UTC
Report
Displaying 100 of 715 results
Summary