JettJones opened 7 months ago
Thanks for the repro! That will be very helpful in profiling this.
I wonder if a lack of structure sharing in the Go build graph (`BuildGoPackageRequest`) contributes to the issue. If the build graph for the standard library is not being reused, that could account for much of the memory growth.
That seems fairly likely. Tangentially similar: https://github.com/pantsbuild/pants/pull/20030/
Hmm, so looking at this, I don't actually see redundancy. Those 6470 `AnalyzeThirdPartyPackageRequest` objects correspond to 6470 distinct Go packages. The trouble may be that this many pending Process invocations (with somewhat large inputs) is more than the engine can handle with reasonable memory consumption.
Since individual packages within a third-party module can't change given a module version, maybe we need to switch to one-process-per-module instead of one-process-per-package? Those 6470 packages belong to just 385 modules.
@tdyas thoughts?
Note that I incidentally noticed and fixed https://github.com/pantsbuild/pants/pull/20332 while looking into this, but it is not the cause of this issue.
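To make the batching idea concrete, here is a minimal sketch of grouping per-package requests by module; the attribute names here are hypothetical stand-ins, not the actual Pants request types:

```python
from collections import defaultdict

def batch_by_module(package_requests):
    """Group per-package analysis requests by their owning module.

    `package_requests` is assumed to be an iterable of objects with a
    `module_path` attribute (a hypothetical stand-in for the real
    AnalyzeThirdPartyPackageRequest fields).
    """
    by_module = defaultdict(list)
    for req in package_requests:
        by_module[req.module_path].append(req)
    # Spawn one analysis Process per module (~385 here) rather than one
    # per package (~6470); packages within a pinned module version
    # cannot change independently, so the coarser key is still sound.
    return by_module
```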
> Since individual packages within a third-party module can't change given a module version, maybe we need to switch to one-process-per-module instead of one-process-per-package? Those 6470 packages belong to just 385 modules.
That would be a reasonable way to batch the work.
Any update on this? We started trialing Pants as the build tool for our project last week. Generally it fits quite well, with support for Python, Go, Docker and Helm matching our pipeline.
However, high memory usage and constant re-downloading of Go modules hurts the user experience.
I believe @JettJones is working on this?
I tried a version that downloads Go modules into a named_cache. While it solved the memory problems, that structure broke assumptions around cache lifetime and had no path forward for remote execution. So it's largely back to the drawing board, and I don't have an active plan at the moment.
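For context, a rough sketch of what that named-cache approach might look like. Pants' `Process` does support `append_only_caches`, but the cache name and paths below are illustrative, not the actual branch:

```python
from pants.engine.process import Process

# Illustrative only: mount a persistent append-only cache into the
# sandbox and point Go's module cache at it, so downloads survive
# across runs. A real rule would need an absolute GOMODCACHE path
# inside the sandbox, and would have to reconcile this cache's
# lifetime with remote execution, which is where this approach
# broke down.
download = Process(
    argv=("go", "mod", "download"),
    description="Download Go modules into a shared named cache",
    append_only_caches={"go_mod_cache": "gomodcache"},
    env={"GOMODCACHE": "gomodcache"},
)
```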
@tdyas sounds like we really need to solve this. Any ideas?
Fwiw, as far as I can tell almost no work is actually happening when rerunning on the above repo; 99% of what's happening is just pulling from cache. So what we're really seeing is that if you blast several thousand requests at lmdb, it falls over?
Edit: Or, after sleeping on it: our process scheduler is inefficient when no processes actually have to be scheduled. That seems like a more likely explanation in retrospect. Maybe we could query the local cache before process scheduling instead.
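As a sketch of that last idea, with entirely hypothetical names for the engine internals (the real engine is Rust and structured differently):

```python
async def run_or_reuse(process, local_cache, scheduler):
    """Consult the local (lmdb-backed) cache before handing a process
    to the scheduler, so fully cached re-runs never compete for
    execution slots. Every name here is hypothetical."""
    key = process.cache_key()
    cached = await local_cache.lookup(key)
    if cached is not None:
        return cached  # cache hit: no scheduling work at all
    result = await scheduler.execute(process)
    await local_cache.store(key, result)
    return result
```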
Running into this too. For context, I'm using Pants for Go project introspection: determining which dependent files have changed. Ideally (while this bug exists) I'd like to skip Go module downloads entirely and just say "everything has changed" when go.mod/go.sum changes, since downloading Go modules with Pants takes a long time and causes OOM issues.
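For reference, that introspection workflow is typically driven with Pants' change-detection options, e.g. `pants --changed-since=origin/main --changed-dependents=transitive list`; the pain point is that Go dependency inference (and with it the third-party module analysis) still runs underneath those queries.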
**Describe the bug**
Building our Go projects uses a lot of memory, leading to pantsd exits / OOM kills. Because of this, we get slow builds and rebuilds from the loss of caching, or other crashes from OOM.

**Pants version**
2.18.1

**OS**
Linux (Ubuntu 22.04)

**Additional info**
Likely similar to #19053 from earlier Pants releases.
I uploaded a simplified reproduction here: https://github.com/JettJones/go-include
Here is the memory summary of `pants --stats-memory-summary check ::` and a gist of the debug logs.