Closed. Hind-M closed this 2 weeks ago.
Used micromamba, version: 2.0.0rc0
I cannot reproduce the errors you report using conda-forge/label/micromamba_dev/linux-64/micromamba-2.0.0{rc0-1,rc1-2}. On my machine, installing those packages takes around 1.5 GiB of disk space in the $CONDA_PREFIX, while using less than 1 GiB of RAM.
@ndevenish: Could you compare your instances' resource usage when using micromamba<2.0.0rc0 and micromamba>=2.0.0rc0?
When this ticket was opened, it had been a while since I had seen this happen to anyone. Now that 2.0.0 is out, I am seeing it happen on CI.
This is exactly 700 GB, by the way.
RHEL8, 16 GB memory machine:
curl -JLO https://raw.githubusercontent.com/dials/dials/refs/heads/main/.conda-envs/linux.txt
curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
psrecord --plot out.png "bin/micromamba create -yp ENV/ -c conda-forge --file linux.txt"
% bin/micromamba --version
2.0.0
Possibly because it seems to be in a package-cache-fetching loop? https://github.com/user-attachments/assets/cf71deec-db90-4735-93b1-b8e6365f2fe7
The repodata.json is reparsed for each package (since conda-forge:: is specified for every one of them), causing major resource usage. This is a regression in micromamba 2.0.0.
From bisecting, e874e7ea71ceefa1f52bdfd8deb6bf5bb3129316 from https://github.com/mamba-org/mamba/pull/2986 is the culprit.
Ah, excellent detective work. Removing the conda-forge:: prefix sounds like it should give us a way to work around the problem until a more widespread fix lands. From recollection, we started adding it to prevent pulling packages in from other channels, but the only way we generate installations now avoids that entirely, so it shouldn't be needed any more.
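The workaround of dropping the per-spec channel prefix can be sketched as a small transformation over the spec list, relying on `-c conda-forge` to pin the channel once. The spec names here are made up for illustration.

```python
def strip_channel_prefix(spec: str, channel: str = "conda-forge") -> str:
    """Drop an explicit 'channel::' prefix from a single spec, if present."""
    prefix = channel + "::"
    return spec[len(prefix):] if spec.startswith(prefix) else spec


# Hypothetical environment-file contents; real files list one spec per line.
specs = ["conda-forge::gnuplot", "conda-forge::numpy", "python"]
stripped = [strip_channel_prefix(s) for s in specs]
```

Running the result through `micromamba create -c conda-forge --file …` should then hit the repodata parse only once per subdir, per the analysis above.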
Yes, we must only parse each subdirectory once. jjerphan:mamba:fix/parsing-subdir is a WIP branch to resolve this issue; it is currently blocked by https://github.com/jbeder/yaml-cpp/issues/1322.
Actually, the channel duplication is not the only cause: after correcting it, most of the remaining runtime is due to a costly quicksort execution in libsolv's solver_unifyrules.
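Why duplication makes that sort so costly can be illustrated with plain Python lists (this is a sketch only, not libsolv's actual data structures): if each of N specs re-introduces the same channel, the solver can accumulate N copies of identical rules, and the sort in solver_unifyrules then runs over a list N times longer than necessary. Deduplicating restores the original size.

```python
# Base rule set: (package, dependency) pairs, purely synthetic numbers.
base_rules = [(pkg, dep) for pkg in range(1000) for dep in range(5)]

# 50 specs, each re-adding the same channel -> 50 identical copies of
# every rule, so the sort input is 50x larger than it needs to be.
duplicated = base_rules * 50

# Deduplicating before sorting brings the workload back to the base size.
deduped = sorted(set(duplicated))
```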
Using samply:
samply record $HOME/dev/mamba/build/micromamba/micromamba create -yp /tmp/5ENV/ -c conda-forge --file /tmp/linux.txt
With the conda-forge:: prefix:
Without the conda-forge:: prefix:
I guess this might be due to the comparison function for package solvables used when the resolution is run.
Bisecting indicates that the regression was first introduced by e874e7ea71ceefa1f52bdfd8deb6bf5bb3129316, the merge commit of #2986.
From @ndevenish in the QS lobby on gitter: "Are there any known issues with the current micromamba about resource usage, possibly related to CentOS/RHEL? I've had two separate people come to me this week with issues: a) using micromamba in a container build dying because it filled their entire temp disk (when installing very few packages); b) being what looked like OOM-killed after taking >60% of their memory. Both tasks have worked before. The out-of-disk-space instance was running:
micromamba create -y -c conda-forge gnuplot python numpy pymca workflows>=1.7 xraylib zocalo
and it took at least 4 GB of scratch disk space (the smallest of the possible locations that podman was using for its container work on their system). The other instance didn't get past resolving (an admittedly rather large requirement) but was using >9 GB of RAM on a 16 GB machine the last time I checked before it died."