conda-incubator / conda-store

Data science environments, for collaboration. ✨
https://conda.store
BSD 3-Clause "New" or "Revised" License

Okay to use `mamba clean --all` on conda-store pod? #434

Closed rsignell-usgs closed 1 year ago

rsignell-usgs commented 1 year ago

Is it okay to shell into the conda-store pod and run `mamba clean --all --yes`?

When I try to make modifications to my conda-store environments I'm getting:

conda-forge/linux-64                                        Using cache
conda-forge/noarch                                          Using cache
python: /home/conda/feedstock_root/build_artifacts/libsolv_1649959712616/work/src/rules.c:261: solver_addrule: Assertion `!p2 && d > 0' failed.

and I'm thinking that blowing away the cache might fix things?

viniciusdc commented 1 year ago

Hi @rsignell-usgs, thanks for bringing this up. @costrouc, I don't think this would harm the conda-store service, but could you let us know whether there is another way around this besides cleaning the cache, or whether the outlined approach is enough?

costrouc commented 1 year ago

@rsignell-usgs is this related to https://github.com/mamba-org/mamba/issues/1382? Sounds like they are saying this is related to mixing defaults and conda-forge? I don't think that mamba clean ... would help

viniciusdc commented 1 year ago

@rsignell-usgs could you share the environment dependencies that you attempted to build?

rsignell-usgs commented 1 year ago

Yep! Here's the env: https://gist.github.com/rsignell-usgs/0b76caba57ed5c43783cd1cb15c808d2

rsignell-usgs commented 1 year ago

As you can see, the env doesn't list the defaults channel. But perhaps the .condarc used by conda-store doesn't have channel_priority: strict? Could that be it?
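
One way to check this from inside the pod is to ask conda for its effective settings instead of reading the .condarc by hand. The following is a minimal sketch, not how conda-store itself checks its configuration. The helper names `load_conda_config` and `channel_problems` are hypothetical. It assumes `conda` is on the PATH in the pod, and that `conda config --show channels channel_priority --json` reports `channels` as a list and `channel_priority` as a string such as `strict`.

```python
import json
import subprocess


def load_conda_config():
    """Return conda's effective channel settings as a dict (needs conda on PATH)."""
    result = subprocess.run(
        ["conda", "config", "--show", "channels", "channel_priority", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def channel_problems(config):
    """List reasons the solver might still consider the defaults channel.

    `config` is assumed to look like:
    {"channels": ["conda-forge"], "channel_priority": "strict"}
    """
    problems = []
    if "defaults" in config.get("channels", []):
        problems.append("defaults is in the effective channel list")
    if config.get("channel_priority") != "strict":
        problems.append("channel_priority is not strict")
    return problems
```

Running `channel_problems(load_conda_config())` in the pod would return an empty list if neither condition applies, which would point away from the .condarc as the cause.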

rsignell-usgs commented 1 year ago

We are having this issue again when building conda-store environments. An environment that normally builds successfully in 12 minutes failed to build after 45 minutes, and the only line in the "full logs" is:

python: /home/conda/feedstock_root/build_artifacts/libsolv_1649959712616/work/src/rules.c:261: solver_addrule: Assertion `!p2 && d > 0' failed.

Note there is no use of defaults channel in the environment.yml file.

@iameskild, @viniciusdc or @costrouc : I think one of you figured out what to do last time, but we apparently didn't record it here, and I can't remember. Did we do mamba clean --all?

Hopefully we can document the solution here so we or others will be able to fix going forward! 👼

dharhas commented 1 year ago

@costrouc

costrouc commented 1 year ago

@rsignell-usgs this makes me think I need to add logs to the build that include the conda info. I am pretty sure we are building with strict channel priority, so I'm surprised that defaults is being used when it is not specified.

dharhas commented 1 year ago

@costrouc @iameskild since this is something that was working and now isn't, wouldn't it mean that either:

@rsignell-usgs can you

  1. make any completely new environment?
  2. make a new environment with these packages?

rsignell-usgs commented 1 year ago

I cleared the cache with mamba clean --all. The conda-store storage is not full. I tried making a completely new environment, same issue.
I'm going to try building locally.

dharhas commented 1 year ago

FYI, that particular environment (from the gist) is still stuck on "building" 24 hours later when I try it on demo.nebari.dev as well.

dharhas commented 1 year ago

@costrouc is there a way to see logs for environments as they are actively building?

rsignell-usgs commented 1 year ago

Okay, I tried the environment locally and there were conflicts that were preventing mamba from building it. I removed the troublesome packages and now it builds (not sure why we didn't hit these conflicts before). So:

On my local computer, I have:

 $ mamba --version
mamba 1.3.1
conda 22.11.1

Conda-store pod (Nebari) has:

mamba 0.25.0
conda 4.14.0
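
To make the gap between the two listings explicit, the version output can be parsed and compared. This is only an illustrative sketch: `parse_versions` and `outdated_tools` are hypothetical helpers, the parser assumes purely numeric dotted versions as in the output above, and an older version in the pod is not by itself proof that it causes the failed builds.

```python
def parse_versions(text):
    """Parse `mamba --version` output into {tool: (major, minor, patch)}."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip shell prompts, blank lines, and anything not shaped like "<tool> <version>".
        if len(parts) != 2 or not parts[1][0].isdigit():
            continue
        tool, version = parts
        versions[tool] = tuple(int(piece) for piece in version.split("."))
    return versions


def outdated_tools(pod, local):
    """Return the tools whose version in the pod is older than the local one."""
    return sorted(tool for tool in pod if tool in local and pod[tool] < local[tool])


# Version strings copied from the two listings above.
local = parse_versions("mamba 1.3.1\nconda 22.11.1")
pod = parse_versions("mamba 0.25.0\nconda 4.14.0")
print(outdated_tools(pod, local))  # ['conda', 'mamba']
```

Comparing tuples of integers avoids the string-comparison trap where "4.14.0" would sort after "22.11.1".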

rsignell-usgs commented 1 year ago

We continue to have issues with our ESIP Nebari conda-store.
We had a successful build of an environment this morning, and I thought things were working again (although the build took 34 minutes -- I've never seen an environment take that long to build).

We then made a small change to the same environment (pinned a version of a package), and after 70 minutes it has not yet completed (or failed) -- just says "building".

I logged into the conda-store pod, and top shows that mamba is no longer running, but several conda-worker processes are "sleeping":

[screenshot of top output: 2023-03-21_11-28-05]

What should I do?
Wait?
Kill the worker processes?

rsignell-usgs commented 1 year ago

Update: still says "building", now two hours later. Not sure what to do. Would killing one of the conda-store pods be useful?

rsignell-usgs commented 1 year ago

Okay, turns out this is a GUI/sync issue -- the environment actually completed. When I tried building another environment (151), suddenly 150 said completed.

rsignell-usgs commented 1 year ago

Also, it was recommended to delete old environments using the GUI. While this appears to work on the GUI side (we get the greyed-out env), I still see the environments when I log into the conda-store pod.

So should we keep using the "shell into the pod using k9s and delete old environments with rm -rf" method?

Or is this just reflecting that we have a mismatch between the GUI and the pod?

costrouc commented 1 year ago

Closing this since it is related to conda-store managing storage space. There is now a way for conda-store to protect against running out of space.