conda / conda-lock

Lightweight lockfile for conda environments
https://conda.github.io/conda-lock/
Other
467 stars 102 forks source link

Simplify reconstruction of fetch actions #376

Closed maresb closed 1 year ago

maresb commented 1 year ago

I'm very interested to see if this has any affect on #338.

netlify[bot] commented 1 year ago

Deploy Preview for conda-lock ready!

Name Link
Latest commit b39bea4b554b318f777dcb93a6ec9ff7e7646ca5
Latest deploy log https://app.netlify.com/sites/conda-lock/deploys/6400b76083979a0008d8c0fa
Deploy Preview https://deploy-preview-376--conda-lock.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

maresb commented 1 year ago

pre-commit.ci autofix

mariusvniekerk commented 1 year ago

Seems like this does improve the md5 test issue?

maresb commented 1 year ago

:point_up: squash

@mariusvniekerk, yes, this resolves it. In the process I noticed a whole bunch of illogical stuff regarding the formation of LockedDependencies which I wanted to clean up, but it goes deeper than I originally thought. I think it makes sense to merge this so that CI works, and follow up with the remaining stuff in future PRs.

:point_down: rebase

maresb commented 1 year ago

As a brief explanation, my prevailing theory (based on substantial evidence) is that when Micromamba computes the dry-run LINK info, it tries to grab the hashes from the repodata_record.json with an internal implementation similar to #373, but without the retry. Thus in the rare case when repodata_record.json has not been extracted completely, Micromamba leaves the hashes blank.

This PR abandons the approach of filling in missing Micromamba FETCH actions with their corresponding LINK actions, instead using the more robust repodata_record.json reader.

maresb commented 1 year ago

@mariusvniekerk, all tests are green, so if you approve, then we can merge this, fixing most of the CI issues.

As a follow-up to this, we should make sure that the approaches in update_specs_for_arch and solve_specs_for_arch are as similar as possible, and eliminate duplicated code. Moreover, since the data structures for LinkAction are different between Micromamba and Conda/Mamba, we should redefine LinkAction to Union[MicromambaLinkAction, CondaLinkAction] with the respective models. This lights up the type checker, revealing some substantial issues, which is why I think it deserves its own PR.

mariusvniekerk commented 1 year ago

cc @wolfv there is probably some stuff in this discussion that you should be aware of

wolfv commented 1 year ago

@maresb can you share a bit more of the symptoms? What I understand is that the repodata_record.json contains the correct complete values for MD5 and SHA256, but micromamba is somehow not reading them?

maresb commented 1 year ago

Ya, I will try to open a proper issue soon in mamba-org/mamba with more details. It relates to some apparent race condition that occurs on a regular basis on our test suite but is not reliably reproducible.

We have multiple parallelized tests spawning Conda/mamba/micromamba, but I am not sure whether or not this is a multiprocess problem.

The problem seems to occur when we do a dry-run solve to get LINK and FETCH action JSON data, and the md5 and sha256 fields of a link-only package are missing from this output. When this triggers, I immediately look (programmatically) at the installation location of the package in question, and it looks like extraction is incomplete. It completes after sleeping for 100 ms, and if I then repeat the dry run it is successful.