Open pconrad-insitro opened 2 years ago
Same for me. Here is a short, simple YAML example that fails:
name: ocr_ws_prod
channels:
  - defaults
dependencies:
  - python>=3.7,<3.8
  - pip:
    - Django==1.11.29
Error:
(base) root@container:~/repo# conda-lock -f env_conda_prod.yml -p linux-64
Locking dependencies for ['linux-64']...
INFO:conda_lock.conda_solver:linux-64 using specs ['python >=3.7,<3.8', 'pip *']
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/src_parser/__init__.py", line 282, in seperator_munge_get
    return d[key]
KeyError: 'Django'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/src_parser/__init__.py", line 285, in seperator_munge_get
    return d[key.replace("-", "_")]
KeyError: 'Django'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/bin/conda-lock", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/conda_lock.py", line 1167, in lock
    filename_template=filename_template, check_input_hash=check_input_hash
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/conda_lock.py", line 949, in run_lock
    filter_categories=filter_categories,
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/conda_lock.py", line 393, in make_lock_files
    update_spec=update_spec,
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/conda_lock.py", line 727, in create_lockfile_from_spec
    update_spec=update_spec,
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/conda_lock.py", line 696, in _solve_for_arch
    platform=platform,
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/pypi_solver.py", line 300, in solve_pypi
    src_parser._apply_categories(requested=pip_specs, planned=planned)
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/src_parser/__init__.py", line 296, in _apply_categories
    for dep in seperator_munge_get(planned, item).dependencies
  File "/opt/conda/lib/python3.7/site-packages/conda_lock/src_parser/__init__.py", line 287, in seperator_munge_get
    return d[key.replace("_", "-")]
KeyError: 'Django'
conda version: 4.12.0, conda-lock version: 1.0.4, run from inside an Ubuntu-based Docker container.
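For readers following the traceback: the three `return` lines it shows (lines 282, 285 and 287 of `src_parser/__init__.py`) suggest a lookup that tries the key as-is, then with `-` replaced by `_`, then with `_` replaced by `-`. The sketch below is reconstructed from those three lines only, not copied from conda-lock, and `munge_get` is a hypothetical name. It illustrates that separator munging cannot rescue a key that differs in some other way. That the planned keys are lower-cased (`django`) is an assumption made here purely for illustration, and the version strings are illustrative values.

```python
def munge_get(d, key):
    """Look up key, retrying with '-' and '_' swapped (hypothetical reconstruction)."""
    for candidate in (key, key.replace("-", "_"), key.replace("_", "-")):
        if candidate in d:
            return d[candidate]
    raise KeyError(key)


# Assumption: the solver's result is keyed by a normalized, lower-case name.
planned = {"django": "1.11.29", "python-dateutil": "2.8.2"}

# A separator difference is handled by the retries.
print(munge_get(planned, "python_dateutil"))

# A case difference is not handled, so the lookup still fails.
try:
    munge_get(planned, "Django")
except KeyError as exc:
    print(f"KeyError: {exc}")
```

If the real cause is something other than case, the same reasoning applies: any mismatch that is not a `-`/`_` swap falls through all three attempts.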
@pconrad-insitro & @wosiu I found that using conda-forge exclusively helps me avoid these issues. I spent a good while figuring it out and wrote up this walkthrough: https://github.com/jesshart/code-tutorials/blob/main/python/dependency-management/README.md
I actually stopped using pip and found that conda-lock does what I need. I hope this is helpful.
Note: This is not possible for all projects.
@jesshart - unfortunately, I can't switch to a pure conda-forge solution. However, I now realize I didn't really explain that.
We mostly use conda dependencies, but a few are not available (for various reasons), so we fall back to pip for those. I was excited at the potential to jointly solve all the dependencies, made possible by the new features. Sadly, I hit the bug I showed.
The reproduction example I gave is contrived, since seaborn is indeed available on conda-forge. Our actual problem is very similar, though. It is likewise caused by a pip dependency with a transitive dependency on matplotlib.
@wosiu - thanks for the second example, that is even simpler!
I can't go with conda-forge only, as suggested by @jesshart, because some package versions are not available on conda-forge, whereas they are available via pip.
@pconrad-insitro @wosiu while you wait on pip support in conda-lock, I found this might be useful (while definitely extra work):
Since you know what your direct dependencies are for pip and conda, you can make use of conda-lock for your conda packages and pip-compile for your pip packages.
Remove django from your environment.yml (and the whole pip section) and put your direct pip dependencies in a requirements.in file like so:
# requirements.in
django
Run conda-lock as you like.
Run pip-compile --generate-hashes requirements.in > requirements.txt
Now you can create your conda environment from the conda-lock file and then run pip install -r requirements.txt to get the environment you desire with all packages and pinned dependencies.
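The split described in this workaround can also be scripted. The following is a minimal sketch, assuming the environment.yml has already been parsed into a plain dict (for example with a YAML library); `split_environment` is a hypothetical helper, not part of conda-lock or pip-tools. It returns a conda-only environment to hand to conda-lock and the lines to write into requirements.in.

```python
import copy


def split_environment(env):
    """Return (conda_only_env, pip_requirements) from a parsed environment.yml dict."""
    conda_env = copy.deepcopy(env)  # leave the caller's dict untouched
    pip_requirements = []
    conda_deps = []
    for dep in conda_env.get("dependencies", []):
        if isinstance(dep, dict) and "pip" in dep:
            # The nested "pip:" section moves to requirements.in.
            pip_requirements.extend(dep["pip"])
        else:
            conda_deps.append(dep)
    conda_env["dependencies"] = conda_deps
    return conda_env, pip_requirements


# Parsed form of the short failing example from earlier in the thread.
env = {
    "name": "ocr_ws_prod",
    "channels": ["defaults"],
    "dependencies": ["python>=3.7,<3.8", {"pip": ["Django==1.11.29"]}],
}
conda_env, requirements = split_environment(env)
print(conda_env["dependencies"])  # what conda-lock would see
print("\n".join(requirements))    # contents for requirements.in
```

Writing the two results back to disk is left out on purpose, since the file names and YAML library are project-specific.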
I am seeing weird behaviour on my machine with an environment very similar to the one in the original post: sometimes it works, sometimes I get the same KeyError: 'matplotlib-base' error. I tried both conda-lock 1.0.5 and main, and they show the same behaviour.
I know there was some work in https://github.com/conda-incubator/conda-lock/pull/157 to better support conda/pip interplay. There are work-arounds for this kind of issue in general, but I would be interested in @mariusvniekerk's insights on whether there is a chance that tricky issues at the conda/pip boundary may be handled one day.
I dug further into the issue; more details are below in case they help.
To reproduce (very similar environment to the original post, with matplotlib installed through conda and seaborn through pip):
cat << EOF > /tmp/test-environment.yml
channels:
  - conda-forge
dependencies:
  - matplotlib
  - pip:
    - seaborn # Depends on matplotlib
EOF
conda-lock lock -p linux-64 -f /tmp/test-environment.yml --lockfile /tmp/conda-lock.yml
A standalone Python script (mostly taken from conda_lock.conda_lock._solve_for_arch) that reproduces the issue:
from pathlib import Path
from conda_lock.conda_solver import solve_conda
from conda_lock.pypi_solver import solve_pypi
from conda_lock.src_parser import (
    VersionedDependency,
    Selectors,
)
from conda_lock.models.channel import Channel
# You probably need to adapt the `conda` path and `platform`
conda = Path("~/miniconda3/condabin/mamba").expanduser()
platform = 'linux-64'
# there is an additional channel
# Channel(url='file:///tmp/tmpbst6eigh'), hopefully this does not change the
# logic too much
channels = [Channel.from_string('conda-forge')]
requested_deps_by_name = {
    "conda": {
        "matplotlib": VersionedDependency(
            name="matplotlib",
            manager="conda",
            optional=False,
            category="main",
            extras=[],
            selectors=Selectors(platform=None),
            version="",
            build=None,
        ),
        "pip": VersionedDependency(
            name="pip",
            manager="conda",
            optional=False,
            category="main",
            extras=[],
            selectors=Selectors(platform=None),
            version="*",
            build=None,
        ),
    },
    "pip": {
        "seaborn": VersionedDependency(
            name="seaborn",
            manager="pip",
            optional=False,
            category="main",
            extras=[],
            selectors=Selectors(platform=None),
            version="*",
            build=None,
        )
    },
}
locked_deps_by_name = {'conda': {}, 'pip': {}}
conda_deps = solve_conda(
    conda,
    specs=requested_deps_by_name["conda"],
    locked=locked_deps_by_name["conda"],
    update=[],
    platform=platform,
    channels=channels,
)
# The relative order of these two keys decides whether solve_pypi fails below.
conda_deps_keys = list(conda_deps.keys())
matplotlib_base_index = conda_deps_keys.index('matplotlib-base')
matplotlib_index = conda_deps_keys.index('matplotlib')
print(f"{matplotlib_base_index=}")
print(f"{matplotlib_index=}")
if matplotlib_base_index < matplotlib_index:
    print('Oh oh matplotlib-base first problems ahead')
else:
    print('matplotlib-base last lucky you')
pip_deps = solve_pypi(
    requested_deps_by_name["pip"],
    use_latest=[],
    pip_locked={},
    conda_locked={dep.name: dep for dep in conda_deps.values()},
    python_version=conda_deps["python"].version,
    platform=platform,
)
The conda_deps dict order (from solve_conda) is not consistent between calls.
matplotlib and matplotlib-base map to the same PyPI name (matplotlib), so whichever comes last wins.
If matplotlib is last, then planned['matplotlib'] will look like this:
LockedDependency(name='matplotlib', version='3.5.2', dependencies={'matplotlib-base': '>=3.5.2,<3.5.3.0a0', ...}, ...)
and it will fail: in this line https://github.com/conda-incubator/conda-lock/blob/dbce9be9d9b5114ffceec16e7527bd9360dce187/conda_lock/src_parser/__init__.py#L296 the dependencies are walked over, and at one point item='matplotlib-base', but matplotlib-base is not in the planned keys.
If matplotlib-base is last, then planned['matplotlib'] will look like this:
LockedDependency(name='matplotlib-base', version='3.5.2', dependencies={'certifi': '>=2020.06.20', 'cycler': '>=0.10', 'fonttools': '>=4.22.0', 'freetype': '>=2.10.4,<3.0a0', 'kiwisolver': '>=1.0.1', 'libgcc-ng': '>=10.3.0', 'libstdcxx-ng': '>=10.3.0', 'numpy': '>=1.21.6,<2.0a0', 'packaging': '>=20.0', 'pillow': '>=6.2.0', 'pyparsing': '>=2.2.1', 'python': '>=3.10,<3.11.0a0', 'python-dateutil': '>=2.7', 'python_abi': '3.10.* *_cp310', 'tk': '>=8.6.12,<8.7.0a0'}, ...)
and matplotlib-base does not appear among its dependencies, so everything works fine.
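The order dependence described in this comment can be reproduced without conda-lock. The sketch below is a simplified model, not the library's code: `CONDA_TO_PYPI`, `build_planned` and `walk_dependencies` are hypothetical names, and the dependency lists are trimmed to the entries that matter. It keys conda packages by their PyPI name, so whichever of matplotlib and matplotlib-base comes last overwrites the other, and then walks the dependencies of the surviving entry.

```python
# Hypothetical conda-name -> PyPI-name mapping; both conda packages share one PyPI name.
CONDA_TO_PYPI = {"matplotlib": "matplotlib", "matplotlib-base": "matplotlib"}

# Trimmed stand-in for a conda solve result: package name -> dependency names.
CONDA_DEPS = {
    "matplotlib": ["matplotlib-base"],
    "matplotlib-base": ["numpy"],
    "numpy": [],
}


def build_planned(order):
    """Key packages by PyPI name in the given order; later entries overwrite earlier ones."""
    planned = {}
    for conda_name in order:
        pypi_name = CONDA_TO_PYPI.get(conda_name, conda_name)
        planned[pypi_name] = {"name": conda_name, "dependencies": CONDA_DEPS[conda_name]}
    return planned


def walk_dependencies(planned, root):
    """Visit root and everything reachable from it; raises KeyError on an unknown name."""
    seen, todo = set(), [root]
    while todo:
        item = todo.pop()
        if item in seen:
            continue
        seen.add(item)
        todo.extend(planned[item]["dependencies"])
    return seen


orders = (
    ["numpy", "matplotlib", "matplotlib-base"],  # matplotlib-base last: works
    ["numpy", "matplotlib-base", "matplotlib"],  # matplotlib last: fails
)
for order in orders:
    planned = build_planned(order)
    try:
        print(order[-1], "last ->", sorted(walk_dependencies(planned, "matplotlib")))
    except KeyError as exc:
        print(order[-1], "last -> KeyError:", exc)
```

In this model the failure disappears if the walk resolves each dependency through the same name mapping before the lookup, which is one way to read the "forward/reverse naming mapping" idea from the original post; whether the actual fix works that way is not something this sketch establishes.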
Thanks for the writeup @lesteve. I ran into this package, and this issue, today (during my yearly attempt to find a way to reliably maintain large mixed conda/pip environments :smiling_face_with_tear:). Your explanation saved me some time!
I've got a few ideas for how to handle this and have some (semi-)working prototypes. @mariusvniekerk, do you know if anybody's working on this already, or could I take it up?
Is there any progress with this problem?
Also ran into this today. It would be great if this were fixed. Happy to help out as well :-)
Also got the same issue here with a single pip dependency (both python and pandas depend on tzdata).
name: test
channels:
  - conda-forge
dependencies:
  - python=3.9
  - pip:
    - pandas
This may also be closed by #290, thanks for the pointer @lesteve!
I believe it has indeed been fixed by #290 and this issue can be closed. I tried conda-lock from main with the environments mentioned in the comments and they all work fine.
Thank you for releasing v1.0, and congratulations on the progress!
I believe I have found a corner case in the joint conda/poetry solver, having to do with package renaming. This is a very useful capability, and I'm not surprised it is subtle.
Consider this example yaml:
The apparent problem is that conda knows about both matplotlib and matplotlib-base, but pip only knows about matplotlib. Somewhere in the conversions between the two systems, it's getting confused and checking for the conda name in the pip list.
Comment out the matplotlib line in the spec above and it works, as the solution will be entirely pip. As is, it will fail on v1.0.3 (on an Intel Mac):
I lightly redacted the paths, but the stack trace is hopefully clear.
Any thoughts? Let me know if I can assist in debugging. I traced the code for a while. I suspect the solution is another careful application of the forward/reverse naming mapping, but I am not sure what change is best.