conda / conda-lock

Lightweight lockfile for conda environments
https://conda.github.io/conda-lock/
Other
456 stars 101 forks source link

Synchronize pip indentation with conda env export #573

Closed jamesmyatt closed 6 months ago

jamesmyatt commented 6 months ago

This is a minor change but it synchronizes the indentation of the pip dependencies with the output of conda env export. It makes it easier to compare conda-lock env files with exports of real conda environments.

Indentation in conda env export is determined by the ruamel.yaml configuration in conda.common.serialize: e.g. https://github.com/conda/conda/blob/ff330a1807d34870e60a9b8d3540695587845296/conda/common/serialize.py#L20.

Commands

conda-lock -f environment.yml -p win-64
conda-lock render -k env -p win-64
conda-lock install -n lock-test
conda env export -n lock-test > env-export.yml

environment.yml

name: lock-test
channels:
  - conda-forge
  - nodefaults
dependencies:
  - python =3.12
  - pip:
      - click

conda-win-64.lock.yml

channels:
  - conda-forge
dependencies:
  - bzip2=1.0.8=hcfcfb64_5
  - ca-certificates=2023.11.17=h56e8100_0
  - libexpat=2.5.0=h63175ca_1
  - libffi=3.4.2=h8ffe710_5
  - libsqlite=3.44.2=hcfcfb64_0
  - libzlib=1.2.13=hcfcfb64_5
  - openssl=3.2.0=hcfcfb64_1
  - pip=23.3.2=pyhd8ed1ab_0
  - python=3.12.0=h2628c8c_0_cpython
  - setuptools=68.2.2=pyhd8ed1ab_0
  - tk=8.6.13=h5226925_1
  - tzdata=2023c=h71feb2d_0
  - ucrt=10.0.22621.0=h57928b3_0
  - vc=14.3=hcf57466_18
  - vc14_runtime=14.38.33130=h82b7239_18
  - vs2015_runtime=14.38.33130=hcb4865c_18
  - wheel=0.42.0=pyhd8ed1ab_0
  - xz=5.2.6=h8d14728_0
  - pip:
    - click === 8.1.7 --hash=sha256:ae74fb96c20a0277a1d615f1e4d73c8414f5a98db8b799a7931d1582f3390c28
    - colorama === 0.4.6 --hash=sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6

environment.yml

name: lock-test
channels:
  - conda-forge
dependencies:
  - bzip2=1.0.8=hcfcfb64_5
  - ca-certificates=2023.11.17=h56e8100_0
  - libexpat=2.5.0=h63175ca_1
  - libffi=3.4.2=h8ffe710_5
  - libsqlite=3.44.2=hcfcfb64_0
  - libzlib=1.2.13=hcfcfb64_5
  - openssl=3.2.0=hcfcfb64_1
  - pip=23.3.2=pyhd8ed1ab_0
  - python=3.12.0=h2628c8c_0_cpython
  - setuptools=68.2.2=pyhd8ed1ab_0
  - tk=8.6.13=h5226925_1
  - tzdata=2023c=h71feb2d_0
  - ucrt=10.0.22621.0=h57928b3_0
  - vc=14.3=hcf57466_18
  - vc14_runtime=14.38.33130=h82b7239_18
  - vs2015_runtime=14.38.33130=hcb4865c_18
  - wheel=0.42.0=pyhd8ed1ab_0
  - xz=5.2.6=h8d14728_0
  - pip:
      - click==8.1.7
      - colorama==0.4.6
prefix: C:\Users\jmyatt\.conda\envs\lock-test
netlify[bot] commented 6 months ago

Deploy Preview for conda-lock ready!

Name Link
Latest commit ee0b739557c66a84b4119506721cfaa9de6e1335
Latest deploy log https://app.netlify.com/sites/conda-lock/deploys/65809a19086f9700084f4237
Deploy Preview https://deploy-preview-573--conda-lock.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

maresb commented 6 months ago

In this case it seems to me like Conda is using inconsistent indentation, no? The indentation level under channels: and dependencies: is 2 spaces, but the indentation level under pip: is 4 spaces.

Wouldn't it make more sense to open the dual issue in Conda?

jamesmyatt commented 6 months ago

Thanks @maresb . Conda has used this formatting since always. I can't see them changing it, especially since it's just part of the ruamel.yaml configuration (parser.indent(mapping=2, offset=2, sequence=4)) and there's no custom code for it. So it's 100% consistent with those settings.

In contrast, conda-lock is much newer and uses a custom serializer for environment files, even if it's super simple.

But regardless, for me, one of the main use cases for the conda-lock render -k env option is to be able to compare it with the conda env export output. So I think it makes sense to match the formatting as closely as possible. But minor differences in whitespace are not a dealbreaker.

It's also a (minor) inconvenience if you want to write some code to modify an environment.yml file that you'll end up with whitespace differences between a conda-env-produced one and a conda-lock-produced one.

jamesmyatt commented 6 months ago

Actually, I think the issue is that the pip dependencies actually should be an level 3 indentation, since it's mapping-sequence-mapping-sequence e.g. data["dependencies"][-1]["pip"][n], not a level 2 one for data["dependencies"][-1][n]. I'll experiment with some ruamel.yaml settings to see what might reproduce the conda-lock style.

Update: This isn't quite right. See below.

jamesmyatt commented 6 months ago

The ruamel docs describes yaml.indent(mapping=2, sequence=4, offset=2) as "often seen" and says "It is best to always have sequence >= offset + 2". The "sequence" indentation is counted to the beginning of the sequence element and the "offset" is the indentation of the dash within the space defined by "sequence".

As far as I can tell ruamel.yaml calculates the indentation as follows ("mapping" doesn't seem to have an impact):

name: lock-test
channels:
  - conda-forge
  - nodefaults
dependencies:
  - python =3.12
  - pip:
      - click

-> offset
---> sequence
    -> offset
    ---> sequence

So when you're looking at the indentation of the sequences (via ruamel.yaml), you need to look at the position of the values rather than the -. This makes "conda-lock" style with the conda dependencies indented by 4 spaces and the pip dependencies indented by 2 more, which is inconsistent with how ruamel.yaml calculates indentations. Different parsers like PyYAML may calculate this differently.

You can get the same effect using yaml.indent(mapping=2, offset=2, sequence=2), but only because the mapping for the pip dependencies only has a single value.

maresb commented 6 months ago

Thanks @jamesmyatt for all the research into this issue. This still looks weird to me, but if it's correct in some sense, and if it's more consistent with Conda then let's do it.

Just to confirm before merging, is this PR still correct in the sense of your last comment?

jamesmyatt commented 6 months ago

Yes. I'm happy with this.

If it helps, I think you can to look at it this way:

dependencies:
  - python # <-- this dash is indented 2 spaces compared with the start of "dependencies:"
  - pip:
    - click # <-- this dash is indented 0 spaces compared with the start of "pip:"
maresb commented 6 months ago

Thanks for spelling it out @jamesmyatt, I get it now.