zuzukin / whl2conda

Generate conda package from pure python wheel
https://zuzukin.github.io/whl2conda/
Apache License 2.0

Support wheel .data directory #91

Closed analog-cbarber closed 10 months ago

analog-cbarber commented 1 year ago

See https://packaging.python.org/en/latest/specifications/binary-distribution-format/#the-data-directory

We do not do anything about this now.

Haven't seen this in the wild yet. Not sure what conda-build does.

analog-cbarber commented 1 year ago

At the very least, we should document this limitation.

jjhelmus commented 1 year ago

JupyterLab is an example of a universal wheel that includes .data directories, specifically the .data/data/share and .data/etc directories. These are copied into the root of the conda package.

analog-cbarber commented 1 year ago

Thanks for the reference.

analog-cbarber commented 10 months ago

Looking at the JupyterLab example, there is a 'data' directory that is not itself copied. The spec says nothing about there being a 'data/' subdirectory, nor does it explain how the subcomponents get mapped to their paths. It just says:

The .data directory contains subdirectories with the scripts, headers, documentation and so forth from the distribution. During installation the contents of these subdirectories are moved onto their destination paths.

It appears that doing a pip install does copy the etc/ and share/ files to the corresponding directories in the environment prefix.

Copying the files is straightforward enough, but I would like to get some clarity on whether there will always be a data/ dir inside the <package>.data/ directory and, if not, what that means.

analog-cbarber commented 10 months ago

It really is not documented in the wheel spec, but it appears that the .data dir should contain one or more subdirs with one of the recognized path names. These are described in the sysconfig module doc: https://docs.python.org/3/library/sysconfig.html. Each path can be looked up using sysconfig.get_paths().

>>> import json, sysconfig
>>> print(json.dumps(sysconfig.get_paths(), indent=2))
{
  "stdlib": "/Users/Christopher.Barber/miniconda3/lib/python3.11",
  "platstdlib": "/Users/Christopher.Barber/miniconda3/lib/python3.11",
  "purelib": "/Users/Christopher.Barber/miniconda3/lib/python3.11/site-packages",
  "platlib": "/Users/Christopher.Barber/miniconda3/lib/python3.11/site-packages",
  "include": "/Users/Christopher.Barber/miniconda3/include/python3.11",
  "platinclude": "/Users/Christopher.Barber/miniconda3/include/python3.11",
  "scripts": "/Users/Christopher.Barber/miniconda3/bin",
  "data": "/Users/Christopher.Barber/miniconda3"
}

So the data key corresponds to the prefix root (or the root of the conda package) and the scripts key corresponds to the bin directory (at least on Mac/Linux). The others are locations inside of the python install on Mac/Linux. Using the Windows ("nt") scheme you get:

>>> print(json.dumps(sysconfig.get_paths("nt"), indent=2))
{
  "stdlib": "/Users/Christopher.Barber/miniconda3/Lib",
  "platstdlib": "/Users/Christopher.Barber/miniconda3/Lib",
  "purelib": "/Users/Christopher.Barber/miniconda3/Lib/site-packages",
  "platlib": "/Users/Christopher.Barber/miniconda3/Lib/site-packages",
  "include": "/Users/Christopher.Barber/miniconda3/Include",
  "platinclude": "/Users/Christopher.Barber/miniconda3/Include",
  "scripts": "/Users/Christopher.Barber/miniconda3/Scripts",
  "data": "/Users/Christopher.Barber/miniconda3"
}

So we can only support the data key in noarch packages (unless we want to copy to both bin and Scripts?).

So far I have only found wheels using the data key since most script installs are handled using entry points.
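The comparison of the two scheme listings above can also be done programmatically. This is a minimal sketch, assuming the standard `posix_prefix` and `nt` install schemes are both available in the running interpreter; `portable_keys` is a hypothetical helper name. It compares the unexpanded path templates, so the result does not depend on where Python is installed.

```python
import sysconfig


def portable_keys(scheme_a: str = "posix_prefix", scheme_b: str = "nt") -> list[str]:
    """Return the path keys whose templates are identical in both schemes.

    With expand=False, sysconfig returns templates such as "{base}/bin"
    rather than absolute paths.
    """
    paths_a = sysconfig.get_paths(scheme_a, expand=False)
    paths_b = sysconfig.get_paths(scheme_b, expand=False)
    return sorted(key for key in paths_a if paths_a[key] == paths_b.get(key))


if __name__ == "__main__":
    # Keys listed here map to the same relative location on both platforms.
    print(portable_keys())
```

Any key that does not appear in the result (for example scripts, which is bin in one scheme and Scripts in the other) would need platform-specific handling.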

analog-cbarber commented 10 months ago

So I think the plan should be to process any *.data directories (there really will only be one, but pip install seems to accept multiple ones), and move any files under data/ to the root of the conda package and issue warnings or errors if there is anything else there.

analog-cbarber commented 10 months ago

I think that the scripts directory probably maps to python-scripts in the conda package.

analog-cbarber commented 10 months ago

Fixed in 24.1.1