Closed analog-cbarber closed 10 months ago
At least record limitation
Jupyterlab is an example of a universal wheel which includes .data directories. Specifically
Thanks for the reference.
So looking at the jupyterlab example, there is a 'data' directory that is not itself copied. The spec says nothing about there being a 'data/' subdirectory, nor does it explain how the subcomponents get mapped to their paths. It just says:
The .data directory contains subdirectories with the scripts, headers, documentation and so forth from the distribution. During installation the contents of these subdirectories are moved onto their destination paths.
It appears that doing a pip install does copy the etc/ and share/ files to the corresponding directories in the environment prefix.
Copying the files is straightforward enough, but I would like to get some clarity on whether there is always going to be a data/ dir inside the <package>.data/ directory and, if not, what that means.
It really is not documented in the wheel spec, but it appears that the .data dir should contain one or more subdirs, each named after one of the recognized path names. These are described in the sysconfig module doc: https://docs.python.org/3/library/sysconfig.html.
Each path can be looked up using sysconfig.get_paths().
>>> import json, sysconfig
>>> print(json.dumps(sysconfig.get_paths(), indent=2))
{
  "stdlib": "/Users/Christopher.Barber/miniconda3/lib/python3.11",
  "platstdlib": "/Users/Christopher.Barber/miniconda3/lib/python3.11",
  "purelib": "/Users/Christopher.Barber/miniconda3/lib/python3.11/site-packages",
  "platlib": "/Users/Christopher.Barber/miniconda3/lib/python3.11/site-packages",
  "include": "/Users/Christopher.Barber/miniconda3/include/python3.11",
  "platinclude": "/Users/Christopher.Barber/miniconda3/include/python3.11",
  "scripts": "/Users/Christopher.Barber/miniconda3/bin",
  "data": "/Users/Christopher.Barber/miniconda3"
}
So the data key corresponds to the prefix root (or the root of the conda package) and the scripts key corresponds to the bin directory (at least on Mac/Linux). The others are locations inside the Python install on Mac/Linux. On Windows you get:
>>> print(json.dumps(sysconfig.get_paths("nt"), indent=2))
{
  "stdlib": "/Users/Christopher.Barber/miniconda3/Lib",
  "platstdlib": "/Users/Christopher.Barber/miniconda3/Lib",
  "purelib": "/Users/Christopher.Barber/miniconda3/Lib/site-packages",
  "platlib": "/Users/Christopher.Barber/miniconda3/Lib/site-packages",
  "include": "/Users/Christopher.Barber/miniconda3/Include",
  "platinclude": "/Users/Christopher.Barber/miniconda3/Include",
  "scripts": "/Users/Christopher.Barber/miniconda3/Scripts",
  "data": "/Users/Christopher.Barber/miniconda3"
}
So we can only support the data key in noarch packages, since it is the only key that maps to the same location on every platform (unless we want to copy to both bin and Scripts?).
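One way to double-check which keys are platform-independent is to compare the unexpanded path templates of the POSIX and Windows install schemes. This is only a sketch: `portable_wheel_keys` is a made-up helper name, and it assumes the `posix_prefix` and `nt` schemes are both registered in the running interpreter (they are in stock CPython).

```python
import sysconfig


def portable_wheel_keys(schemes=("posix_prefix", "nt")):
    """Return the path keys whose unexpanded template is identical in every scheme.

    Hypothetical helper, not part of any existing tool.
    """
    # expand=False returns templates such as "{base}/bin" instead of real paths,
    # so the comparison does not depend on where Python is installed.
    templates = [sysconfig.get_paths(scheme, expand=False) for scheme in schemes]
    common = set(templates[0]).intersection(*templates[1:])
    return sorted(key for key in common if len({t[key] for t in templates}) == 1)


if __name__ == "__main__":
    for scheme in ("posix_prefix", "nt"):
        print(scheme, sysconfig.get_path("scripts", scheme, expand=False))
    print("same on both:", portable_wheel_keys())
```

The `scripts` template differs between the two schemes (bin vs. Scripts), while `data` resolves to the bare prefix in both, which matches the listings above.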
So far I have only found wheels using the data key, since most script installs are handled using entry points.
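To check which keys a particular wheel actually uses, something like the following could be run against the .whl file. It is a rough sketch: `data_keys_in_wheel` is a hypothetical name, and it assumes the layout `<name>-<version>.data/<key>/...` inferred above rather than anything the spec spells out.

```python
import zipfile
from collections import Counter


def data_keys_in_wheel(wheel):
    """Count files per key directory under any *.data/ directory in a wheel.

    `wheel` may be a filesystem path or a binary file-like object.
    Hypothetical helper for inspection only.
    """
    counts = Counter()
    with zipfile.ZipFile(wheel) as zf:
        for name in zf.namelist():
            parts = name.split("/")
            # Assumed layout: <name>-<version>.data/<key>/<relative path>
            # parts[-1] is empty for directory entries, which are skipped.
            if len(parts) >= 3 and parts[0].endswith(".data") and parts[-1]:
                counts[parts[1]] += 1
    return dict(counts)
```

A result containing only a `data` entry would mean the wheel has nothing but `data/` files to relocate.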
So I think the plan should be to process any *.data directories (there should really only be one, but pip install seems to accept several), move any files under data/ to the root of the conda package, and issue warnings or errors if there is anything else there.
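A minimal sketch of that plan, assuming the wheel has already been unpacked into a directory. The function name `relocate_data_dirs` and its arguments are invented for illustration and are not taken from the actual fix.

```python
import shutil
import warnings
from pathlib import Path


def relocate_data_dirs(unpacked_root, prefix):
    """Move files under <pkg>.data/data/ into `prefix`; warn about other keys.

    Returns the relative paths (POSIX style) of the files that were moved.
    Hypothetical sketch, not the real implementation.
    """
    unpacked_root, prefix = Path(unpacked_root), Path(prefix)
    moved = []
    for data_dir in sorted(unpacked_root.glob("*.data")):
        if not data_dir.is_dir():
            continue
        for key_dir in sorted(data_dir.iterdir()):
            if key_dir.name != "data":
                # scripts, headers, purelib, platlib are not handled here.
                warnings.warn(
                    f"unsupported wheel data key {key_dir.name!r} in {data_dir.name}"
                )
                continue
            # Collect the file list first, then move, so the tree is not
            # modified while it is being walked.
            for src in sorted(p for p in key_dir.rglob("*") if p.is_file()):
                rel = src.relative_to(key_dir)
                dest = prefix / rel
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(src), str(dest))
                moved.append(rel.as_posix())
    return moved
```

Whether an unsupported key should be a warning or a hard error is left open here; swapping `warnings.warn` for a raised exception would give the stricter behavior.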
I think that the scripts directory probably maps to python-scripts in the conda package.
Fixed in 24.1.1
See https://packaging.python.org/en/latest/specifications/binary-distribution-format/#the-data-directory
We do not do anything about this now.
Haven't seen this in the wild yet. Not sure what conda-build does.