ptheywood opened this issue 3 years ago
Conda packages do not have an equivalent to the `extras` / `extras_require` feature of pypi/wheel packages, which we are considering for optional dependencies. There are open issues on the conda repo to add this feature, but the general consensus so far is to use conda metapackages.
A conda metapackage is a package which contains no files, only metadata (i.e. it just depends on packages x and y). I.e. we could have the core `pyflamegpu` package, and a `pyflamegpu-visualisation` metapackage which depends on the core `pyflamegpu` and the visualisation component? I don't fully understand this yet, but it's worth making a note of.
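To make that concrete, a minimal sketch of what such a metapackage's `meta.yaml` might look like (all names and versions here are hypothetical placeholders, not decisions):

```yaml
# Hypothetical metapackage recipe: ships no files, only dependency metadata.
package:
  name: pyflamegpu-visualisation   # hypothetical name
  version: "2.0.0"                 # would track the core package's version

build:
  number: 0
  noarch: generic                  # nothing platform-specific to ship

requirements:
  run:
    - pyflamegpu 2.0.0             # the core package
    - pyflamegpu-vis 2.0.0         # hypothetical visualisation component
```

Installing the metapackage would then pull in both the core package and the visualisation component.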
Conda package names must match `[a-z0-9_\.\-]`, so `pyflamegpu`, plus `pyflamegpu-vis` or `pyflamegpu_vis` if we have more than one package, would be OK.

Package files are named `<package_name>-<version>-<build>.(tar.bz2|conda)`. The newer `.conda` format is much faster to extract, and is typically smaller than the older `.tar.bz2` package format.

For example, for pytorch:

- `conda install pytorch cudatoolkit=11.1 -c pytorch -c nvidia` installs `linux-64/pytorch-1.9.0-py3.8_cuda11.1_cudnn8.0.5_0.tar.bz2` on linux.
- `conda install pytorch cudatoolkit=10.2 -c pytorch` installs `win-64/pytorch-1.9.0-py3.8_cuda10.2_cudnn7_0.tar.bz2` from windows.
- `conda install pytorch cpuonly -c pytorch` installs `win-64/pytorch-1.9.0-py3.6_cpu_0.tar.bz2` from a windows platform.
`cpuonly` here is a metapackage that influences which build of pytorch is installed. The `track_features` component of `meta.yaml` looks relevant.

`conda install pytorch cudatoolkit=10.2 -c pytorch` will install the `pytorch` package from the `pytorch` channel, which depends on version `10.2` of the `cudatoolkit` package (which is provided by the `nvidia` channel).
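A rough sketch of how that mechanism appears in a recipe (the feature name is hypothetical): the lower-priority variant declares a tracked feature, which the solver tries to minimise, so the other variants win by default:

```yaml
# Hypothetical build section for a CPU-only variant. Declaring a
# track_feature de-prioritises this build relative to builds that do not
# declare one (this is how pytorch's cpu builds lose to the gpu builds).
build:
  number: 0
  string: cpu_0
  track_features:
    - pyflamegpu-cpuonly   # hypothetical feature name; must be unique in the subdir
```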
We could use `flamegpu` or `pyflamegpu` as the channel name. Probably `flamegpu`, in case we provide bindings for R which could also be distributed through conda.

Channels contain platform subdirectories such as `linux-64`, `win-64` and `osx-64`. `noarch` is used for platform-independent (non-binary) packages.
- The default label is `main`.
- Labels can be used to separate builds at different stages of development.
- When packages are uploaded to an index, they can be assigned one or more labels.
- When users install packages from an index, they will only access the `main` (or only?) label by default.
- If there were a `test` label, users would use `-c channel/label/test` when searching / installing rather than `-c channel`.
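For example, users could opt in to a label either via `-c` or in their `.condarc` (channel and label names here are hypothetical):

```yaml
# Hypothetical .condarc opting in to a "test" label on a flamegpu channel.
channels:
  - flamegpu/label/test   # packages uploaded under the "test" label
  - flamegpu              # the default "main" label
  - defaults
```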
This might be how to distribute `alpha`/`beta`/`rc` packages with conda, possibly with a single `prerelease` label for all of them, although I'm unsure how conda's version comparison rules would decide which is latest if an exact version was not provided. Alternatively:

- An `rc` label for `beta` and `rc` releases, used when there are no new features to be added to that release, just bug-fixes.
- A `dev` label for pre-alpha and alpha builds, i.e. for versions that will have significant changes prior to a stable (or beta/rc) release.

Users would then opt in via the corresponding `-c <channel>` options. I don't expect that we will be submitting to conda-forge, at least not initially. Alternatively, pytorch use a separate channel for nightly builds, but this doesn't fit our current needs.
These are just some key points from scanning the referenced source. Not comprehensive.
- The `source` section needs to refer to the `git_rev` and/or include hashes of a tarball download. This might mean that we have to generate this in a post-release step (i.e. we can't include this information in a file in the repository). It's a bit chicken-and-egg-ish, though performing the builds as part of the on-tag-push CI would have this information available (assuming it works, so we don't have to force push over it). Probably fine, just needs some thought.
- `track_features` is a way of de-prioritising packages when there are multiple competing versions, and no two packages in a subdirectory should have the same track_feature. I.e. this is how the CPU version of pytorch is de-prioritised compared to the gpu variants (if the cuda-toolkit dependency is met?).
- `build: no_link:` is used to enforce copying rather than linking of files.
- `build: noarch` is used for architecture-independent packages. This would only be relevant for metapackages, if we use them.
- `requirements` of packages are split into `build`, `host` and `run` requirements, with complex rules about how a `build` requirement may implicitly add `host` and `run` requirements.
  - `git` and `cmake` would be some of the entries in `build:` in our case.
  - Some dependencies may belong in the `host` section rather than `build`, for portable packages?
- The `test:` section provides details of how to test the package, including the dependencies, commands and python imports required. Alternatively a script can be referenced to handle this.
- `version: {{ GIT_DESCRIBE_TAG }}` can be used to populate the version from git (see the sketch below).
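Pulling those notes together, a very rough, unverified sketch of what a recipe's `meta.yaml` could contain (all names, versions and pins are placeholders):

```yaml
# Rough sketch only; every name/version below is a placeholder.
package:
  name: pyflamegpu
  version: "{{ GIT_DESCRIBE_TAG }}"    # version templated from the git tag

source:
  git_url: https://github.com/FLAMEGPU/FLAMEGPU2.git
  git_rev: "{{ GIT_DESCRIBE_TAG }}"    # or a url: + sha256: of a release tarball

requirements:
  build:
    - git
    - cmake
    - {{ compiler('cxx') }}            # assumption: conda-provided compilers
  host:
    - python
    - cudatoolkit 11.*                 # placeholder CUDA pin
  run:
    - python
    - cudatoolkit 11.*

test:
  imports:
    - pyflamegpu                       # smoke test: the module imports
```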
Rapids provide a `cudf` package which is the python interface, which depends on `libcudf` (plus alternate builds with optional extras such as kafka). We could use a similar layout:

conda/
└── recipes
├── libflamegpu
│ ├── bld.bat
│ ├── build.sh
│ └── meta.yaml
└── pyflamegpu
├── bld.bat
├── build.sh
└── meta.yaml
And maybe visualisation variants of the above, and/or console versions too, depending on how splitting that goes.
The pyflamegpu package metadata would depend on the libflamegpu metadata (if we package both on conda). Alternatively we could just provide pyflamegpu if we do not distribute libflamegpu separately.
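If we do package both, the dependency might be expressed along these lines (a sketch; the names and pinning strategy are placeholders):

```yaml
# Hypothetical fragment of pyflamegpu's meta.yaml: depend on the
# matching libflamegpu package, pinned to the same version.
requirements:
  run:
    - libflamegpu 2.0.0   # kept in lockstep with pyflamegpu's own version
```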
Conda is looking more viable / more likely to be our chosen method of distribution, as there are GL-related conda packages we might be able to leverage to avoid redistributing them ourselves, if that's even required for conda binary distribution. Pip/pypi is looking less and less viable (for visualisation distribution).
If we are to make a conda package, we will need to tweak the DLL loading logic for windows + python 3.8+ when it is patched to use `add_dll_directory` when finding nvrtc etc., as conda doesn't behave the same as cpython here.
The conda-forge maintainers documentation has some very useful info about making packages. I doubt we'll push to conda-forge immediately, instead using our own channel, but most of this still applies and doesn't look too bad.
CUDA 12.x needs handling differently to 11.x it seems, in a good way (can specify the parts of cuda needed, so installs are much lighter).
https://github.com/conda-forge/cuda-version-feedstock/blob/main/recipe/README.md
Rapids/pytorch are probably a good source of examples for how to deal with that.
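For illustration, using the CUDA 12 style packages described in that README (the exact component list and pins here are assumptions), a recipe could depend on just the pieces it needs:

```yaml
# Hypothetical CUDA 12.x requirements: constrain via cuda-version and
# pull only the needed components instead of the whole toolkit.
requirements:
  host:
    - cuda-version 12.*
    - cuda-nvrtc-dev       # nvrtc headers/libraries at build time
  run:
    - cuda-nvrtc           # runtime nvrtc only, so installs are much lighter
```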
> If we are to make a conda package, we will need to tweak the DLL loading logic for windows + python 3.8+ when it is patched to use `add_dll_directory` when finding nvrtc etc., as conda doesn't behave the same as cpython.

Best evidence I can find is that we wouldn't require that change.
Conda is an alternative python package distribution mechanism to pip/pypi. It is not as widely used, as `pip` and pypi are the de facto defaults, but it is widely used in the scientific community. The main advantage of conda over pip is that it can manage non-python dependencies, such as the cuda toolkit.

Several large/popular CUDA + python packages, including rapids and torch, prefer conda as their package distribution mechanism due to some of the features it offers compared to pypi. Conda allows packages to depend on nvidia-provided `cudatoolkit` packages, with versions specified, unlike pypi; this allows users to request the latest version of a package built with CUDA X.Y, which is not possible using pypi + local version `+cudaXY` labels (see the sketch below).

The conda EULA should also be considered, as this was changed in 2020? to be less favourable than before, but in practice this shouldn't be an issue.
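For example, something like the following `environment.yml` (package and channel names hypothetical) requests a build against a specific CUDA version, which has no pypi equivalent:

```yaml
# Hypothetical environment: ask for pyflamegpu built against CUDA 11.2.
name: flamegpu-example
channels:
  - flamegpu    # hypothetical channel
  - nvidia
dependencies:
  - python=3.9
  - cudatoolkit=11.2
  - pyflamegpu
```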
I'm not yet familiar enough with the conda packaging process to weigh in on how much effort this would be with our cmake/swig based setup, as most examples are of mostly pure python packaging, but things to consider would be:
- The `conda convert` command may be relevant.
- Conda does not provide `libcuda.so` (it comes from the driver), so (at least some of) #647 will still be required.
- See #605 for some past notes/discussion.