Not sure if/how/what we'll formalize into CEP(s), but I'm starting to think about how we could use `conda` to deliver large data sets and other "non-code" binary blobs. Think reference genomes for bioconda packages, corpora for `nltk`, pre-trained models for `${your favorite new gen AI package}`, etc.
We can currently do things like generating "huge" (multi-GB) packages or (ab)using post-link/activation scripts to run `wget some-suspect-url`, but I'm wondering if, as a community, we can come up with more clever solutions.
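For concreteness, here is a minimal sketch of the post-link "(ab)use" pattern described above. The URL, checksum, and destination path are hypothetical placeholders; the only conda-specific parts are that `$PREFIX` is set while `post-link.sh` runs and that user-facing messages go to `$PREFIX/.messages.txt`:

```sh
#!/bin/bash
# post-link.sh -- runs at install time; fetches data the package
# itself doesn't ship. URL and checksum below are placeholders.
set -euo pipefail

DATA_URL="https://example.org/reference-genome.fa.gz"  # hypothetical
EXPECTED_SHA256="0000...0000"                          # hypothetical
DEST="${PREFIX}/share/mypkg/reference-genome.fa.gz"

mkdir -p "$(dirname "$DEST")"
wget -q -O "$DEST" "$DATA_URL"

# Verify the download so a flaky mirror can't silently corrupt the env.
echo "${EXPECTED_SHA256}  ${DEST}" | sha256sum -c - >/dev/null || {
    echo "checksum mismatch for ${DEST}" >> "${PREFIX}/.messages.txt"
    rm -f "$DEST"
    exit 1
}
```

Even in sketch form this shows the downsides: the solver and `conda` itself know nothing about the payload, failures surface only at link time, and the data bypasses channel mirroring and caching entirely.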