Bioconductor / basilisk

Clone of the Bioconductor repository for the basilisk package.
https://bioconductor.org/packages/devel/bioc/html/basilisk.html
GNU General Public License v3.0

feature request: mamba #42

Open nick-youngblut opened 2 months ago

nick-youngblut commented 2 months ago

mamba is substantially faster than conda; moreover, a micromamba install is smaller and faster to set up than Miniconda.

As an example, I'm using the FLAMES Bioconductor package, which uses basilisk to create a conda env:

conda create --yes --prefix flames_env 'python=3.10' --quiet -c conda-forge -c bioconda -c defaults

conda install --yes --prefix flames_env 'python=3.10' -c conda-forge -c bioconda -c defaults

conda install --yes --prefix flames_env -c conda-forge -c bioconda -c defaults 'python=3.10' 'python=3.10' 'numpy=1.25.0' 'scipy=1.11.1' 'pysam=0.21.0' 'cutadapt=4.4' 'tqdm=4.64.1' 'pandas=1.3.5'

The preceding was modified for clarity.

Creating the conda env and installing the Python packages took more than 15 minutes, whereas with mamba the install would take roughly 1-2 minutes.
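For comparison, the three conda calls above could be collapsed into a single mamba call. This is only a sketch: it assumes `mamba` is on the PATH and accepts the same `create` flags as conda, it reuses the prefix, channels, and version pins quoted above, and it is not how basilisk itself invokes the solver. The variable names and the `RUN` switch are made up for illustration.

```shell
# Prefix, channels, and pins copied from the conda commands in this comment.
env_prefix="flames_env"
channels="-c conda-forge -c bioconda -c defaults"
packages="python=3.10 numpy=1.25.0 scipy=1.11.1 pysam=0.21.0 cutadapt=4.4 tqdm=4.64.1 pandas=1.3.5"

# Assumption: mamba is installed and mirrors conda's `create` options.
mamba_cmd="mamba create --yes --quiet --prefix $env_prefix $channels $packages"

if [ "${RUN:-0}" = "1" ]; then
  # Word splitting is intentional here: no argument contains spaces.
  $mamba_cmd
else
  # Default is a dry run that only prints the command.
  echo "$mamba_cmd"
fi
```

Running the script as-is only prints the command; setting `RUN=1` in the environment executes it.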

LTLA commented 2 months ago

Check out the discussion at Bioconductor/basilisk.utils#11; from what I understand, the miniforge setup uses mamba under the hood. The idea is to have this become the default in the next release, assuming that the inevitable struggles on Windows can be resolved.

nick-youngblut commented 2 months ago

> The idea is to have this become the default in the next release

That's great! Thanks for the info.

> inevitable struggles on Windows can be resolved

Does anyone actually do bioinformatics on Windows machines? ...especially when running Linux VMs on a cloud service is so easy and cheap? There's also WSL, so why use Windows?

LTLA commented 2 months ago

> Does anyone actually do bioinformatics on Windows machines? ...especially when running Linux VMs on a cloud service is so easy and cheap? There's also WSL, so why use Windows?

Apparently 50% of BioC package downloads are for the Windows binaries, as of a few years ago; @vjcitn could give more recent statistics. So, not a negligible proportion of users. I'd guess they're probably bench scientists just trying to get something done with whatever machine they were given; it would be a pretty high barrier to entering bioinformatics if they needed a cloud account (I don't even have one) or had to set up WSL (I haven't) to get started.

vjcitn commented 2 months ago

@nick-youngblut Note that you can already get faster installation on non-Windows machines if you set the environment variable

BASILISK_MINICONDA_VERSION=py311_23.11.0-2

See section 3.2 of the basilisk vignette. That version of Miniconda actually uses mamba's solver (libmamba) for dependency resolution, IIUC.

vjcitn commented 2 months ago

To take advantage of this, you might have to remove the cached installation in the directory returned by basilisk.utils::getExternalDir()
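The two steps above could look roughly like this from a shell. It is a sketch under assumptions: the variable has to be exported before R is started so that basilisk sees it, `Rscript` and the basilisk.utils package must be installed for the cache lookup to work, and the `cache_dir` name is purely illustrative. The script only reports the cache location and leaves deletion to the user.

```shell
# Version string taken from the comment above; export it before launching R.
export BASILISK_MINICONDA_VERSION="py311_23.11.0-2"

cache_dir=""
if command -v Rscript >/dev/null 2>&1; then
  # Ask basilisk.utils where the cached installation lives.
  # Falls back to an empty string if R or the package is unavailable.
  cache_dir="$(Rscript -e 'cat(basilisk.utils::getExternalDir())' 2>/dev/null)" || cache_dir=""
fi

if [ -n "$cache_dir" ]; then
  echo "basilisk cache directory: $cache_dir"
  # Inspect the directory first, then remove it by hand if appropriate.
else
  echo "Could not determine the basilisk cache directory (is basilisk.utils installed?)"
fi
```

After the old cache is removed, the next basilisk call should reinstall using the requested Miniconda version.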

nick-youngblut commented 2 months ago

Thanks @vjcitn for the advice!

nick-youngblut commented 1 month ago

> To take advantage of this you might have to remove content in the cache at basilisk.utils::getExternalDir()

Yeah, having to both set BASILISK_MINICONDA_VERSION and check basilisk.utils::getExternalDir() is definitely a bit of a burden for users.