kalininalab / DataSAIL

DataSAIL is a tool to split datasets while reducing information leakage.
https://datasail.readthedocs.io
MIT License

Trouble installing using Mamba #16

Closed cwinsnes closed 8 months ago

cwinsnes commented 10 months ago

Issue description

I am having trouble installing DataSAIL through mamba. It is unable to find the package datasail because it "does not exist".

I run the command

mamba create -n sail -c conda-forge -c kalininalab -c bioconda DataSAIL

This yields the following output:

[... output regarding channels ...]
Could not solve for environment specs
The following package could not be installed
└─ datasail does not exist (perhaps a typo or a missing channel).

The problem persists whether or not I add -c mosek to the create command.

System specifications

mamba 1.5.5
conda 23.11.0
python 3.10.13

Running on a Debian Bullseye (Debian 11.8) Docker image, with macOS Sonoma (M2 chip) as the host system.

It is not possible to install the package on the macOS system directly because the package cd-hit is unavailable on that platform.

Old-Shatterhand commented 9 months ago

Hey @cwinsnes,

please excuse the late response. The problem is that all clustering algorithms (mmseqs, cd-hit, foldseek, and mash) only have builds for osx and linux-64, which is funny because, to my knowledge, they can be installed from source on M2 chips. I'm working on a solution to provide datasail without the clustering algorithms (at least for win-64 and M2 chips).

In the meantime, there are two solutions: either switch to osx (or disable the M2 chip somehow; there's a way to get this done, but I'm not a Mac user, so I don't know how), or install datasail from source by cloning the repo, removing the problematic dependencies from the meta.yaml, and building it locally.
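The "remove the problematic dependencies" step of the from-source route could be sketched roughly as follows. This is only an illustration, not the project's official procedure: the helper name strip_clustering_deps, the recipe path, the clone URL, and the exact package names (for example mmseqs2) are assumptions and should be checked against the actual meta.yaml.

```shell
# Hypothetical helper: print a conda recipe without the clustering tools.
# The package names are assumptions based on the tools named in this thread;
# compare them with the real meta.yaml before relying on this.
strip_clustering_deps() {
    # $1: path to the recipe file; the filtered recipe goes to stdout.
    # Only whole dependency entries are dropped, so e.g. "mashtree" is kept.
    grep -v -E '^[[:space:]]*-[[:space:]]*(mmseqs2?|cd-hit|foldseek|mash)([[:space:]<>=].*)?$' "$1"
}

# Example usage (URL and paths are placeholders, not verified):
#   git clone https://github.com/kalininalab/DataSAIL.git
#   strip_clustering_deps DataSAIL/recipe/meta.yaml > meta.stripped.yaml
```

The filtered recipe would then replace the original meta.yaml before building locally, for example with conda-build.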

cwinsnes commented 9 months ago

I thought working within the Docker container would alleviate that issue as it's Linux x64.

If compiling from source is the way to go, I'll try that.

Old-Shatterhand commented 9 months ago

That's actually a good point. Intuitively, I'd also assume mamba should go for the linux-64 build. Maybe some interference between the M2 chip and Docker?

I have another idea: Could it be a problem with Python? You're not specifying the version, so mamba may fall back to some default version, maybe python3.7? Can you try running

mamba create -n sail -c conda-forge -c bioconda -c kalininalab python=3.10 datasail

cwinsnes commented 9 months ago

The problem unfortunately persists when I run the command with python=3.10.

Could definitely be some interference between the M2 chip and Docker. Not sure how to approach it otherwise.
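One way to test the Docker/M2 hypothesis would be to check which platform the container itself reports. On Apple Silicon hosts, Docker typically runs arm64 Linux images unless another platform is requested, in which case conda and mamba would look for linux-aarch64 builds rather than linux-64. The sketch below maps the output of uname to the corresponding conda platform name; the function name conda_subdir is hypothetical and only covers the platforms discussed in this thread.

```shell
# Hypothetical helper: map kernel name and machine type to the conda
# platform ("subdir") that package builds are selected for.
conda_subdir() {
    # $1: kernel name (defaults to `uname -s`)
    # $2: machine type (defaults to `uname -m`)
    kernel="${1:-$(uname -s)}"
    machine="${2:-$(uname -m)}"
    case "$kernel/$machine" in
        Linux/x86_64)              echo "linux-64" ;;
        Linux/aarch64|Linux/arm64) echo "linux-aarch64" ;;
        Darwin/x86_64)             echo "osx-64" ;;
        Darwin/arm64)              echo "osx-arm64" ;;
        *)                         echo "unknown" ;;
    esac
}

# Print the platform of the machine (or container) this runs on.
conda_subdir
```

If this prints linux-aarch64 inside the container, the container is not actually linux-64, which would explain why no matching build is found. A possible workaround (untested here) is to start the container with Docker's --platform linux/amd64 option so that it runs as x86-64 under emulation.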

Old-Shatterhand commented 9 months ago

Starting from v1, we will provide a datasail-lite package that comes without the clustering tools; these must then be installed separately by the user. This will make installing DataSAIL easier and more lightweight, as only the dependencies that are actually used need to be installed.

Old-Shatterhand commented 8 months ago

This has been solved with commit 5fe4768 and in PR #22.