Closed by Ge0rges 6 months ago
I confirmed this occurs in v8 as well.
OK, figured this out. It turns out this is a known issue with CONCOCT: it is no longer compatible with the latest versions of sklearn.
If CONCOCT were installed with Conda this would not be an issue, as the Conda recipe caps the sklearn version. However, that is not the case if one follows the anvi'o instructions. @meren what's the best solution here? Either change the way CONCOCT is installed to use Conda, or change the anvi'o instructions to A) use a Singularity container of CONCOCT (a pain) or B) cap the sklearn version in anvi'o (probably a pain later, since CONCOCT doesn't seem to be maintained). There's always C) do nothing but print a warning.
I confirmed this by doing `pip install scikit-learn==1.1.0` in my anvi'o environment. After that, `anvi-cluster-contigs` completes successfully.
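Option C above (do nothing but print a warning) could look roughly like the following. This is a minimal sketch, not anvi'o code: the helper names are hypothetical, and the cap at the 1.1 series is an assumption based only on the versions reported in this thread (1.1.0 works, 1.2.2 fails).

```python
import re
import warnings
from importlib import metadata

# Assumption: 1.1.x is the newest scikit-learn series reported to work with
# CONCOCT in this thread. The exact breaking version is not confirmed here.
ASSUMED_MAX_SERIES = (1, 1)


def release_series(version):
    """Return (major, minor) from a version string such as '1.2.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def exceeds_cap(version, cap=ASSUMED_MAX_SERIES):
    """True if `version` belongs to a newer release series than `cap`."""
    return release_series(version) > cap


def warn_if_sklearn_too_new():
    """Hypothetical helper: warn when the installed scikit-learn is above the cap."""
    try:
        installed = metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None  # scikit-learn is not installed, so there is nothing to check
    if exceeds_cap(installed):
        warnings.warn(
            f"scikit-learn {installed} is newer than the 1.1 series; CONCOCT may "
            "fail with a TypeError. Consider: pip install scikit-learn==1.1.0"
        )
    return installed
```

Calling `warn_if_sklearn_too_new()` before launching the CONCOCT driver would surface the problem early instead of leaving it buried in a log file.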
Thank you very much for looking into this, @Ge0rges. I'll take a look and see if I can come up with a workaround for this. The current version of sklearn is `1.2.2`. In the worst case scenario we can require `1.1.1`.
> I confirmed this by doing `pip install scikit-learn==1.1.0` in my anvi'o environment. After that, `anvi-cluster-contigs` completes successfully.
I did the same and CONCOCT worked fine, but I wonder whether running `pip install scikit-learn==1.1.0` could break anvi'o rules somewhere else?
Since you were able to do the downgrade, it means the environment is stable. If this version breaks something, you will certainly notice that :) I think you're good.
I am getting a new error with the ecophylo workflow, which was working fine before:
```
RuleException:
TypeError in file /user/suga8254/.conda/envs/anvio-8/lib/python3.10/site-packages/anvio/workflows/ecophylo/Snakefile, line 358:
StringMethods.rsplit() takes from 1 to 2 positional arguments but 3 positional arguments (and 1 keyword-only argument) were given
File "/user/suga8254/.conda/envs/anvio-8/lib/python3.10/site-packages/anvio/workflows/ecophylo/Snakefile", line 358, in __rule_process_hmm_hits
File "/user/suga8254/.conda/envs/anvio-8/lib/python3.10/site-packages/pandas/core/strings/accessor.py", line 136, in wrapper
File "/user/suga8254/.conda/envs/anvio-8/lib/python3.10/concurrent/futures/thread.py", line 58, in run
```
That is why I am wondering!
Hi @Sabrin2020, can you confirm that the error was caused just by changing the scikit-learn version? That is, if you upgrade scikit-learn again, does the workflow work again?
I just did that as a test by going back to `scikit-learn==1.2.2`, and indeed nothing changed: the ecophylo error still persists.
I would open a separate issue with your error with steps to reproduce.
This is weird. Under no circumstances should a change in the scikit-learn version cause an error in Python's thread module. These two things are probably independent :( But as a test, you can reinstall the anvi'o environment from scratch to see if you can reproduce it, @Sabrin2020.
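One way to check whether the two problems are independent is to snapshot package versions before and after the downgrade and compare them. The following is a minimal sketch under stated assumptions: the helper names are hypothetical and not part of anvi'o, and the list of packages to watch is only a suggestion (pandas is included because the traceback above passes through `pandas/core/strings/accessor.py`).

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def changed_packages(before, after):
    """Return {name: (old, new)} for packages whose version differs between snapshots."""
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }


if __name__ == "__main__":
    # Suggested packages to watch; adjust to the environment in question.
    watched = ["scikit-learn", "pandas", "numpy", "scipy"]
    print(installed_versions(watched))
```

Running the script once before and once after `pip install scikit-learn==1.1.0`, then passing the two dictionaries to `changed_packages`, would show whether anything besides scikit-learn moved.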
Thanks @meren @Ge0rges, I will reinstall the anvi'o environment from scratch.
@meren I reinstalled the anvi'o environment and no longer have this error: `StringMethods.rsplit() takes from 1 to 2 positional arguments but 3 positional arguments (and 1 keyword-only argument) were given`
I have not installed CONCOCT in the same environment yet.
@meren it may be useful to add a warning about this somewhere near the CONCOCT installation instructions on the website.
I agree. Since we are no longer doing a lot of genome binning in the lab, those parts of the code and documentation are at the mercy of those who are using them outside :) If someone could formulate a warning text, I could immediately put it somewhere in our installation instructions.
Sure, meant to be somewhere near the CONCOCT install instructions:
Users should note that they may encounter a `TypeError` when running CONCOCT. Please see here for more information about this. Here's the gist of the fix: at the end of your install, and while in your conda environment, run `pip install scikit-learn==1.1.0`. Please let us know if this fix breaks any other part of anvi'o. As of `v8` we don't think it does.
Thank you @Ge0rges. I updated the installation instructions. Now there is a little note that looks like this:
Short description of the problem
This issue is meant to represent the following Discord thread. I too encountered this error and decided to open this since nobody else has. It seems anvi'o is not interacting with CONCOCT properly.
anvi'o version
System info
Using Rocky Linux, installed following the dev instructions on the website.
Detailed description of the issue
In my case I ran `anvi-cluster-contigs -p SAMPLES-MERGED/PROFILE.db -c CONTIGS.db --driver concoct -T 80 --clusters 10 -C METABINS --just-do-it`. I then got a config error from anvi'o complaining that a file was missing. I went to the log and saw:
Files / commands to reproduce the issue
anvi-cluster-contigs -p SAMPLES-MERGED/PROFILE.db -c CONTIGS.db --driver concoct -T 80 --clusters 10 -C METABINS --just-do-it
My files are too big to share unfortunately.