OLC-Bioinformatics / ConFindr

Intra-species bacterial contamination detection
https://olc-bioinformatics.github.io/ConFindr/
MIT License

problem installing the rMLST database with confindr 0.7.4 #32

Closed ysevel closed 1 year ago

ysevel commented 2 years ago

Hi,

I'm trying to install the rMLST database for all organisms, and I have run into an error with confindr version 0.7.4:

(base) y_sevellec@LAPTOP-ESE47E7K:/mnt/c/Users/yseve/Documents/test_univ_Renne$ confindr_database_setup -s login_pubmlst.txt
Traceback (most recent call last):
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 568, in _build_master
    ws.require(requires)
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 886, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 777, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.VersionConflict: (confindr 0.7.4 (/home/y_sevellec/miniconda3/lib/python3.9/site-packages), Requirement.parse('confindr==0.7.0'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/y_sevellec/miniconda3/bin/confindr_database_setup", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3243, in <module>
    def _initialize_master_working_set():
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3226, in _call_aside
    f(*args, **kwargs)
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3255, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 570, in _build_master
    return cls._build_from_requirements(requires)
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 583, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/home/y_sevellec/miniconda3/lib/python3.9/site-packages/pkg_resources/__init__.py", line 772, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'confindr==0.7.0' distribution was not found and is required by the application

Any idea how to solve this problem?

Thank you in advance,

Best regards,

Yann
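
Editor's note: the traceback above reports that the launcher script asks for confindr==0.7.0 while confindr 0.7.4 is what is installed. One plausible reading (an assumption, not confirmed in this thread) is that the launcher in miniconda3/bin was left over from an earlier install. The sketch below shows one way to compare an exact pin with the installed version. The helper names (parse_pin, installed_version, pin_is_satisfied) are hypothetical, and importlib.metadata needs Python 3.8 or newer.

```python
from importlib import metadata


def parse_pin(requirement):
    """Split an exact pin such as 'confindr==0.7.0' into (name, version)."""
    name, sep, version = requirement.partition("==")
    if not sep:
        raise ValueError(f"not an exact pin: {requirement!r}")
    return name.strip(), version.strip()


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def pin_is_satisfied(requirement, installed):
    """True only when the installed version equals the pinned version exactly."""
    _, wanted = parse_pin(requirement)
    return installed == wanted


if __name__ == "__main__":
    pin = "confindr==0.7.0"  # the requirement named in the traceback
    name, wanted = parse_pin(pin)
    found = installed_version(name)
    print(f"launcher wants {name} {wanted}, environment has {found}")
```

If the two versions differ, reinstalling the package into a fresh environment regenerates the launcher, which is what the reply below suggests.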

adamkoziol commented 2 years ago

Hi Yann,

I'm wondering if your problem is from your version of Python.

Could you please try creating a Python 3.7 conda environment to see if it addresses the issue?

conda create -n confindr python=3.7
conda activate confindr
conda install -c bioconda mash=2.3
conda install -c bioconda confindr

Alternatively, if that gives you an error, another user had success installing ConFindr as follows:

conda create -n confindr -c conda-forge -c bioconda python=3.7 mash=2.3
conda activate confindr
conda install -c conda-forge -c bioconda confindr

Please let me know if this helps.

Adam

ysevel commented 2 years ago

Hi Adam,

I can now download the database, but I get a new error when the script tries to combine the files:

2022-02-07 17:04:55 Combining rMLST files...
Traceback (most recent call last):
  File "/home/y_sevellec/miniconda3/envs/confindr_test/bin/confindr_database_setup", line 10, in <module>
    sys.exit(main())
  File "/home/y_sevellec/miniconda3/envs/confindr_test/lib/python3.7/site-packages/confindr_src/database_setup.py", line 270, in main
    args.secret_file)
  File "/home/y_sevellec/miniconda3/envs/confindr_test/lib/python3.7/site-packages/confindr_src/database_setup.py", line 209, in setup_confindr_database
    record.seq._data = record.seq._data.replace('-', '').replace('N', '')
TypeError: a bytes-like object is required, not 'str'
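
Editor's note: the TypeError above is what Python raises when bytes.replace() is called with str arguments, which suggests that record.seq._data is a bytes object in this environment (an inference from the message, not something stated in the thread). The sketch below reproduces the message with plain bytes and shows a replacement that works for either type. The function names are hypothetical, and this is not the change from issue #27 or from the eventual fix, neither of which is shown here.

```python
def reproduce_error():
    """Call bytes.replace() with str arguments and return the error message."""
    try:
        b"ATG-NCA".replace("-", "")
    except TypeError as exc:
        return str(exc)
    return None


def strip_gaps_and_ns(data):
    """Remove '-' and 'N' from sequence data held as either bytes or str."""
    if isinstance(data, (bytes, bytearray)):
        # bytes.replace() only accepts bytes-like arguments
        return bytes(data).replace(b"-", b"").replace(b"N", b"")
    return data.replace("-", "").replace("N", "")


if __name__ == "__main__":
    print(reproduce_error())
    print(strip_gaps_and_ns(b"ATG-NCA"))
    print(strip_gaps_and_ns("ATG-NCA"))
```

Whichever variant is used, the value written to the FASTA file has to end up as plain sequence text; writing the repr of a bytes object would leave quote characters in the output.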

jacquikeane commented 2 years ago

Any progress on this? I am seeing the same error as @ysevel reported above "TypeError: a bytes-like object is required, not 'str'". I have confindr 0.7.4 installed via miniconda and my Python version is 3.7.12. Mash 2.3 is also in the environment.

ysevel commented 2 years ago

Hi, no, I still have no solution for this. Does anyone have an already compiled version of the database somewhere?

annacorreia commented 2 years ago

Hi, I had the same problem and am using Confindr 0.7.4, Python 3.7.12, Mash 2.3. I managed to sort out the problem by editing the file database_setup.py and using Adam's advice from here: https://github.com/OLC-Bioinformatics/ConFindr/issues/27#issuecomment-952268919

ysevel commented 2 years ago

Thank you very much for this tip! The fix works partially: I was able to generate the dataset, but running confindr with --rmlst fails. The cgMLST database works just fine, but with the rMLST databases I get the following error:

Traceback (most recent call last):
  File "/home/genouest/cnrs_umr6553/ysevellec/.conda/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 1067, in confindr
    fasta=args.fasta)
  File "/home/genouest/cnrs_umr6553/ysevellec/.conda/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 647, in find_contamination
    returncmd=True, threads=threads)
  File "/home/genouest/cnrs_umr6553/ysevellec/.conda/envs/confindr/lib/python3.7/site-packages/confindr_src/wrappers/bbtools.py", line 258, in bbduk_bait
    out, err = run_subprocess(cmd)
  File "/home/genouest/cnrs_umr6553/ysevellec/.conda/envs/confindr/lib/python3.7/site-packages/confindr_src/wrappers/bbtools.py", line 16, in run_subprocess
    raise subprocess.CalledProcessError(x.returncode, cmd=command)
subprocess.CalledProcessError: Command 'bbduk.sh in=/groups/geh/MGItransfert/L04/V350045618_L04_10_1.fq.gz outm=/groups/geh/confindr/L04_contam/V350045618_L04_10_1/rmlst.fastq.gz ref=/scratch/ysevellec/script/confindr_db/Bacillus_db.fasta threads=16' returned non-zero exit status 1.

java.lang.Exception: An input file appears to be misformatted: The character with ASCII code 39 appeared where a base was expected: ''' Sequence #0 Sequence ID: 'BACT000001_693' Sequence: '[65, 84, 71, 65, 67, 65, 71, 65, 71, 71, 65, 65, 65, 84, 71, 65, 65, 84, 67, 65, 65, 65, 84, 84, 71, 65, 84, 71, 84, 84, 67, 65, 65, 71, 84, 71, 67, 67, 65, 71, 65, 71, 71, 84, 84, 71, 71, 65, 71, 65, 84, 71, 84, 65, 71, 84, 65, 65, 65, 65, 71, 71, 71, 65, 84, 84, 71, 84, 71, 65, 67, 65, 65, 65, 71, 71, 84, 65, 71, 65, 71, 71, 65, 67, 65, 65, 71, 67, 65, 84, 71, 84, 65, 71, 65, 84, 71, 84, 67, 71, 65, 65, 65, 84, 84, 71, 84, 67, 65, 65, 84, 71, 84, 67, 65, 65, 65, 67, 65, 71, 84, 67, 67, 71, 71, 65, 65, 84, 67, 65, 84, 84, 67, 67, 65, 65, 84, 67, 65, 71, 84, 71, 65, 65, 84, 84, 65, 84, 67, 65, 65, 71, 84, 67, 84, 84, 67, 65, 84, 71, 84, 65, 71, 65, 71, 65, 65, 65, 71, 67, 65, 84, 67, 71, 71, 65, 84, 71, 84, 67, 71, 84, 65, 65, 65, 65, 71, 84, 84, 71, 65, 67, 71, 65, 67, 71, 65, 71, 67, 84, 84, 71, 65, 67, 67, 84, 71, 65, 65, 65, 71, 84, 65, 65, 67, 65, 65, 65, 65, 71, 84, 71, 71, 65, 65, 71, 65, 67, 71, 65, 84, 71, 67, 84, 84, 84, 71, 65, 84, 84, 84, 84, 65, 84, 67, 84, 65, 65, 65, 67, 71, 84, 71, 67, 67, 71, 84, 84, 71, 65, 84, 71, 67, 84, 71, 65, 67, 67, 71, 67, 71, 67, 84, 84, 71, 71, 71, 65, 65, 71, 65, 67, 67, 84, 84, 71, 65, 65, 65, 65, 65, 65, 65, 65, 84, 84, 67, 71, 65, 67, 65, 67, 65, 65, 65, 65, 71, 65, 65, 71, 84, 71, 84, 84, 84, 71, 65, 65, 71, 67, 84, 71, 65, 65, 71, 84, 71, 65, 65, 65, 71, 65, 84, 71, 84, 71, 71, 84, 71, 65, 65, 65, 71, 71, 67, 71, 71, 84, 67, 84, 67, 71, 84, 67, 71, 84, 84, 71, 65, 84, 65, 84, 67, 71, 71, 67, 71, 84, 84, 67, 71, 67, 71, 71, 67, 84, 84, 67, 65, 84, 84, 67, 67, 67, 71, 67, 65, 84, 67, 65, 67, 84, 84, 71, 84, 67, 71, 65, 65, 71, 67, 84, 67, 65, 84, 84, 84, 67, 71, 84, 67, 71, 65, 71, 71, 65, 84, 84, 84, 67, 65, 67, 84, 71, 65, 67, 84, 65, 84, 65, 65, 65, 71, 71, 71, 65, 65, 65, 65, 67, 67, 67, 84, 67, 84, 67, 84, 67, 84, 84, 65, 84, 
67, 71, 84, 67, 71, 84, 84, 71, 65, 65, 67, 84, 67, 71, 65, 67, 67, 71, 84, 71, 65, 84, 65, 65, 65, 65, 65, 67, 67, 71, 71, 71, 84, 71, 65, 84, 84, 67, 84, 71, 84, 67, 65, 67, 65, 67, 67, 71, 71, 71, 67, 84, 71, 84, 65, 71, 84, 71, 71, 65, 65, 65, 65, 65, 71, 65, 65, 67, 65, 71, 65, 67, 65, 71, 67, 67, 65, 65, 65, 65, 65, 71, 67, 65, 84, 71, 65, 84, 84, 84, 84, 67, 84, 67, 67, 65, 65, 65, 67, 71, 67, 84, 84, 71, 65, 71, 71, 84, 84, 71, 71, 65, 65, 71, 67, 71, 84, 71, 67, 84, 84, 71, 65, 65, 71, 71, 65, 65, 65, 65, 71, 84, 71, 67, 65, 71, 67, 71, 84, 67, 84, 71, 65, 67, 84, 71, 65, 84, 84, 84, 67, 71, 71, 67, 71, 67, 65, 84, 84, 84, 71, 84, 67, 71, 65, 67, 65, 84, 67, 71, 71, 67, 71, 71, 65, 65, 84, 84, 71, 65, 67, 71, 71, 71, 67, 84, 84, 71, 84, 84, 67, 65, 84, 65, 84, 84, 84, 67, 71, 67, 65, 71, 67, 84, 71, 84, 67, 84, 67, 65, 84, 84, 67, 65, 67, 65, 67, 71, 84, 67, 71, 65, 65, 65, 65, 65, 67, 67, 71, 84, 67, 84, 71, 65, 67, 71, 84, 71, 71, 84, 71, 71, 65, 65, 71, 65, 65, 71, 71, 84, 67, 65, 71, 71, 65, 67, 71, 84, 65, 65, 65, 65, 71, 84, 71, 65, 65, 65, 71, 84, 65, 67, 84, 71, 84, 67, 67, 71, 84, 65, 71, 65, 67, 67, 71, 67, 71, 65, 84, 65, 65, 67, 71, 65, 65, 67, 71, 84, 65, 84, 84, 84, 67, 84, 84, 84, 65, 84, 67, 84, 65, 84, 84, 65, 65, 65, 71, 65, 65, 65, 67, 71, 67, 84, 71, 67, 67, 71, 71, 71, 65, 67, 67, 84, 84, 71, 71, 65, 71, 67, 67, 65, 71, 71, 84, 67, 71, 71, 67, 71, 65, 65, 65, 65, 71, 71, 84, 65, 65, 65, 65, 67, 65, 71, 71, 71, 65, 71, 65, 84, 71, 84, 71, 67, 84, 84, 71, 65, 65, 71, 71, 65, 65, 65, 65, 71, 84, 71, 67, 65, 71, 67, 71, 67, 67, 84, 84, 71, 84, 65, 65, 71, 67, 84, 84, 67, 71, 71, 67, 71, 67, 67, 84, 84, 67, 71, 84, 84, 71, 65, 65, 65, 84, 84, 67, 84, 84, 67, 67, 71, 71, 71, 67, 71, 84, 71, 71, 65, 65, 71, 71, 65, 67, 84, 84, 71, 84, 71, 67, 65, 67, 65, 84, 84, 84, 67, 71, 67, 65, 65, 65, 84, 84, 84, 67, 67, 65, 65, 84, 65, 65, 65, 67, 65, 84, 65, 84, 67, 71, 71, 65, 65, 67, 71, 67, 67, 71, 67, 65, 84, 71, 65, 65, 71, 84, 71, 67, 84, 84, 71, 
65, 65, 71, 65, 65, 71, 71, 65, 67, 65, 71, 65, 67, 84, 71, 84, 84, 65, 65, 65, 71, 84, 71, 65, 65, 65, 71, 84, 71, 67, 84, 84, 71, 65, 67, 71, 84, 71, 65, 65, 67, 71, 65, 65, 65, 71, 67, 71, 65, 65, 71, 65, 71, 67, 71, 67, 65, 84, 84, 84, 67, 67, 84, 84, 65, 65, 71, 67, 65, 84, 71, 67, 71, 84, 71, 65, 71, 67, 84, 84, 71, 65, 65, 71, 65, 65, 71, 67, 71, 67, 67, 71, 65, 65, 65, 71, 67, 67, 71, 65, 84, 67, 65, 71, 71, 65, 71, 71, 65, 67, 84, 84, 67, 67, 71, 67, 67, 65, 65, 84, 65, 67, 67, 65, 71, 71, 67, 71, 65, 65, 65, 71, 65, 65, 71, 65, 71, 67, 67, 71, 65, 71, 67, 65, 67, 67, 71, 71, 67, 84, 84, 67, 67, 65, 71, 67, 84, 84, 71, 71, 67, 71, 65, 84, 84, 84, 65, 65, 84, 67, 71, 71, 65, 71, 65, 67, 65, 65, 71, 67, 84, 84, 65, 65, 84, 65, 65, 65, 84, 84, 65, 65, 65, 65, 84, 65, 65, 39] ATGACAGAGGAAATGAATCAAATTGATGTTCAAGTGCCAGAGGTTGGAGATGTAGTAAAAGGGATTGTGACAAAGGTAGAGGACAAGCATGTAGATGTCGAAATTGTCAATGTCAAACAGTCCGGAATCATTCCAATCAGTGAATTATCAAGTCTTCATGTAGAGAAAGCATCGGATGTCGTAAAAGTTGACGACGAGCTTGACCTGAAAGTAACAAAAGTGGAAGACGATGCTTTGATTTTATCTAAACGTGCCGTTGATGCTGACCGCGCTTGGGAAGACCTTGAAAAAAAATTCGACACAAAAGAAGTGTTTGAAGCTGAAGTGAAAGATGTGGTGAAAGGCGGTCTCGTCGTTGATATCGGCGTTCGCGGCTTCATTCCCGCATCACTTGTCGAAGCTCATTTCGTCGAGGATTTCACTGACTATAAAGGGAAAACCCTCTCTCTTATCGTCGTTGAACTCGACCGTGATAAAAACCGGGTGATTCTGTCACACCGGGCTGTAGTGGAAAAAGAACAGACAGCCAAAAAGCATGATTTTCTCCAAACGCTTGAGGTTGGAAGCGTGCTTGAAGGAAAAGTGCAGCGTCTGACTGATTTCGGCGCATTTGTCGACATCGGCGGAATTGACGGGCTTGTTCATATTTCGCAGCTGTCTCATTCACACGTCGAAAAACCGTCTGACGTGGTGGAAGAAGGTCAGGACGTAAAAGTGAAAGTACTGTCCGTAGACCGCGATAACGAACGTATTTCTTTATCTATTAAAGAAACGCTGCCGGGACCTTGGAGCCAGGTCGGCGAAAAGGTAAAACAGGGAGATGTGCTTGAAGGAAAAGTGCAGCGCCTTGTAAGCTTCGGCGCCTTCGTTGAAATTCTTCCGGGCGTGGAAGGACTTGTGCACATTTCGCAAATTTCCAATAAACATATCGGAACGCCGCATGAAGTGCTTGAAGAAGGACAGACTGTTAAAGTGAAAGTGCTTGACGTGAACGAAAGCGAAGAGCGCATTTCCTTAAGCATGCGTGAGCTTGAAGAAGCGCCGAAAGCCGATCAGGAGGACTTCCGCCAATACCAGGCGAAAGAAGAGCCGAGCACCGGCTTCCAGCTTGGCGATTTAATCGGAGACAAGCTTAATAAATTAAAATAA''

This can be bypassed with the flag 'tossjunk', 'fixjunk', or 'ignorejunk'
    at shared.KillSwitch.kill(KillSwitch.java:96)
    at stream.Read.validateCommonCase_branchless(Read.java:412)
    at stream.Read.validate(Read.java:115)
    at stream.Read.<init>(Read.java:77)
    at stream.Read.<init>(Read.java:50)
    at stream.FastaReadInputStream.generateRead(FastaReadInputStream.java:270)
    at stream.FastaReadInputStream.fillList(FastaReadInputStream.java:184)
    at stream.FastaReadInputStream.hasMore(FastaReadInputStream.java:109)
    at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:668)
    at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:657)
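
Editor's note: ASCII code 39 is the apostrophe, and the sequence reported for BACT000001_693 ends with that character. This suggests (an inference, not confirmed in the thread) that a stray quote was written into the rMLST FASTA file when the database was built. The sketch below scans a FASTA file for characters outside an allowed set so that such records can be found before bbduk.sh is run. The function name and the allowed alphabet (IUPAC nucleotide codes plus the gap character) are assumptions made for illustration, not part of ConFindr or BBTools.

```python
import io

# Assumed alphabet: IUPAC nucleotide codes in both cases, plus '-'.
ALLOWED = set("ACGTUNRYSWKMBDHVacgtunryswkmbdhv-")


def find_bad_records(handle, allowed=ALLOWED):
    """Yield (record_id, line_number, bad_chars) for sequence lines
    that contain characters outside the allowed alphabet."""
    record_id = None
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            parts = line[1:].split()
            record_id = parts[0] if parts else ""
            continue
        bad = sorted(set(line) - allowed)
        if bad:
            yield record_id, line_number, bad


if __name__ == "__main__":
    # Tiny in-memory example; the trailing quote mimics the reported error.
    sample = io.StringIO(">BACT000001_693\nATGACAGAGG\nAAATTAAAATAA'\n")
    for rec, lineno, bad in find_bad_records(sample):
        print(rec, lineno, bad, [ord(c) for c in bad])
```

To check a real database file, open it with open(path) and pass the handle to find_bad_records in the same way.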

pcrxn commented 1 year ago

Fixed by 19d0d1d in v0.8.1.