Open mmolari opened 5 months ago
Dear Marco, Thank you for bringing this to our attention. Yes, unfortunately it is a bug in MacSyFinder that was uncovered by new DefenseFinder models. Until they update MacSyFinder with the fix on conda, the DefenseFinder component of the pipeline won't work.
Until it is fixed, you could run Beav with --skip_defensefinder to skip running DefenseFinder and it should run the rest of the pipeline.
Alternatively, you could download the updated file from the MacSyFinder commit (https://github.com/gem-pasteur/macsyfinder/commit/27ee21ceb8e7100d9183b084356f791487aca4ad) and copy it into the corresponding folders in macsyfinder in your conda environment. You would only need to add in the registries.py file for it to work.
To do so, with your conda environment activated:
Get your Python version: python --version
Mine is Python 3.9, so substitute your own version in the following commands:
wget https://raw.githubusercontent.com/gem-pasteur/macsyfinder/27ee21ceb8e7100d9183b084356f791487aca4ad/macsypy/registries.py
cp registries.py $CONDA_PREFIX/lib/python3.9/site-packages/macsypy/
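As a rough sketch, the copy step above could be scripted so that the Python version is not hard-coded. The helper names below (`python_minor_version`, `macsypy_dir`, `looks_like_html`, `install_patch`) are hypothetical and not part of BEAV or MacSyFinder. The sketch assumes the conda environment is activated and that registries.py has already been downloaded. The HTML check is only a heuristic guard against saving a GitHub web page instead of the raw Python file.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are illustrative, not part of BEAV or MacSyFinder.

# Print the "major.minor" Python version of the active environment (e.g. 3.9).
python_minor_version() {
    python -c 'import sys; print("%d.%d" % sys.version_info[:2])'
}

# Build the macsypy directory path for a given conda prefix and Python version.
macsypy_dir() {
    local prefix="$1" pyver="$2"
    printf '%s/lib/python%s/site-packages/macsypy\n' "$prefix" "$pyver"
}

# Heuristic: succeed if the first lines of the file look like an HTML page.
looks_like_html() {
    head -n 5 "$1" | grep -qiE '<!doctype html|<html'
}

# Back up the installed registries.py (if present), then copy the patched file in.
install_patch() {
    local src="$1" dest_dir="$2"
    if looks_like_html "$src"; then
        echo "error: $src looks like an HTML page, not a Python file" >&2
        return 1
    fi
    if [ -f "$dest_dir/registries.py" ]; then
        cp "$dest_dir/registries.py" "$dest_dir/registries.py.bak"
    fi
    cp "$src" "$dest_dir/registries.py"
}
```

With the functions defined, usage would look something like `install_patch registries.py "$(macsypy_dir "$CONDA_PREFIX" "$(python_minor_version)")"`. The backup copy makes it easy to restore the original file once a fixed MacSyFinder release is available on conda.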
Thank you for the quick answer!
No problem! We have a new version coming soon that will fix the other bugs that appeared in your run log. Hopefully that will be up later this week.
Hi!
First of all, thank you for putting together such a nice pipeline! It's very convenient to have all of these tools in one place.
I installed BEAV using conda as per the instructions in the README. I downloaded the light version of the database, and then ran BEAV with the command:
The first issue I encountered is with DefenseFinder. From the BEAV log file:
Here is the DefenseFinder.log
```
2024-01-23 11:00:41 | INFO | Received file ./bakta/NZ_CP124487.1.fa.faa
2024-01-23 11:00:41 | WARNING | Out directory /home/marco/ownCloud/neherlab/code/pangenome-evo/exploration/2401c_beav/test/NZ_CP124487.1.fa already exists. Existing DefenseFinder output will be overwritten
2024-01-23 11:00:41 | INFO | Running DefenseFinder
Traceback (most recent call last):
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/profile.py", line 70, in get_profile
    path = model_location.get_profile(gene.name)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/registries.py", line 344, in get_profile
    return self._profiles[name]
KeyError: 'Rst_Hydrolase-Tm__Hydrolase-Tm'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/marco/miniconda3/envs/beav/bin/defense-finder", line 10, in
```
I believe this is a known issue and was already raised here. I just wanted to bring it up here as well so that once it is fixed you can update the version as well, or have a temporary fix in the meantime.
For completeness here is the full output of the BEAV command
```
BEAV version 1.0.0
--input /home/marco/ownCloud/neherlab/code/pangenome-evo/data/fa/NZ_CP124487.1.fa --output test --threads 4 --skip_tiger --skip_gapmind --skip_dbscan-swa --skip_antismash --bakta_arguments --db /home/marco/miniconda3/envs/beav/db/db-light
Checking prerequisites:
----------------------------------------------------------
Bakta: OK
antiSMASH: skipped
MacSyFinder: OK
IntegronFinder: OK
DefenseFinder: OK
TIGER2: skipped
GapMind: skipped
DBSCAN-SWA: skipped
----------------------------------------------------------
Running Bakta
Elapsed: 0hrs 8min 53sec
Done
----------------------------------------------------------
Annotation of other sequence elements
cut: ./borders/NZ_CP124487.1.fa.virbox: No such file or directory
cut: ./borders/NZ_CP124487.1.fa.trabox: No such file or directory
Elapsed: 0hrs 0min 0sec
Done
----------------------------------------------------------
Indentifying oriT
Elapsed: 0hrs 0min 5sec
Done
----------------------------------------------------------
Identifying secretion systems (MacSyFinder)
Elapsed: 0hrs 0min 5sec
Done
----------------------------------------------------------
Identifying integrons (IntegronFinder)
Elapsed: 0hrs 0min 14sec
Done
----------------------------------------------------------
Identifying defense systems (DefenseFinder)
Error: error occurred while running DefenseFinder.
Please see defensefinder.log
Elapsed: 0hrs 0min 1sec
cut: ./NZ_CP124487.1.fa_defense_finder_genes.tsv: No such file or directory
Done
----------------------------------------------------------
Identifying biosynthetic gene clusters (antiSMASH)
Skipped
----------------------------------------------------------
Identifying phage (DBSCAN-SWA)
Skipped
----------------------------------------------------------
Characterizing amino acid biosynthesis and small carbon metabolite catabolism (GapMind)
Skipped
----------------------------------------------------------
Identifying integrative conjugative elements [ICEs] (TIGER2)
Skipped
----------------------------------------------------------
Combining annotations and preparing final output files
tee: NZ_CP124487.1.fa/logs/Beav.log: No such file or directory
Elapsed: 0hrs 0min 46sec
Final annotation output: NZ_CP124487.1.fa_final.gbk
----------------------------------------------------------
Creating Circos Map
ls: cannot access 'test/NZ_CP124487.1.fa/*_final.gbk': No such file or directory
cat: 'test/NZ_CP124487.1.fa/*oncogenic_plasmid_final.out.contiglist': No such file or directory
python3 beav_circos.py --input
usage: beav_circos.py [-h] --input INPUT [--contigs [CONTIGS ...]] [--plasmid PLASMID]
beav_circos.py: error: argument --input/-i: expected one argument
Elapsed: 0hrs 0min 1sec
Done
mv: cannot stat 'NZ_CP124487.1.fa.circos.png': No such file or directory
mv: cannot stat 'NZ_CP124487.1.fa.circos.pdf': No such file or directory
mv: cannot stat 'NZ_CP124487.1.fa.oncogenes.png': No such file or directory
mv: cannot stat 'NZ_CP124487.1.fa.oncogenes.pdf': No such file or directory
----------------------------------------------------------
Summary of annotations
Secretion_Systems Defense_Systems Phages Biosynthetic_gene_clusters ICEs Integrons
/home/marco/miniconda3/envs/beav/bin/beav: line 1063: N/A: No such file or directory
6 N/A N/A N/A N/A 0
Small carbon catabolism pathways:
Done
----------------------------------------------------------
The BEAV pipeline automates the use of a number of published software tools.
If you use these results in a publication, please include the following in your methods section and cite:
Jung J, Rahman A, Schiffer A, and Weisberg A. 2023. BEAV: a bacterial genome and mobile element annotation pipeline. https://github.com/weisberglab/beav
grep: test/NZ_CP124487.1.fa/logs/bakta.log: No such file or directory
Bakta version
Schwengers O, Jelonek L, Dieckmann MA, et al. 2021. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb Genom 7: 000685.
EMBOSS:fuzznuc
EMBOSS: The European Molecular Biology Open Software Suite (2000) Rice,P. Longden,I. and Bleasby,A. Trends in Genetics 16, (6) pp276--277
head: cannot open 'test/NZ_CP124487.1.fa/MacSyFinder_TXSS/macsyfinder.log' for reading: No such file or directory
MacSyFinder version
Néron, Bertrand; Denise, Rémi; Coluzzi, Charles; Touchon, Marie; Rocha, Eduardo P.C.; Abby, SophieS 2023. MacSyFinder v2: Improved modelling and search engine to identify molecular systems igenomes. Peer Community Journal, Volume 3, article no. e28. DOI: 10.24072/pcjournal.250.
DefenseFinder
Tesson F., Hervé A. , Touchon M., d’Humières C., Cury J., Bernheim A. Systematic and quantitative view of the antiviral arsenal of prokaryotes bioRx
grep: test/NZ_CP124487.1.fa/Integron_Finder/Results_Integron_Finder_NZ_CP124487.1.fa/integron_finder.out: No such file or directory
IntegronFinder version
Néron B, Littner E, Haudiquet M, et al. 2022. IntegronFinder 2.0: Identification and Analysis of Integrons across Bacteria, with a Focus on Antibiotic Resistance in Klebsiella. Microorganisms 10: 700.
```
Thanks again!
Marco