mdmparis / defense-finder

Systematic search of all known anti-phage systems.
GNU General Public License v3.0

FileNotFoundError: [Errno 2] No such file or directory: '/defense-finder-tmp/RM/best_solution.tsv' when using --db-type unordered #19

Closed felipehcoutinho closed 6 months ago

felipehcoutinho commented 1 year ago

Hi there! I am trying to run defense-finder on a large dataset of MAGs. These are medium- to high-quality genomes (CheckM completeness >= 50% and contamination <= 5%), but the assemblies are fragmented (mean of 209 scaffolds per MAG, mean N50 = 39 Kbp). I therefore opted for the --db-type unordered option, as recommended in the documentation, and ran:

defense-finder run --workers 1 --db-type unordered --out-dir /mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/Defense_Finder_Results/TestDF_Unordered_Bin_S11.9 --models-dir /mnt/lustre/repos/bio/databases/public/defensefinder/models/ /mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/CDS_by_MAG/Bin_S11.9.faa &> TestDF_Unordered_Bin_S11.9.out.txt

which died with the following error:

Traceback (most recent call last):
  File "/home/apps/defensefinder/1.0.9/bin/defense-finder", line 8, in <module>
    sys.exit(cli())
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/defense_finder_cli/main.py", line 76, in run
    defense_finder_posttreat.run(tmp_dir, outdir)
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/defense_finder_posttreat/__init__.py", line 9, in run
    bs = best_solution.get(tmp_dir)
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/defense_finder_posttreat/best_solution.py", line 10, in get
    acc = acc + parse_best_solution(family_path)
  File "/home/apps/defensefinder/1.0.9/lib/python3.8/site-packages/defense_finder_posttreat/best_solution.py", line 15, in parse_best_solution
    tsv_file = open(os.path.join(dir, 'best_solution.tsv'))
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/Defense_Finder_Results/TestDF_Unordered_EColi_K12/defense-finder-tmp/RM/best_solution.tsv'

The following output directories were generated in "TestDF_Unordered_Bin_S11.9/defense-finder-tmp/": Cas, DF_1, DF_2, DF_3, DF_4, DF_5, RM

I tried again by removing the --db-type unordered option with:

defense-finder run --workers 24 --out-dir /mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/Defense_Finder_Results/TestDF_Ordered_Bin_S11.9 --models-dir /mnt/lustre/repos/bio/databases/public/defensefinder/models/ /mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/CDS_by_MAG/Bin_S11.9.faa &> TestDF_Ordered_Bin_S11.9.out.txt

This ran without any errors, and the directory "TestDF_Ordered_Bin_S11.9/" contained: defense_finder_genes.tsv, defense_finder_hmmer.tsv, defense_finder_systems.tsv

I thought the issue might be due to the fragmented genome, so I tried with a complete E. coli K-12 genome from RefSeq assembled into a single contig (NZ_CP014270.1) by running:

defense-finder run --workers 24 --out-dir /mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/Defense_Finder_Results/TestDF_Ordered_EColi_K12 --models-dir /mnt/lustre/repos/bio/databases/public/defensefinder/models/ /mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/Defense_Finder_Results/Prokka_NZ_CP014270/NZ_CP014270.faa &> TestDF_Ordered_NZ_CP014270.1.out.txt

The above command ran with no issues and the expected output files were generated. I then repeated the run on the same genome with --db-type unordered:

defense-finder run --workers 24 --db-type unordered --out-dir /mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/Defense_Finder_Results/TestDF_Unordered_EColi_K12 --models-dir /mnt/lustre/repos/bio/databases/public/defensefinder/models/ /mnt/lustre/scratch/fcoutinho/Profiles_Malaspina/Assemblies_Round2/Metabat_Binning/Redo_MetaBat_Bins_Round_1/Defense_Finder_Results/Prokka_NZ_CP014270/NZ_CP014270.faa &> TestDF_Unordered_NZ_CP014270.1.out.txt

The above command, however, failed with the same error and did not produce the expected final output files.

Could you please tell me how to get defense-finder to work with the --db-type unordered option? Alternatively, if it cannot be made to work in this mode, what are the limitations of running it with default parameters on this set of MAGs? I expect many false negatives due to defense systems being split across multiple scaffolds, but could this also lead to false positives?

Regards,

Felipe

hsbi commented 1 year ago

I'm having the same issue -- has this been addressed anywhere? Thanks!

hsbi commented 1 year ago

It seems that when running with --db-type unordered, MacSyFinder doesn't output a best_solution.tsv file at all. See their documentation: https://macsyfinder.readthedocs.io/en/latest/user_guide/outputs.html#output-files-for-the-unordered-replicon-search-mode
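
Given the traceback above, where parse_best_solution opens best_solution.tsv unconditionally, one possible stopgap is to skip family directories that lack the file instead of crashing. The sketch below is not the project's code: read_best_solutions is a hypothetical helper, and it assumes the TSV has a header row and may contain '#' comment lines. Skipping the file only avoids the exception; it does not recover any systems from an unordered run.

```python
import csv
import os


def read_best_solutions(tmp_dir, filename="best_solution.tsv"):
    """Collect rows from every <family>/best_solution.tsv under tmp_dir.

    Hypothetical helper, not part of defense-finder. Returns (rows, missing),
    where missing lists the family directories that have no such file
    (the situation reported in this thread for --db-type unordered).
    """
    rows, missing = [], []
    for family in sorted(os.listdir(tmp_dir)):
        family_path = os.path.join(tmp_dir, family)
        if not os.path.isdir(family_path):
            continue  # ignore stray files next to the family directories
        tsv_path = os.path.join(family_path, filename)
        if not os.path.isfile(tsv_path):
            missing.append(family)  # record the gap instead of raising
            continue
        with open(tsv_path, newline="") as handle:
            # Assumption: '#' lines are comments and blank lines carry no data.
            data_lines = (
                line for line in handle
                if line.strip() and not line.startswith("#")
            )
            rows.extend(csv.DictReader(data_lines, delimiter="\t"))
    return rows, missing
```

Calling read_best_solutions on the defense-finder-tmp directory of a failed run would return an empty or partial row list together with the names of the families (for example RM) whose file is absent, which makes it easy to see whether every family is affected or only some.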

github-actions[bot] commented 6 months ago

This issue has been inactive for 60 days and is now marked as stale. It will be closed in 7 days without further activity.