mdmparis / defense-finder

Systematic search of all known anti-phage systems.
GNU General Public License v3.0

--db-type unordered is not processed correctly #40

Closed jeanrjc closed 2 months ago

jeanrjc commented 11 months ago

Running with --db-type unordered on an input whose CDSs are not ordered fails with the following error:

 2023-12-12 11:58:23 | INFO     | Post-treatment of the data
Traceback (most recent call last):
  File "/home/jean/anaconda3/bin/defense-finder", line 8, in <module>
    sys.exit(cli())
  File "/home/jean/anaconda3/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/jean/anaconda3/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/jean/anaconda3/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/jean/anaconda3/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jean/anaconda3/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/jean/anaconda3/lib/python3.8/site-packages/defense_finder_cli/main.py", line 145, in run
    defense_finder_posttreat.run(tmp_dir, outdir, os.path.splitext(os.path.basename(filename))[0])
  File "/home/jean/anaconda3/lib/python3.8/site-packages/defense_finder_posttreat/__init__.py", line 11, in run
    bs = best_solution.get(tmp_dir)
  File "/home/jean/anaconda3/lib/python3.8/site-packages/defense_finder_posttreat/best_solution.py", line 13, in get
    acc = acc + parse_best_solution(family_path)
  File "/home/jean/anaconda3/lib/python3.8/site-packages/defense_finder_posttreat/best_solution.py", line 36, in parse_best_solution
    with open(os.path.join(dir, 'best_solution.tsv')) as tsv_file:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jean/Downloads/test_aa2.out/defense-finder-tmp/DF_3/best_solution.tsv'

This is because macsyfinder does not write best_solution.tsv when the input is an unordered replicon, so the post-treatment step fails as soon as it tries to open that file. We need to fix that.
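
One possible direction on the post-treatment side, as a minimal sketch and not the actual fix: have the parser return nothing for a family directory that has no best_solution.tsv instead of raising. The function names mirror the ones in the traceback, but the bodies are assumptions, as is the idea that comment lines in the file start with '#'. Note that this only avoids the crash; reporting systems for unordered input would still require reading whichever file macsyfinder does write in that mode.

```python
import csv
import os


def parse_best_solution(family_dir):
    """Return the rows of best_solution.tsv as dicts, or [] if the file is absent."""
    path = os.path.join(family_dir, "best_solution.tsv")
    if not os.path.isfile(path):
        # Assumption: a missing file means "no result for this family",
        # which is what happens with --db-type unordered.
        return []
    with open(path, newline="") as tsv_file:
        # Skip blank lines and '#' comment lines that precede the header row.
        data_lines = (
            line for line in tsv_file
            if line.strip() and not line.startswith("#")
        )
        return list(csv.DictReader(data_lines, delimiter="\t"))


def get(tmp_dir):
    """Concatenate the parsed rows of every family directory under tmp_dir."""
    acc = []
    for name in sorted(os.listdir(tmp_dir)):
        family_dir = os.path.join(tmp_dir, name)
        if os.path.isdir(family_dir):
            acc = acc + parse_best_solution(family_dir)
    return acc
```

With this, a run whose DF_3 directory lacks best_solution.tsv would yield an empty list for that family instead of a FileNotFoundError.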

github-actions[bot] commented 8 months ago

Stale issue message

schmittel commented 8 months ago

RE: https://github.com/mdmparis/defense-finder/issues/46

Thanks for your help. To answer your question, here's the command I used:

defense-finder \
        run \
            --out-dir /defense_finder/output/GCF_030273315.1_ASM3027331v1_genomic \
            --workers 50 \
            --db-type unordered \
            --preserve-raw \
            /host_combined/GCF_030273315.1_ASM3027331v1_genomic.faa; \

Here's the error I'm getting:

command used: /tmp/conda_envs/defense/bin/macsyfinder --db-type unordered --sequence-db /host_combined/GCF_030273315.1_ASM3027331v1_genomic.faa --models defense-finder-models/Cas all --out-dir /defense_finder/output/GCF_030273315.1_ASM3027331v1_genomic/defense-finder-tmp/Cas --accessory-weight 1 --exchangeable-weight 1 --coverage-profile 0.4 -w 50
Traceback (most recent call last):
  File "/tmp/conda_envs/defense/bin/macsyfinder", line 8, in <module>
    sys.exit(main())
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/macsypy/scripts/macsyfinder.py", line 1054, in main
    models_def_to_detect = get_def_to_detect(config.models(), model_registry)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/macsypy/utils.py", line 48, in get_def_to_detect
    def_to_detect = model_loc.get_all_definitions(root_def_name=root)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/macsypy/registries.py", line 309, in get_all_definitions
    root_def = self.get_definition(root_def_name)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/macsypy/registries.py", line 292, in get_definition
    raise ValueError(f"{level} does not match with any definitions")
ValueError: Cas does not match with any definitions
Traceback (most recent call last):
  File "/tmp/conda_envs/defense/bin/defense-finder", line 8, in <module>
    sys.exit(cli())
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/defense_finder_cli/main.py", line 76, in run
    defense_finder_posttreat.run(tmp_dir, outdir)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/defense_finder_posttreat/__init__.py", line 9, in run
    bs = best_solution.get(tmp_dir)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/defense_finder_posttreat/best_solution.py", line 10, in get
    acc = acc + parse_best_solution(family_path)
  File "/tmp/conda_envs/defense/lib/python3.10/site-packages/defense_finder_posttreat/best_solution.py", line 15, in parse_best_solution
    tsv_file = open(os.path.join(dir, 'best_solution.tsv'))
FileNotFoundError: [Errno 2] No such file or directory: '/defense_finder/output/GCF_030273315.1_ASM3027331v1_genomic/defense-finder-tmp/DF_1/best_solution.tsv'

The input genome is: https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_030273315.1/

Thanks again
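
In case it helps with debugging, here is a small sketch for checking which subdirectories of defense-finder-tmp are missing best_solution.tsv. It assumes the temporary directory is still on disk after the crash; the function name is made up for illustration and is not part of defense-finder.

```python
import os


def missing_best_solution(tmp_dir):
    """Return the sorted names of subdirectories of tmp_dir without best_solution.tsv."""
    missing = []
    for name in sorted(os.listdir(tmp_dir)):
        family_dir = os.path.join(tmp_dir, name)
        has_file = os.path.isfile(os.path.join(family_dir, "best_solution.tsv"))
        if os.path.isdir(family_dir) and not has_file:
            missing.append(name)
    return missing
```

Calling missing_best_solution on the defense-finder-tmp directory from the log above lists the affected families, which shows whether only DF_1 is missing the file or every model directory is.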

github-actions[bot] commented 5 months ago

This issue has been inactive for 60 days and is now marked as stale. It will be closed in 7 days without further activity.

schmittel commented 5 months ago

Further activity

JPegorino commented 4 months ago

I am also hitting this issue with --db-type unordered; my command and error message are copied below. In my case, the CDSs in the input file should be in order (although I would also like to run defense-finder on some unordered .faa [gene multi-FASTA] files).

(DefenseFinder) JPegorino@lnx-0001:~/work/CCs/defense_finder$ defense-finder run -w 4 --db-type unordered --log-level WARNING --preserve-raw -o RF122_faaTest GCA_000009005.faa
Traceback (most recent call last):
  File "/home/JPegorino/.conda/envs/DefenseFinder/bin/defense-finder", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/defense_finder_cli/main.py", line 149, in run
    defense_finder_posttreat.run(tmp_dir, outdir, os.path.splitext(os.path.basename(filename))[0])
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/defense_finder_posttreat/__init__.py", line 11, in run
    bs = best_solution.get(tmp_dir)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/defense_finder_posttreat/best_solution.py", line 13, in get
    acc = acc + parse_best_solution(family_path)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/JPegorino/.conda/envs/DefenseFinder/lib/python3.12/site-packages/defense_finder_posttreat/best_solution.py", line 36, in parse_best_solution
    with open(os.path.join(dir, 'best_solution.tsv')) as tsv_file:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/JPegorino/work/CCs/defense_finder/RF122_faaTest/defense-finder-tmp/DF_3/best_solution.tsv'

github-actions[bot] commented 2 months ago

This issue has been inactive for 60 days and is now marked as stale. It will be closed in 7 days without further activity.