metagentools / MetaCoAG

🚦🧬 Binning Metagenomic Contigs via Composition, Coverage and Assembly Graphs
https://metacoag.readthedocs.io/en/stable/
GNU General Public License v3.0

Error while running MetaCoAG; wrong HMM file location? #37

Closed by ZarulHanifah 1 year ago

ZarulHanifah commented 1 year ago

Hey MetaCoAG developers,

Thank you for the software.

I got this error:

2023-07-22 13:15:26,217 - INFO - Welcome to MetaCoAG: Binning Metagenomic Contigs via Composition, Coverage and Assembly Graphs.
2023-07-22 13:15:26,218 - INFO - Input arguments: 
2023-07-22 13:15:26,218 - INFO - Assembler used: spades
2023-07-22 13:15:26,218 - INFO - Contigs file: input_folder/assemblies/metaspades/H4B.fasta
2023-07-22 13:15:26,219 - INFO - Assembly graph file: input_folder/graphs/H4B.gfa
2023-07-22 13:15:26,219 - INFO - Contig paths file: input_folder/paths/metaspades/H4B.paths
2023-07-22 13:15:26,219 - INFO - Abundance file: /home/mzar0002/ps45_scratch/Zarul/am/workflow/bin_metagenomes/results/metacoag/tmp_abund/H4B.txt
2023-07-22 13:15:26,219 - INFO - Final binning output file: /home/mzar0002/ps45_scratch/Zarul/am/workflow/bin_metagenomes/results/metacoag/indassembly/H4B
2023-07-22 13:15:26,219 - INFO - Marker gene file hmm: /fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/bin/metacoag_utils/auxiliary/marker.hmm
2023-07-22 13:15:26,219 - INFO - Minimum length of contigs to consider: 1000
2023-07-22 13:15:26,220 - INFO - Depth to consider for label propagation: 10
2023-07-22 13:15:26,220 - INFO - p_intra: 0.1
2023-07-22 13:15:26,220 - INFO - p_inter: 0.01
2023-07-22 13:15:26,220 - INFO - Do not use --cut_tc: False
2023-07-22 13:15:26,220 - INFO - mg_threshold: 0.5
2023-07-22 13:15:26,220 - INFO - bin_mg_threshold: 0.33333
2023-07-22 13:15:26,220 - INFO - min_bin_size: 200000 base pairs
2023-07-22 13:15:26,220 - INFO - d_limit: 20
2023-07-22 13:15:26,221 - INFO - Number of threads: 8
2023-07-22 13:15:26,221 - INFO - MetaCoAG started
2023-07-22 13:16:29,786 - INFO - Total number of contigs available: 4449432
2023-07-22 13:17:08,128 - INFO - Total number of edges in the assembly graph: 386404
2023-07-22 13:17:10,102 - INFO - Total isolated contigs in the assembly graph: 4136263
2023-07-22 13:17:10,102 - INFO - Obtaining lengths and coverage values of contigs
2023-07-22 13:17:41,057 - INFO - Total long contigs: 148229
2023-07-22 13:17:41,057 - INFO - Total isolated long contigs in the assembly graph: 83309
2023-07-22 13:17:41,057 - INFO - Obtaining tetranucleotide frequencies of contigs
2023-07-22 13:24:00,915 - INFO - Scanning for single-copy marker genes
2023-07-22 13:24:00,950 - INFO - Obtaining hmmout file
2023-07-22 13:24:00,950 - INFO - Using marker file: /fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/bin/metacoag_utils/auxiliary/marker.hmm
2023-07-22 14:19:01,815 - INFO - Obtaining contigs with single-copy marker genes
Traceback (most recent call last):
  File "/fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/bin/metacoag", line 1258, in <module>
    main()
  File "/fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/bin/metacoag", line 611, in main
    ) = marker_gene_utils.get_contigs_with_marker_genes(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/lib/python3.11/site-packages/metacoag_utils/marker_gene_utils.py", line 121, in get_contigs_with_marker_genes
    with open(f"{contigs_file}.hmmout", "r") as myfile:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'input_folder/assemblies/metaspades/H4B.fasta.hmmout'

After the run, new files had appeared in the directory that holds my assembly:

input_folder/assemblies/metaspades/H4B.fasta                 # This is my assembly
input_folder/assemblies/metaspades/H4B.fasta.frag.err    # I didn't know how these got here.
input_folder/assemblies/metaspades/H4B.fasta.frag.faa
input_folder/assemblies/metaspades/H4B.fasta.frag.ffn
input_folder/assemblies/metaspades/H4B.fasta.frag.gff
input_folder/assemblies/metaspades/H4B.fasta.frag.out
input_folder/assemblies/metaspades/H4B.fasta.hmmout.err
input_folder/assemblies/metaspades/H4B.fasta.hmmout.out

I had a look at the file input_folder/assemblies/metaspades/H4B.fasta.hmmout.err; it says:

Error: File existence/permissions problem in trying to open HMM file /fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/bin/metacoag_utils/auxiliary/marker.hmm.
HMM file /fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/bin/metacoag_utils/aux
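
The traceback above reports a missing .hmmout file, while the underlying cause (an unreadable HMM file) only shows up in the .hmmout.err file. One way to surface this earlier would be to check the HMM path before starting the hour-long scan. A minimal sketch, assuming a hypothetical helper named `check_marker_file`; it is not part of MetaCoAG:

```python
import os


def check_marker_file(hmm_path: str) -> str:
    """Return hmm_path if it is a readable regular file.

    Hypothetical helper for illustration only; raises FileNotFoundError
    with a clear message when the file is missing or cannot be read.
    """
    if not os.path.isfile(hmm_path) or not os.access(hmm_path, os.R_OK):
        raise FileNotFoundError(
            f"Marker HMM file is missing or unreadable: {hmm_path}"
        )
    return hmm_path
```

Called with the marker.hmm path printed in the log before the marker gene scan, a check like this would stop the run immediately with a message naming the HMM file, instead of failing later on the absent .hmmout output.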

When I looked further into the env, it turned out that the metacoag_utils folder is in a different location. It is not under bin; it is under lib/python3.11/site-packages/:

/fs03/ps45/Zarul/am/workflow/bin_metagenomes/.snakemake/conda/8e7934419093f10c3f8f0e3ad0be2083_/lib/python3.11/site-packages/metacoag_utils/
├── auxiliary
│   └── marker.hmm
├── bidirectionalmap.py
├── feature_utils.py
├── graph_utils.py
├── __init__.py
├── label_prop_utils.py
├── marker_gene_utils.py
├── matching_utils.py
├── __pycache__
│   ├── bidirectionalmap.cpython-311.pyc
│   ├── feature_utils.cpython-311.pyc
│   ├── graph_utils.cpython-311.pyc
│   ├── __init__.cpython-311.pyc
│   ├── label_prop_utils.cpython-311.pyc
│   ├── marker_gene_utils.cpython-311.pyc
│   └── matching_utils.cpython-311.pyc
└── support
    ├── combine_cov.py
    └── __pycache__
        └── combine_cov.cpython-311.pyc
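
After installation, the script in bin/ and its helper package in site-packages/ live in different directories, so a data path built from the script's location can point at a file that does not exist. A minimal sketch of resolving a bundled file relative to the module that ships it instead; `packaged_file` is a hypothetical helper, and this is not necessarily how the issue was fixed in MetaCoAG:

```python
from pathlib import Path


def packaged_file(module_file: str, *parts: str) -> Path:
    """Build a path relative to the directory that contains module_file.

    Hypothetical helper for illustration only. Pass a module's __file__
    so the result follows the package wherever it is installed.
    """
    return Path(module_file).resolve().parent.joinpath(*parts)


# Inside a module such as metacoag_utils/marker_gene_utils.py one could write:
# marker_hmm = packaged_file(__file__, "auxiliary", "marker.hmm")
```

With the layout shown above, a call like this from inside metacoag_utils would resolve to the auxiliary/marker.hmm under site-packages, regardless of where the metacoag entry script itself is placed.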

The command I executed was:

metacoag --assembler spades \
--graph input_folder/graphs/H4B.gfa \
--contigs input_folder/assemblies/metaspades/H4B.fasta \
--paths input_folder/paths/metaspades/H4B.paths \
--abundance /home/mzar0002/ps45_scratch/Zarul/am/workflow/bin_metagenomes/results/metacoag/tmp_abund/H4B.txt \
--output /home/mzar0002/ps45_scratch/Zarul/am/workflow/bin_metagenomes/results/metacoag/indassembly/H4B 2> /home/mzar0002/ps45_scratch/Zarul/am/workflow/bin_metagenomes/results/log/metacoag_indassembly/H4B.log

I installed MetaCoAG from conda.

Thank you.

Vini2 commented 1 year ago

Hello @ZarulHanifah,

Thanks for posting this error. I understand the issue. I will add a fix ASAP.

Vini2 commented 1 year ago

Hello @ZarulHanifah,

I've fixed the issue and added a new release on both bioconda and PyPI. Please give it a try and let me know if the problem still persists.

Thanks!

ZarulHanifah commented 1 year ago

Hello @Vini2, it works well! Thank you very much!