microbiomedata / metaMAGs

Workflow for metagenome assembled genomes generation.

Error in package tasks #26

Closed scanon closed 5 days ago

scanon commented 4 months ago

We are seeing this error in some runs...

Dumping /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_mags/ab68c879-fcb8-412f-af79-8db74291bb31/call-package/execution/stderr
Traceback (most recent call last):
  File "/opt/conda/envs/mags_vis/bin/create_tarfiles.py", line 174, in <module>
    krona_plot(ko_result,prefix)
  File "/opt/conda/envs/mags_vis/bin/create_tarfiles.py", line 127, in krona_plot
    df = pd.read_csv(ko_result,sep="\t")
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 482, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 811, in __init__
    self._engine = self._make_engine(self.engine)
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 1040, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 51, in __init__
    self._open_handles(src, kwds)
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/io/parsers/base_parser.py", line 229, in _open_handles
    errors=kwds.get("encoding_errors", "strict"),
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/io/common.py", line 614, in get_handle
    storage_options=storage_options,
  File "/opt/conda/envs/mags_vis/lib/python3.7/site-packages/pandas/io/common.py", line 396, in _get_filepath_or_buffer
    raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'NoneType'>

I've created a snapshot for testing in /global/cfs/cdirs/m3408/squads/mags/package_failure on perlmutter.
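
For reference, the last frames of the traceback show that `pd.read_csv` was handed `None` as `ko_result`, so no KO result path reached `krona_plot`. A minimal sketch of a guard follows, assuming `ko_result` is either `None` or a path string; the helper name `ko_result_usable` is hypothetical and not part of `create_tarfiles.py`.

```python
import os
from typing import Optional


def ko_result_usable(ko_result: Optional[str]) -> bool:
    """Return True only if ko_result names an existing, non-empty file."""
    if ko_result is None:
        return False
    return os.path.isfile(ko_result) and os.path.getsize(ko_result) > 0


# Hypothetical call site, mirroring the krona_plot call in the traceback:
# if ko_result_usable(ko_result):
#     krona_plot(ko_result, prefix)
# else:
#     print("No KO analysis result; skipping krona plot")
```

A guard like this would only turn the crash into an explicit "no result" branch; it would not explain why the KO result is missing in the first place.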

chienchi commented 3 months ago

The error indicates that there is no KO analysis result. After digging into the cause, I found that the contig names do not match between contigs.fasta and the annotation files.

$ head /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_mags/ab68c879-fcb8-412f-af79-8db74291bb31/call-stage/cacheCopy/execution/contigs.fasta
>scaffold_1_c1
TACTTTTTGGAAGTACACGCCACGCTTAAAGCAGGCTTGCCCTAGTATTTGAAAGAACTG
ACCCCAGCCCGCATCCAAGCAATGTTTTCCTAACATCGCCCTAGTTAAACCCACCAGGTT

$ head /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_mags/ab68c879-fcb8-412f-af79-8db74291bb31/call-stage/cacheCopy/execution/ko.tsv 
nmdc:wfmgan-11-365wd896.1_1_c1_1_981    2505168920  KO:K07496   85.93   1   327 1   327 6.7e-213    669 327
nmdc:wfmgan-11-365wd896.1_1_c1_9003_9578    2914016940  KO:K05795   84.66   1   184 1   189 9.3e-109    359 184
nmdc:wfmgan-11-365wd896.1_1_c1_11210_12415  2505168920  KO:K07496   86.28   1   401 1   401 4.2e-264    823 401
nmdc:wfmgan-11-365wd896.1_1_c1_18884_20020  2883227922  KO:K02338   87.60   1   378 1   379 3.4e-235    738 378
nmdc:wfmgan-11-365wd896.1_1_c1_25306_25899  2993829757  KO:K03273   37.43   1   177 5   179 7.6e-34 144 177
n

If the contigs come from the NMDC assembly workflow and the annotation comes from the NMDC annotation workflow, the IDs should match, right?
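
The mismatch shown above could be detected before the package task runs. The following is a rough sketch, not the workflow's actual logic. It assumes that the first column of ko.tsv is a gene ID of the form `<contig>_<start>_<end>` (inferred from the rows above, not confirmed), and the function names are hypothetical.

```python
from typing import Iterable, Set


def fasta_ids(lines: Iterable[str]) -> Set[str]:
    """Collect sequence IDs (first token after '>') from FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def ko_contig_ids(lines: Iterable[str]) -> Set[str]:
    """Derive contig IDs from the first column of ko.tsv lines.

    ASSUMPTION: gene IDs look like '<contig>_<start>_<end>', so the
    last two underscore-separated fields are stripped.
    """
    ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        gene_id = line.split()[0]
        parts = gene_id.rsplit("_", 2)
        if len(parts) == 3:
            ids.add(parts[0])
    return ids


def unmatched_contigs(fasta_lines: Iterable[str], ko_lines: Iterable[str]) -> Set[str]:
    """Return contig IDs referenced in ko.tsv that are absent from the FASTA."""
    return ko_contig_ids(ko_lines) - fasta_ids(fasta_lines)
```

With the data from this run, the FASTA yields `scaffold_1_c1` while ko.tsv yields `nmdc:wfmgan-11-365wd896.1_1_c1`, so every annotated contig would be reported as unmatched.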

aclum commented 3 months ago

This can happen if the assembly is from JGI and then NMDC runs the annotation. We need to fix this first https://github.com/microbiomedata/mg_annotation/issues/24 before we can process these.

chienchi commented 1 week ago

We had a project fail on NMDC EDGE because no barplot.pdf was generated. The package task should also check for barplot.pdf, not just heatmap.pdf.

        if [ -f ~{prefix}_heatmap.pdf ]; then
            echo "KO analysis plot exists."
        else
            echo "No KO analysis result for ~{proj}" > ~{prefix}_heatmap.pdf
            echo "No KO analysis result for ~{proj}" > ~{prefix}_barplot.pdf
            echo "No KO analysis result for ~{proj}" > ~{prefix}_ko_krona.html
            echo "No KO analysis result for ~{proj}" > ~{prefix}_module_completeness.tab
        fi

aclum commented 1 week ago

I don't think writing text into files that are meant to be pdf, html, or tab is a good idea. Is there another way to do this?
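
One possible direction, offered only as a sketch: check each expected KO output individually and record what is missing in a separate status file, instead of writing placeholder text into the pdf/html/tab files. The suffixes below are taken from the shell snippet above. The `_ko_status.json` file name and the function names are hypothetical, and the workflow outputs would then have to tolerate missing files, which this sketch does not cover.

```python
import json
import os
from typing import Dict, List

# Suffixes taken from the shell snippet in the package task.
KO_OUTPUT_SUFFIXES = [
    "_heatmap.pdf",
    "_barplot.pdf",
    "_ko_krona.html",
    "_module_completeness.tab",
]


def missing_ko_outputs(prefix: str) -> List[str]:
    """List the expected KO output files that do not exist for this prefix."""
    return [
        prefix + suffix
        for suffix in KO_OUTPUT_SUFFIXES
        if not os.path.isfile(prefix + suffix)
    ]


def write_ko_status(prefix: str, proj: str) -> Dict[str, object]:
    """Write <prefix>_ko_status.json (hypothetical name) describing missing outputs."""
    missing = missing_ko_outputs(prefix)
    status = {"project": proj, "complete": not missing, "missing": missing}
    with open(prefix + "_ko_status.json", "w") as handle:
        json.dump(status, handle, indent=2)
    return status
```

Because every suffix is checked on its own, a run that produced heatmap.pdf but not barplot.pdf would be reported as incomplete rather than passing the single heatmap check.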