apcamargo / genomad

geNomad: Identification of mobile genetic elements
https://portal.nersc.gov/genomad/

Error with geNomad #88

Closed xjhzjucas closed 2 months ago

xjhzjucas commented 2 months ago

Hi developer, I ran into a problem when using the end-to-end command.

[00:25:20] Executing genomad find-proviruses.
[00:25:20] Creating the genomad_output/combined5w.part_027_find_proviruses
           directory.
[00:40:58] Integrases identified with MMseqs2 and geNomad database (v1.6) were
           written to combined5w.part_027_provirus_mmseqs2.tsv.
[00:40:58] Deleting combined5w.part_027_provirus_mmseqs2.
[00:40:59] Deleting combined5w.part_027_provirus_mmseqs2_input.faa.
Traceback (most recent call last):
  File "/home/xujinghong/miniconda3/envs/genomad/bin/genomad", line 10, in <module>
    sys.exit(cli())
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/rich_click/rich_command.py", line 126, in main
    rv = self.invoke(ctx)
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/genomad/cli.py", line 1254, in end_to_end
    ctx.invoke(
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/genomad/cli.py", line 600, in find_proviruses
    genomad.find_proviruses.main(
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/genomad/modules/find_proviruses.py", line 656, in main
    aragorn_obj.run_parallel_aragorn(threads)
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/genomad/aragorn.py", line 96, in run_parallel_aragorn
    self._append_aragorn_tsv(current_file_path)
  File "/home/xujinghong/miniconda3/envs/genomad/lib/python3.10/site-packages/genomad/aragorn.py", line 45, in _append_aragorn_tsv
    current_contig = line[1:].strip().split()[0]
IndexError: list index out of range

I think it may be caused by an incorrect input file, because when I ran geNomad on another FASTA file it completed successfully.
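For context, the expression in the last frame of the traceback can be reproduced in isolation. This is only a minimal sketch: the helper name `contig_name` is hypothetical and not part of geNomad, and it assumes the parser reaches that expression for a header-style line that has nothing after the `>`.

```python
def contig_name(line: str) -> str:
    """Mimic the expression shown for aragorn.py line 45 in the traceback."""
    # Drop the leading ">", trim whitespace, and take the first token.
    return line[1:].strip().split()[0]


# A normal header yields the first whitespace-separated token.
print(contig_name(">contig_1 some description"))

# A header with nothing after ">" splits into an empty list,
# so indexing [0] raises the same IndexError as in the report.
try:
    contig_name(">\n")
except IndexError as err:
    print(f"IndexError: {err}")
```

If that assumption holds, a record with an empty or displaced name somewhere in the input would be enough to trigger the crash.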

apcamargo commented 2 months ago

Can you share the problematic input file?

xjhzjucas commented 2 months ago

Sorry, I tried, but the file seems too big to upload (750 MB).

xjhzjucas commented 2 months ago

Hi developer, may I email the file to you?

apcamargo commented 2 months ago

Sure. You can find my email in the manuscript. But I don't think you'll be able to attach this file to an email anyway. Can you upload it to Google Drive, OneDrive, or something similar?

apcamargo commented 2 months ago

@xjhzjucas I found the issue. It's a problem with your input. Here's the problematic part:

GAAAAACAACCCATTGTTTTTTCATTGCATGAGTGTCGTTTTTTTGGGGGGGCGAAGCG>
MAG-0278.fa_MAG-0278__contig_k141_427833_1TAGAATCTGTCCTCGAAA
AGATAATTGGTTACTAATTTATAGAACATACCTAAAACTATGAATGTTGTAAAGTTCGTA

xjhzjucas commented 2 months ago

Thank you for your kind help! I used seqkit to split a huge file, and it seems this subfile ended up with the problem you found. I will fix it manually. Thank you!

apcamargo commented 2 months ago

No worries! Was the FASTA correct before you split it? If you think this issue was caused by seqkit, I suggest you open an issue in its repository. I use seqkit split2 myself all the time and have never had this issue.
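One way to check whether the original FASTA was already malformed before splitting is to scan it for the pattern shown earlier in this thread: a `>` that is not at the start of a line, followed by a "sequence" line that contains the contig name. The sketch below is not part of geNomad or seqkit. The function names are hypothetical, and it assumes a nucleotide FASTA limited to IUPAC codes and `-`.

```python
from typing import Iterable, Iterator, Tuple

# Assumption: nucleotide FASTA using IUPAC codes plus the gap character.
ALLOWED = set("ACGTURYSWKMBDHVN-")


def find_fasta_problems(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, message) for lines that look malformed."""
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # blank lines are ignored here
        if line.startswith(">"):
            if not line[1:].strip():
                yield number, "header has no sequence name"
        elif ">" in line:
            yield number, "'>' inside a sequence line"
        elif set(line.upper()) - ALLOWED:
            yield number, "unexpected characters in a sequence line"


def check_file(path: str) -> None:
    """Print every suspicious line found in the FASTA file at `path`."""
    with open(path) as handle:
        for number, message in find_fasta_problems(handle):
            print(f"line {number}: {message}")
```

Calling `check_file` on both the original file and the split part (for example the `combined5w.part_027` file from the log) should show whether the broken record was present before the split or only appeared afterwards.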