geronimp / graftM

GraftM - Rapid community profiles from metagenomes
http://geronimp.github.io/graftM/
GNU General Public License v3.0

KeyError on split sequence? #255

Open maxemil opened 6 years ago

maxemil commented 6 years ago

Hi, I am trying to analyse a FASTQ file with a custom gpkg, but I get an error for one specific read. The read seems to be split at some point, and the resulting split ID is then not found in the original data (or something like that, I guess). Do you have a suggestion as to what is happening there? I have also attached the sequence, the gpkg, and the output so you can reproduce the issue.

Traceback (most recent call last):
  File "/usr/local/bin/graftM", line 409, in <module>
    Run(args).main()
  File "/usr/local/lib/python2.7/dist-packages/graftm/run.py", line 588, in main
    self.graft()
  File "/usr/local/lib/python2.7/dist-packages/graftm/run.py", line 377, in graft
    diamond_db
  File "/usr/local/lib/python2.7/dist-packages/graftm/timeit.py", line 10, in timed
    result = method(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/graftm/sequence_searcher.py", line 822, in aa_db_search
    hit_reads_orfs_fasta)
  File "/usr/local/lib/python2.7/dist-packages/graftm/sequence_searcher.py", line 951, in search_and_extract_orfs_matching_protein_database
    SequenceSearchResult.QUERY_TO_FIELD])
  File "/usr/local/lib/python2.7/dist-packages/graftm/sequence_searcher.py", line 608, in _extract_orfs
    entry=sequence_frame_info_dict[record.id]
KeyError: 'SRR3656745.9234541.2_split_1'
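For anyone trying to picture the failure, here is a minimal sketch of the kind of mismatch the traceback suggests. It is not GraftM's code. It assumes (unconfirmed) that the lookup table is keyed by the original read ID while the record being looked up carries a `_split_<n>` suffix. The table contents and the helpers `base_read_id` and `lookup_frame_info` are hypothetical and exist only for illustration.

```python
import re

# Hypothetical stand-in for the dictionary named in the traceback.
# Assumption: it is keyed by the original read ID, without any suffix.
sequence_frame_info_dict = {"SRR3656745.9234541.2": "frame info placeholder"}

# Matches a trailing "_split_<number>" suffix on a record ID.
_SPLIT_SUFFIX = re.compile(r"_split_\d+$")


def base_read_id(record_id):
    """Return the ID with a trailing '_split_<n>' suffix removed, if present."""
    return _SPLIT_SUFFIX.sub("", record_id)


def lookup_frame_info(record_id, table):
    """Try the exact ID first, then fall back to the ID without the suffix."""
    if record_id in table:
        return table[record_id]
    return table[base_read_id(record_id)]


record_id = "SRR3656745.9234541.2_split_1"

try:
    sequence_frame_info_dict[record_id]  # same pattern as the failing line
except KeyError as err:
    print("direct lookup failed:", err)

print(lookup_frame_info(record_id, sequence_frame_info_dict))
```

If the suffix really is the cause, comparing the IDs in the intermediate ORF FASTA against the keys of the lookup table should show the mismatch directly. Whether stripping the suffix is the right fix depends on why the read was split in the first place.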

graftM_error_report.tar.gz