jessieren / DeepVirFinder

Identifying viruses from metagenomic data by deep learning

ValueError: not enough values to unpack #23

Closed: alexmsalmeida closed this issue 3 years ago

alexmsalmeida commented 3 years ago

Hi,

Thanks for putting together DeepVirFinder.

However, I am having trouble processing one particular FASTA file. I am getting the error below.

2. Encoding and Predicting Sequences.
   processing line 1
   processing line 7014
Traceback (most recent call last):
  File "/nfs/production/interpro/metagenomics/mags-scripts/dependencies/DeepVirFinder/dvf.py", line 212, in <module>
    head, score, pvalue = zip(*pool.map(pred, range(0, len(code))))
ValueError: not enough values to unpack (expected 3, got 0)

There is nothing obviously wrong with the FASTA file (I ran it with VirSorter2 and VIBRANT and had no issues). Any ideas on what could be the problem here?

Many thanks in advance.

Best, Alex

jessieren commented 3 years ago

Hi Alex,

Thanks for using DeepVirFinder.

To help me debug, could you print the input and output of line 212?

print(code)
out = zip(*pool.map(pred, range(0, len(code))))
print(out)

See what you get? Thanks.

Jie

alexmsalmeida commented 3 years ago

Hi Jie,

Thanks for the quick reply. After playing around with it some more, I figured out that it was due to the length filter I was applying. The cutoff was too high, so none of the contigs were passing it.

Thanks again, Alex
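
For anyone who hits the same error, a quick way to confirm this cause is to count how many sequences in the FASTA file reach the length cutoff before running dvf.py. The following is only a minimal sketch: `read_fasta_lengths`, `count_passing`, and `min_len` are hypothetical names that are not part of DeepVirFinder, and it assumes the cutoff is inclusive (a sequence of exactly `min_len` bp passes).

```python
import io


def read_fasta_lengths(handle):
    """Yield (header, length) for each record in a FASTA text stream."""
    header, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, length
            header, length = line[1:], 0
        elif header is not None:
            length += len(line)
    if header is not None:
        yield header, length


def count_passing(handle, min_len):
    """Return (passing, total) record counts, assuming an inclusive cutoff."""
    passing = total = 0
    for _, length in read_fasta_lengths(handle):
        total += 1
        if length >= min_len:
            passing += 1
    return passing, total


# Toy input standing in for a real contig file (hypothetical data).
example = io.StringIO(">contig_1\nACGTACGT\nACGT\n>contig_2\nACG\n")
passing, total = count_passing(example, min_len=10)
print(f"{passing} of {total} sequences are >= 10 bp")
```

If the first number is 0 for the cutoff you plan to use, no sequence will reach the prediction step, which matches the situation described above. For a real file, replace the `io.StringIO` object with `open("your_contigs.fasta")`.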

EarlyEvol commented 2 years ago

If you have a FASTA file with many short sequences at the end, dvf.py will throw this error. Basically, if no sequences pass the size filter in the 100-sequence chunk at the end of the FASTA file, the "code" variable will be empty, but line 212 still runs: pool.map returns an empty list, zip(*...) yields nothing, and the unpacking into three variables fails. I added an if statement (line 209 in my version) that fixed the issue.

# Only predict when at least one sequence in the chunk passed the length filter
if len(code) >= 1:
    print("   processing line "+str(lineNum))
    pool = multiprocessing.Pool(core_num)
    head, score, pvalue = zip(*pool.map(pred, range(0, len(code))))
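
The failure and the guard can be reproduced outside dvf.py. The sketch below is an illustration under stated assumptions, not the actual DeepVirFinder code: `fake_pred` and `predict_chunk` are hypothetical names, `fake_pred` merely mimics a function that returns a (header, score, p-value) triple, and the built-in `map` stands in for `pool.map` so the example runs without a worker pool.

```python
def fake_pred(i):
    # Hypothetical stand-in for dvf.py's pred: returns (header, score, p-value).
    return (f"seq_{i}", 0.5, 0.1)


def predict_chunk(code, pred):
    """Score one chunk; return three empty tuples when the chunk is empty."""
    if len(code) == 0:
        # Nothing passed the length filter, so there is nothing to unpack.
        return (), (), ()
    # The built-in map stands in for pool.map to keep the sketch self-contained.
    results = list(map(pred, range(0, len(code))))
    head, score, pvalue = zip(*results)
    return head, score, pvalue


# Reproduce the failure: an empty chunk gives an empty result list,
# and zip(*[]) yields no items for the three-way unpacking.
try:
    head, score, pvalue = zip(*list(map(fake_pred, range(0, 0))))
except ValueError as err:
    message = str(err)
    print(message)

# With the guard, an empty chunk is skipped and a non-empty one is scored.
empty = predict_chunk([], fake_pred)
filled = predict_chunk(["chunk_a", "chunk_b"], fake_pred)
print(empty)
print(filled)
```

Returning empty tuples rather than skipping silently is just one design choice; the if statement above achieves the same effect by not running the prediction step at all for an empty chunk.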