ebi-pf-team / interproscan

Genome-scale protein function classification
Apache License 2.0

mobidb.py failed, the error message could have been better #50

Closed wbazant closed 6 years ago

wbazant commented 6 years ago

I'm running InterProScan for the next release of WormBase ParaSite. The pipeline choked on our input:

python bin/mobidb/mobidb-lite.py -a 64 -t 1 -bin /nfs/software/ensembl/RHEL7-JUL2017-core2/linuxbrew/Cellar/interproscan/5.27-66.0/bin/mobidb/binx in.fasta
Error output from binary:
Traceback (most recent call last):
  File "bin/mobidb/mobidb-lite.py", line 455, in <module>
    out = run_mobidb(binDirectory, args.threads, args.longOutput, acc, seq, args.architecture, verbose = args.verbose)
  File "bin/mobidb/mobidb-lite.py", line 198, in run_mobidb
    if ele.get():
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
OSError: [Errno 7] Argument list too long

The input file contained a bad protein - it had about 3000 X's at the end, and then an M. I've uploaded it here: https://gist.github.com/wbazant/fdd29d04c6d205b8f7b59bc462e8c8ee
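For anyone hitting the same thing, a pre-flight check along these lines might flag such proteins before the pipeline runs. This is only a sketch in Python 3, not part of InterProScan or mobidb-lite: the function names (`read_fasta`, `longest_x_run`, `flag_suspicious`) and the `max_run=100` threshold are assumptions made for illustration.

```python
import re


def read_fasta(lines):
    """Yield (accession, sequence) pairs from an iterable of FASTA lines."""
    acc, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if acc is not None:
                yield acc, "".join(chunks)
            header = line[1:].split()
            # Use the first word of the header as the accession.
            acc, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if acc is not None:
        yield acc, "".join(chunks)


def longest_x_run(seq):
    """Return the length of the longest run of consecutive X residues."""
    runs = re.findall(r"X+", seq.upper())
    return max((len(run) for run in runs), default=0)


def flag_suspicious(lines, max_run=100):
    """Return (accession, run_length) for sequences whose longest X run
    exceeds max_run. The threshold is an arbitrary assumption."""
    flagged = []
    for acc, seq in read_fasta(lines):
        run = longest_x_run(seq)
        if run > max_run:
            flagged.append((acc, run))
    return flagged


if __name__ == "__main__":
    import sys

    # Usage: python check_x_runs.py in.fasta
    with open(sys.argv[1]) as handle:
        for acc, run in flag_suspicious(handle):
            print("%s: run of %d X residues" % (acc, run))
```

Run against the input FASTA, this would list each accession with a long X run so the sequence can be inspected or trimmed before the run.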

The error message didn't help me; I found the problem essentially by luck, and removing the X's made it all good again. If you endeavour to support finding features in proteins with thousands of X's, you could improve the error message for this case.
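To illustrate the kind of message I mean: errno 7 is `E2BIG`, which the operating system returns when the arguments passed to a new process are too large. A wrapper could catch it and name the offending sequence. The sketch below is hypothetical Python 3 and is not how mobidb-lite.py is written; `run_with_context` and its parameters are names made up here, and the `runner` argument exists only so the behaviour can be exercised without launching a real binary.

```python
import errno
import subprocess


def run_with_context(cmd, acc, seq_len, runner=subprocess.run):
    """Run cmd; if the OS reports 'Argument list too long', re-raise
    with the accession and sequence length in the message."""
    try:
        return runner(cmd, capture_output=True, check=False)
    except OSError as exc:
        if exc.errno == errno.E2BIG:
            raise RuntimeError(
                "could not launch %r for sequence %s (length %d): "
                "argument list too long; check the sequence for "
                "unusually long or low-complexity stretches"
                % (cmd[0], acc, seq_len)
            ) from exc
        # Any other OS error is passed through unchanged.
        raise
```

With something like this, the traceback would have pointed at the protein accession instead of only at `pool.py`.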

gsn7 commented 6 years ago

thanks for this info. I will forward this to the mobidb developers

wbazant commented 6 years ago

I ran into this issue again. @gsn7 What did the developers of mobidb say about this?

Should I continue to expect that InterProScan will not work reliably for me?

gsn7 commented 6 years ago

this has been fixed by mobidb. our next release this week will have the fix

gsn7 commented 6 years ago

the release is out, so this issue is fixed