ebi-pf-team / interproscan

Genome-scale protein function classification
Apache License 2.0

Interproscan freezes writing xml #371

Closed ChuChuChaddy closed 1 month ago

ChuChuChaddy commented 3 months ago

Hey all, I'm using interproscan to annotate a genome as part of the funannotate pipeline. I'm using 5.68 with the command:

interproscan.sh -cpu 32 -t n -f xml -iprlookup -goterms -pa -verbose \
    -i PA.scaffolds.fa -o PA.n.xml

Here is the end of the verbose output:

27/06/2024 15:09:17:654 thread#: 43 Processing MobiDBStoreFilteredMatches JobNo #: 1388 - stepInstanceId = 340 [5001-10000]
27/06/2024 15:09:18:724 Models in this batch: 122
27/06/2024 15:09:18:785 Execution Time (ms) for job started 27/06/2024 15:09:17:460 JobNo #: 1386 stepName: CathFunFamParseOutputs [4001-8000] time: 1325
27/06/2024 15:09:19:027 thread#: 55 Processing CathFunFamDeleteRelatedTmpFiles JobNo #: 1389 - stepInstanceId = 213 [4001-8000]
27/06/2024 15:09:19:028 Execution Time (ms) for job started 27/06/2024 15:09:19:027 JobNo #: 1389 stepName: CathFunFamDeleteRelatedTmpFiles [4001-8000] time: 1
27/06/2024 15:10:57:442 Execution Time (ms) for job started 27/06/2024 15:09:17:654 JobNo #: 1388 stepName: MobiDBStoreFilteredMatches [5001-10000] time: 99788
27/06/2024 15:10:57:588 thread#: 58 Processing MobiDBDeleteJobFiles JobNo #: 1390 - stepInstanceId = 906 [5001-10000]
27/06/2024 15:10:57:588 Execution Time (ms) for job started 27/06/2024 15:10:57:588 JobNo #: 1390 stepName: MobiDBDeleteJobFiles [5001-10000] time: 0
27/06/2024 15:10:57:874 thread#: 67 Processing PrepareForOutput JobNo #: 1391 - stepInstanceId = 37 [1-20430]
27/06/2024 15:10:57:879 How many executions of this Step have we done before including this Step: 1
27/06/2024 15:14:29:208 Info: [1_20430] pcounts tryCounts:0 maxTryCount:0 maxtotalWaitTime: 0
27/06/2024 15:14:29:208 Info: [1_20430] mcounts tryCounts:0 maxTryCount:0 maxtotalWaitTime: 0
27/06/2024 16:19:54:936 Execution Time (ms) for job started 27/06/2024 15:10:57:874 JobNo #: 1391 stepName: PrepareForOutput [1-20430] time: 4137062
27/06/2024 16:19:55:128 thread#: 63 Processing WriteOutput JobNo #: 1392 - stepInstanceId = 1392 [1-20430]
27/06/2024 16:20:08:132 Writing out XML output
27/06/2024 16:20:43:344 thread#: 63 Processing WriteOutput JobNo #: 1393 - stepInstanceId = 1392 [1-20430]
27/06/2024 16:20:56:345 Writing out XML output
27/06/2024 16:21:37:686 thread#: 63 Processing WriteOutput JobNo #: 1394 - stepInstanceId = 1392 [1-20430]
27/06/2024 16:21:50:687 Writing out XML output
27/06/2024 16:22:19:356 thread#: 63 Processing WriteOutput JobNo #: 1395 - stepInstanceId = 1392 [1-20430]
27/06/2024 16:22:32:357 Writing out XML output
27/06/2024 16:22:57:340 thread#: 63 Processing WriteOutput JobNo #: 1396 - stepInstanceId = 1392 [1-20430]
27/06/2024 16:23:10:341 Writing out XML output
27/06/2024 16:23:37:137 thread#: 63 Processing WriteOutput JobNo #: 1397 - stepInstanceId = 1392 [1-20430]
27/06/2024 16:23:50:138 Writing out XML output
27/06/2024 16:24:17:746 thread#: 63 Processing WriteOutput JobNo #: 1398 - stepInstanceId = 1392 [1-20430]
27/06/2024 16:24:30:747 Writing out XML output

It gets frozen at "Writing out XML output." Is there anything I can do to help it complete the job? The output file doesn't grow any larger, even between iterations of "WriteOutput." Any help would be appreciated!

matthiasblum commented 1 month ago

Hi @ChuChuChaddy,

Sorry for the late reply.

How many nucleotide sequences are in PA.scaffolds.fa? InterProScan doesn't fare well when processing a very large number of sequences, especially if they are DNA/RNA sequences.
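
For reference, one quick way to get that number is to count the header lines in the file. This is only a sketch: it assumes a standard FASTA file in which every record starts with a ">" line, and count_fasta_records is an illustrative helper name, not something that ships with InterProScan.

```shell
# Count FASTA records by counting header lines (lines that begin with '>').
# count_fasta_records is a hypothetical helper name used for illustration only.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example: report the number of sequences in the input from the command above,
# if that file is present in the current directory.
if [ -f PA.scaffolds.fa ]; then
    count_fasta_records PA.scaffolds.fa
fi
```

The count covers the nucleotide records in the input file only. It says nothing about how many translated sequences are derived from them when running with -t n.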