devonorourke closed this issue 3 years ago
The specific step that is executed and then hangs is this one:
/scratch/dro49/conda/envs/funenv/opt/pasa-2.4.1/Launch_PASA_pipeline.pl \
-c /scratch/dro49/mysework/annotation/funruns/SEfun1/training/pasa/alignAssembly.txt \
-r -C -R -g /scratch/dro49/mysework/annotation/funruns/SEfun1/training/genome.fasta \
--IMPORT_CUSTOM_ALIGNMENTS /scratch/dro49/mysework/annotation/funruns/SEfun1/training/trinity.alignments.gff3 \
-T -t /scratch/dro49/mysework/annotation/funruns/SEfun1/training/trinity.fasta.clean \
-u /scratch/dro49/mysework/annotation/funruns/SEfun1/training/trinity.fasta --stringent_alignment_overlap 30.0 \
--TRANSDECODER --ALT_SPLICE --MAX_INTRON_LENGTH 10000 --CPU 12 --ALIGNERS blat --trans_gtf /scratch/dro49/mysework/annotation/funruns/SEfun1/training/funannotate_train.stringtie.gtf
Here's the weird part:
Switching the max_intron_len parameter to 10,000 worked for one of the two bat genomes: it ran through the entire train.py script. The above command, when it completes successfully, takes about 6 hours. However, the job that is failing (I think) has been running for more than 18 hours and hasn't proceeded to the next phase, which in my .log file looks something like:
[some date]: PASA assigned ...
I don't see that message in the job that is hanging, and when I look at the node the job is running on, it seems like lots of processes are still open but not really doing much:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
103791 dro49 20 0 14144 1908 976 R 0.7 0.0 0:00.05 top
63309 dro49 20 0 0 0 0 Z 0.0 0.0 0:00.68 pigz <defunct>
66916 dro49 20 0 0 0 0 Z 0.0 0.0 0:00.67 pigz <defunct>
70536 dro49 20 0 116m 6180 804 S 0.0 0.0 0:00.06 perl
80452 dro49 20 0 103m 1376 1136 S 0.0 0.0 0:00.00 sh
80453 dro49 20 0 3473m 2.3g 764 S 0.0 1.9 3:59.87 perl
103761 dro49 20 0 126m 2816 892 S 0.0 0.0 0:00.00 sshd
103762 dro49 20 0 104m 1964 1460 S 0.0 0.0 0:00.00 bash
126843 dro49 20 0 103m 1624 1244 S 0.0 0.0 0:00.05 slurm_script
127494 dro49 20 0 155m 29m 948 S 0.0 0.0 0:44.91 funannotate
127864 dro49 20 0 19452 920 624 S 0.0 0.0 0:00.00 pigz
127865 dro49 20 0 19452 920 624 S 0.0 0.0 0:00.00 pigz
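One way to tell a stalled job from a slow one is to take two snapshots a few minutes apart and compare the cumulative CPU time of each process. The sketch below is only an illustration: the function names list_procs and only_zombies are made up here, and it assumes a ps that accepts the usual -u and -o options.

```shell
# Show PID, parent PID, state, elapsed wall time, CPU time, and command
# for one user's processes (defaults to the current user).
# STAT starting with Z = zombie (<defunct>), S = sleeping, R = running.
list_procs() {
    ps -u "${1:-$USER}" -o pid,ppid,stat,etime,time,comm
}

# Keep the header line plus any row whose STAT column (column 3) starts with Z.
only_zombies() {
    awk 'NR == 1 || $3 ~ /^Z/'
}

# Example (hypothetical usage):
#   list_procs dro49 > snapshot1.txt; sleep 300; list_procs dro49 > snapshot2.txt
#   diff snapshot1.txt snapshot2.txt     # has TIME advanced for the perl process?
#   list_procs dro49 | only_zombies      # which parents have unreaped children?
```

If the TIME column of the large perl process does not change between snapshots, it is most likely blocked or waiting rather than computing.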
So you are saying it seems that PASA is getting stuck? Is anything different about these two species (config names, RNA-seq data, coverage, etc.)? Could it have run out of memory? It's hard to imagine how the same settings would result in one stalling and the other completing.
It hopefully is some dumb thing on my part. The job that PASA is getting stuck on is the bat genome I've already completed an initial run with, so definitely not anything wrong with fasta/fastq header names. I haven't picked up any OOM events from our cluster in any log files. For the moment, I'm just running the command that it didn't finish directly and will check out the log file for any hints there (hopefully something will come of this):
/scratch/dro49/conda/envs/funenv/opt/pasa-2.4.1/Launch_PASA_pipeline.pl \
-c /scratch/dro49/myluwork/annotation/fun2/funR2/training/pasa/alignAssembly.txt \
-r -C -R -g /scratch/dro49/myluwork/annotation/fun2/funR2/training/genome.fasta \
--IMPORT_CUSTOM_ALIGNMENTS /scratch/dro49/myluwork/annotation/fun2/funR2/training/trinity.alignments.gff3 \
-T -t /scratch/dro49/myluwork/annotation/fun2/funR2/training/trinity.fasta.clean \
-u /scratch/dro49/myluwork/annotation/fun2/funR2/training/trinity.fasta --stringent_alignment_overlap 30.0 \
--TRANSDECODER --ALT_SPLICE --MAX_INTRON_LENGTH 10000 --CPU 24 --ALIGNERS blat --trans_gtf /scratch/dro49/myluwork/annotation/fun2/funR2/training/funannotate_train.stringtie.gtf
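For the out-of-memory question, Slurm accounting can be queried per job step. This is a rough sketch, assuming that accounting is enabled on the cluster and that the Slurm version in use reports an OUT_OF_MEMORY state; the function names and the job ID are placeholders, not anything taken from this thread.

```shell
# Print one pipe-delimited line per job step: JobID|State|MaxRSS|ReqMem|Elapsed
job_memory_report() {
    sacct -j "$1" --format=JobID,State,MaxRSS,ReqMem,Elapsed --parsable2 --noheader
}

# Read that pipe-delimited report on stdin and print the IDs of any steps
# whose State field starts with OUT_OF_MEMORY.
oom_steps() {
    awk -F'|' '$2 ~ /^OUT_OF_MEMORY/ { print $1 }'
}

# Example (1234567 is a placeholder job ID):
#   job_memory_report 1234567
#   job_memory_report 1234567 | oom_steps
```

Comparing MaxRSS against ReqMem is also useful when no step is flagged: a MaxRSS close to the requested memory suggests the job was near its limit.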
Thanks for all the replies!
You are running in a new folder, correct? I.e., it's not trying to overwrite or add to an existing SQLite database?
I've tried both iterations, first using an existing folder where I deleted just the existing pasa directory, and then also just rerunning 'train.py' anew. I'm on a third iteration now, where I've tried running it with a bit more memory. It's still adding data to the "_building.ascii_illustrations.out" file within the pasa directory (about 2 hours after starting the PASA-related steps). I'll keep you posted. This is a weird one for sure...
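Checking whether an output file is still being written can be scripted rather than eyeballed. A minimal sketch, assuming a find that supports -maxdepth and -mmin (GNU and BSD find both do); recently_modified is a made-up helper name and the path in the example is a placeholder.

```shell
# Succeeds (exit status 0) if the file given as $1 was modified
# less than $2 minutes ago; fails otherwise or if the file is missing.
recently_modified() {
    [ -n "$(find "$1" -maxdepth 0 -mmin "-$2" 2>/dev/null)" ]
}

# Example (path is a placeholder for the file inside the pasa directory):
#   recently_modified path/to/pasa/output.out 30 || echo "no writes in the last 30 minutes"
```

A file that keeps getting fresh writes means the step is slow rather than hung, which matters before deleting anything and starting over.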
The latest iteration finished completely, and I am now 100% confused. This is one of those things I'm 99.99% confident was a user error on my side, but I can't find any piece to point to that was the cause of it. Sorry to raise this issue, but in my estimation you can close it. I'm sorry to have bothered raising it in the first place, but perhaps if someone else comes across a similar behavior this post will at least offer some guidance: just delete and retry! (maybe bad guidance, but guidance nonetheless!)
Thanks Jon
Hi Jon, I wanted to change the max_intron_len parameter in the training/prediction parts from the default setting (3000) to a larger value (25000). However, when running the exact same set of commands that had previously completed in about 6 hours, the new job had been going for about 18 hours when I looked at the output of top and noticed that nothing was happening under the hood. I noticed in at least one other thread that others have modified this code, and I didn't see anyone raise an issue about it. I'm curious what troubleshooting steps you'd advise. Thanks