mpdunne / orthofiller

OrthoFiller: Identifying missing annotations for evolutionarily conserved genes.
GNU General Public License v3.0

Program crashes but ghost jobs keep going on #5

Open MatteoSchiavinato opened 7 years ago

MatteoSchiavinato commented 7 years ago

I recently ran an OrthoFiller process on 28 cores, and it crashed after some time because of an input file problem. I was also timing the run with time, which prints its output once the node considers the job terminated (correctly or not). However, 30-40 minutes later, the ghost jobs were still visible in top (using 0% CPU and 0% of the RAM).

mpdunne commented 7 years ago

Hi Matteo,

What was the input problem, and were you able to fix it?

Thanks,

Michael

MatteoSchiavinato commented 7 years ago

> What was the input problem, and were you able to fix it?

Nothing script-related: I hadn't indexed the FASTA files. I was more concerned about finding 28 ghost processes in top 30 minutes after it had crashed!

mpdunne commented 7 years ago

Okay thanks, I’ll look into that! If it’s something I can get the script to check for at the beginning, that would save time waiting for a failure. It would also give me a better chance of exiting the program gracefully and dealing directly with those processes.
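
The idea of checking the inputs at the beginning could look roughly like the sketch below. It is not OrthoFiller's code: `preflight` and `main` are hypothetical names, and the only checks shown (file exists, file is not empty) are placeholders for whatever the script really needs to verify. The point is the ordering: every check runs before any worker process is created, so a bad input leaves nothing behind to clean up.

```python
import os
import sys


def preflight(paths):
    """Return a list of human-readable problems found in the input paths."""
    problems = []
    for path in paths:
        if not os.path.isfile(path):
            problems.append("missing file: %s" % path)
        elif os.path.getsize(path) == 0:
            problems.append("empty file: %s" % path)
    return problems


def main(paths):
    """Hypothetical entry point: validate first, start workers only afterwards."""
    problems = preflight(paths)
    if problems:
        # Fail here, before any worker process exists.
        for problem in problems:
            sys.stderr.write(problem + "\n")
        return 1
    # ... only now create the worker pool and start the real work ...
    return 0
```

A script would typically finish with `sys.exit(main(sys.argv[1:]))`, so that a failed check is also reported to the cluster scheduler as a non-zero exit status.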

MatteoSchiavinato commented 7 years ago

> What was the input problem, and were you able to fix it?

I narrowed down the problem: I have one GTF file where not all CDS lengths are multiples of 3, one where some coordinates are duplicated, and one where some of the coordinates are not found in the FASTA file. The multiple-of-3 warning is the last one to appear, so it is probably the reason why it gets stuck.

This time I got no time output, so the process is still running. However, none of the 20 cores is using CPU or RAM; they seem idle!
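
The three problems described above can be detected up front with a short standalone check. The sketch below is an illustration only and is not taken from OrthoFiller. It rests on several assumptions: the multiple-of-3 test applies to the summed CDS length per `transcript_id`, "duplicated" means the same CDS coordinates repeated within one transcript, and "not found in the FASTA file" means the sequence name is absent or the interval runs past the end of the sequence. `fasta_lengths` and `check_gtf` are hypothetical names.

```python
import re
from collections import defaultdict

TRANSCRIPT_RE = re.compile(r'transcript_id "([^"]+)"')


def fasta_lengths(lines):
    """Map each sequence name (first word of the header) to its length."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def check_gtf(gtf_lines, seq_lengths):
    """Return a list of problem descriptions for the CDS rows of a GTF."""
    problems = []
    seen = set()
    cds_total = defaultdict(int)
    for number, line in enumerate(gtf_lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(fields) < 9 or fields[2] != "CDS":
            continue
        seqname, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        match = TRANSCRIPT_RE.search(fields[8])
        transcript = match.group(1) if match else "(no transcript_id)"
        # Assumption: a duplicate is the same interval repeated in one transcript.
        key = (transcript, seqname, start, end, strand)
        if key in seen:
            problems.append("line %d: duplicated CDS coordinates %s:%d-%d"
                            % (number, seqname, start, end))
        seen.add(key)
        if seqname not in seq_lengths:
            problems.append("line %d: sequence %s not in FASTA" % (number, seqname))
        elif start < 1 or end > seq_lengths[seqname]:
            problems.append("line %d: %s:%d-%d lies outside the sequence"
                            % (number, seqname, start, end))
        cds_total[transcript] += end - start + 1  # GTF intervals are 1-based, inclusive
    for transcript, total in sorted(cds_total.items()):
        if total % 3 != 0:
            problems.append("transcript %s: CDS length %d is not a multiple of 3"
                            % (transcript, total))
    return problems
```

Both functions take any iterable of lines, so they can be called on open file handles, for example `check_gtf(open("annotation.gtf"), fasta_lengths(open("genome.fa")))` with placeholder file names.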

MatteoSchiavinato commented 7 years ago

Follow-up on this topic:

If the consistency tests on the input files pass, the program continues normally and, even if it crashes later, leaves no ghost processes. If the tests fail, the threads stay up as ghosts without doing anything. Maybe the .join() function has not been invoked at that point?
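
For context on the `.join()` question: with Python's `multiprocessing` module, idle workers generally linger when the parent stops using a pool on an error path without terminating and joining it. The sketch below shows one common way to guard against that. It assumes a `multiprocessing.Pool` is in use, which the thread does not confirm, and `square` and `run_jobs` are hypothetical stand-ins rather than OrthoFiller functions.

```python
import multiprocessing


def square(x):
    """Toy stand-in for a real per-item job; fails on bad input."""
    if x < 0:
        raise ValueError("negative input: %r" % x)
    return x * x


def run_jobs(items, processes=2):
    """Run square() over items, always reaping the workers afterwards."""
    pool = multiprocessing.Pool(processes)
    try:
        results = pool.map(square, items)
    except BaseException:
        # Error path: stop idle or blocked workers immediately, then reap them,
        # so none are left behind showing 0% CPU in top.
        pool.terminate()
        pool.join()
        raise
    # Normal path: no more tasks will be submitted, so let the workers exit.
    pool.close()
    pool.join()
    return results
```

On platforms that start workers with spawn (Windows, macOS), the call to `run_jobs` should sit under an `if __name__ == "__main__":` guard. Catching `BaseException` rather than `Exception` means that Ctrl-C in the parent also triggers the cleanup.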

mpdunne commented 7 years ago

Hi Matteo,

I've looked into the issue with ghost processes, and this should be fixed in the next update.

All the best,

Michael