raw-lab / MetaCerberus

Python code for versatile Functional Ontology Assignments for Metagenomes searching via Hidden Markov Model (HMM) with environmental focus of shotgun metaomics data
BSD 3-Clause "New" or "Revised" License

Stuck with step_08-hmmer #12

Closed JSSaini closed 7 months ago

JSSaini commented 7 months ago

Hello, I ran this command to functionally annotate 553531 amino acid sequences.

metacerberus.py --amino SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.faa --hmm "KOFam_all, COG, VOG, PHROG, CAZy" --dir_out MG_553531_output_final3

However, after running for one day, the job was killed. How can I check the log to see what went wrong? I can see the output files, and most of the work appears to be done. How can I restart this job?

Below are the contents of timestamp.tsv and the list of files in step_08-hmmer. It looks like the final output is missing. Also, does it provide annotation of metabolism/pathways?

LBEUBUWKS-78    searchHMM       0:17:03.027626  ['CAZy-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
LBEUBUWKS-78    filterHMM       0:00:01.482673  Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531
LBEUBUWKS-78    searchHMM       3:58:08.692975  ['COG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
LBEUBUWKS-78    filterHMM       0:06:36.907495  Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531
LBEUBUWKS-78    searchHMM       10:45:27.882417 ['VOG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
LBEUBUWKS-78    filterHMM       0:00:31.614733  Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531
LBEUBUWKS-78    searchHMM       10:51:38.348634 ['PHROG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
LBEUBUWKS-78    filterHMM       0:01:13.635281  Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531
LBEUBUWKS-78    searchHMM       18:53:35.962929 ['KOFam_all-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 21 10:30 filtered-KOFam_all.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 21 10:30 filtered-KOFam_all.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam 252M Mar 21 10:30 KOFam_all-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam  37M Mar 21 02:30 filtered.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam 9.4M Mar 21 02:30 filtered-PHROG.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 21 02:28 filtered-PHROG.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam  14M Mar 21 02:28 PHROG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam 5.1M Mar 21 02:23 filtered-VOG.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 21 02:22 filtered-VOG.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam 7.3M Mar 21 02:22 VOG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam  22M Mar 20 19:42 filtered-COG.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 19:35 filtered-COG.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam  70M Mar 20 19:35 COG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam 635K Mar 20 15:54 filtered-CAZy.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:54 filtered-CAZy.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam 1.6M Mar 20 15:54 CAZy-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 CAZy-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 COG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 VOG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 KOFam_all-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 PHROG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
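
One way to read the timestamp.tsv rows above is to pull out the searchHMM durations per database. The sketch below is a minimal, hedged example: the column layout (host, step, duration, label) is inferred from the excerpt, and the helper names `parse_duration` and `search_times` are hypothetical. Note that all five `_tmp.err` files are stamped 15:37 and each raw table's modification time matches 15:37 plus its reported duration (for example 15:37 + 0:17 = 15:54 for CAZy), so the durations look like elapsed time since a common start rather than separate runtimes to be summed. That reading is an inference from the listing, not documented behavior.

```python
from datetime import timedelta

# Rows copied from the timestamp.tsv excerpt above.
# Assumed columns: host, step, duration, label.
ROWS = """\
LBEUBUWKS-78 searchHMM 0:17:03.027626 ['CAZy-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
LBEUBUWKS-78 filterHMM 0:00:01.482673 Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531
LBEUBUWKS-78 searchHMM 3:58:08.692975 ['COG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
LBEUBUWKS-78 filterHMM 0:06:36.907495 Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531
LBEUBUWKS-78 searchHMM 10:45:27.882417 ['VOG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
LBEUBUWKS-78 filterHMM 0:00:31.614733 Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531
LBEUBUWKS-78 searchHMM 10:51:38.348634 ['PHROG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
LBEUBUWKS-78 filterHMM 0:01:13.635281 Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531
LBEUBUWKS-78 searchHMM 18:53:35.962929 ['KOFam_all-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531']
"""


def parse_duration(text):
    """Convert an 'H:MM:SS.ffffff' string into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def search_times(rows):
    """Map each database name to its reported searchHMM duration."""
    times = {}
    for line in rows.splitlines():
        # Split on any whitespace so tab- and space-separated rows both work.
        fields = line.split(None, 3)
        if len(fields) < 4 or fields[1] != "searchHMM":
            continue
        label = fields[3].strip("[]'\"")
        database = label.split("-Protein_")[0]
        times[database] = parse_duration(fields[2])
    return times


times = search_times(ROWS)
for database, elapsed in sorted(times.items(), key=lambda item: item[1]):
    print(f"{database:<10} {elapsed}")

# The slowest search bounds the wall-clock time if the searches ran concurrently.
slowest = max(times, key=times.get)
print("slowest search:", slowest)
```

On the rows shown, the KOFam_all search is the one that dominates the run.
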
raw-lab commented 7 months ago

Odd. Can you send us a couple hundred lines of the file? This may be something with the headers. It should take less than 15 minutes for a file like this. How much RAM and how many CPUs are you giving the job?

JSSaini commented 7 months ago

Apparently, it has now finished successfully. There was a long pause during which I could not track any activity.

-rw-rw-r-- 1 bioinfoteam bioinfoteam  57M Mar 21 12:04 filtered.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam  21M Mar 21 12:04 filtered-KOFam_all.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 21 10:30 filtered-KOFam_all.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam 252M Mar 21 10:30 KOFam_all-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam 9.4M Mar 21 02:30 filtered-PHROG.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 21 02:28 filtered-PHROG.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam  14M Mar 21 02:28 PHROG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam 5.1M Mar 21 02:23 filtered-VOG.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 21 02:22 filtered-VOG.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam 7.3M Mar 21 02:22 VOG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam  22M Mar 20 19:42 filtered-COG.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 19:35 filtered-COG.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam  70M Mar 20 19:35 COG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam 635K Mar 20 15:54 filtered-CAZy.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:54 filtered-CAZy.log
-rw-rw-r-- 1 bioinfoteam bioinfoteam 1.6M Mar 20 15:54 CAZy-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531.tsv
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 CAZy-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 COG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 VOG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 KOFam_all-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err
-rw-rw-r-- 1 bioinfoteam bioinfoteam    0 Mar 20 15:37 PHROG-Protein_SAIN23-1_MAGS_ASSEMBLY_SUBSET_553531_tmp.err

Outputs for steps 9 and 10 were also obtained. We have 32 CPUs and approximately 200 GB of RAM by default. I didn't set the --cpu flag, which may be why it was slow. I also appreciate your prompt assistance, and thank you for the great tool.
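
The two directory listings in this thread differ only in `filtered-KOFam_all.tsv` and `filtered.tsv`: the first shows a 252M raw KOFam_all table next to a 0-byte filtered table, the second shows the filtered table populated. A small check along those lines can flag databases whose filtering has not produced output yet. This is a minimal sketch that assumes the `<DB>-Protein_<sample>.tsv` and `filtered-<DB>.tsv` naming seen in the listings; `pending_filters` is a hypothetical helper, not part of MetaCerberus.

```python
from pathlib import Path


def pending_filters(step_dir):
    """Return databases with a non-empty raw search table but an empty or missing filtered table."""
    step_dir = Path(step_dir)
    pending = []
    # Raw search tables follow the '<DB>-Protein_<sample>.tsv' pattern in the listings.
    for raw in sorted(step_dir.glob("*-Protein_*.tsv")):
        database = raw.name.split("-Protein_")[0]
        filtered = step_dir / f"filtered-{database}.tsv"
        raw_has_data = raw.stat().st_size > 0
        filtered_has_data = filtered.exists() and filtered.stat().st_size > 0
        if raw_has_data and not filtered_has_data:
            pending.append(database)
    return pending
```

Run against a directory in the state of the first listing, this would report only KOFam_all, and nothing for the second. An empty filtered table does not by itself tell a running filter from a killed job, so the result is best combined with a check that the process is still alive.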