Closed DGleason-680 closed 4 months ago
I have been running a sample set (i.e., paired-end reads) using MetaPro on an HPC cluster, and the job failed just after 72 hours with the following message at the end of the generated sample_err.txt file:
GeneMark.hmm 400-day license. License key "/home/dvan/.gm_key" not found. This file is neccessary in order to use GeneMark.hmm. SPADES_ok_MGM_fail
I have a MetaGeneMark license key in my /home/dvan directory, labeled "gm_key_64". Is this label okay to use? Do I just need to change the path in my config file? I had left the path as "MetaGeneMark_model = /pipeline_tools/mgm/MetaGeneMark_v1.mod" based on the config example in the tutorial:
MetaGeneMark_model: /pipeline_tools/mgm/MetaGeneMark_v1.mod #(This is in the container already. do not alter)
Some guidance on this issue would be greatly appreciated. I'm hoping this is the last issue I run into before actually obtaining some useful results from a successful run.
The following is the database section of my config.ini file:
[Databases]
database_path = /home/dvan/scratch/MetaPro/dbs/
UniVec_Core = %(database_path)s/univec_core/UniVec_Core.fasta
Adapter = %(database_path)s/trimmomatic_adapters/TruSeq3-PE-2.fa
Rfam = %(database_path)s/Rfam/Rfam.cm
DNA_DB = %(database_path)s/chocophlan_1/chocophlan_full.fasta
DNA_DB_Split = %(database_path)s/chocophlan_split/
Prot_DB = %(database_path)s/nr/nr
Prot_DB_reads = %(database_path)s/nr/nr
accession2taxid = %(database_path)s/accession2taxid/accession2taxid
nodes = %(database_path)s/WEVOTE_db/nodes_wevote.dmp
names = %(database_path)s/WEVOTE_db/names_wevote.dmp
Kaiju_db = %(database_path)s/kaiju_mine/kaiju_db_nr.fmi
Centrifuge_db = %(database_path)s/centrifuge_db/nt
SWISS_PROT = %(database_path)s/swiss_prot_db/
SWISS_PROT_map = %(database_path)s/swiss_prot_db/SwissProt_EC_Mapping.tsv
PriamDB = %(database_path)s/PRIAM_db/
DetectDB = %(database_path)s/DETECTv2/
WEVOTEDB = %(database_path)s/WEVOTE_db/
taxid_tree = %(database_path)s/taxid_trees/class_tree.tsv
kraken2_db = %(database_path)s/kraken2_db/
EC_pathway = %(database_path)s/EC_pathway/EC_pathway.txt
path_to_superpath = %(database_path)s/path_to_superpath/pathway_to_superpathway.csv
MetaGeneMark_model = /pipeline_tools/mgm/MetaGeneMark_v1.mod
Thank you, but it is not clear to me what needs to be done for the license key in my situation.
I have a MetaGeneMark license key in my home directory (/home/dvan). Is it possible that it's labeled incorrectly? It is currently "gm_key_64".
Place the key at /home/dvan/.gm_key_64 (note the leading ".").
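A minimal sketch of that rename, run against a throwaway directory here; on the real system the target would be the home directory (/home/dvan), and the key content below is a placeholder:

```shell
# Demo in a throwaway directory; on the cluster, use $HOME (/home/dvan) instead.
demo_home=$(mktemp -d)
printf 'LICENSE-KEY-PLACEHOLDER\n' > "$demo_home/gm_key_64"  # stand-in for the downloaded key
cp "$demo_home/gm_key_64" "$demo_home/.gm_key_64"            # note the leading dot
ls -A "$demo_home"
```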
That seemed to work - thank you.
However, I'm still not having successful runs of the pipeline. Every time I rerun the same sample set (i.e., paired-end reads), the job fails at a different stage of the pipeline. The last attempt ran for about two days and then failed, with this message in the sample_out.txt file:
/home/dvan/scratch/project1/output/024-010B/rRNA_filter/data/jobs/pair_1_363_infernal_pp not found. kill the pipe. restart this stage
Any suggestions to remedy this issue?
Looking forward to any recommendations on how to get a successful run through the pipeline.
the thing about the rRNA filter step is that we shard the reads and send them all through Infernal, hoping that your system has enough cores to make this step as painless as possible. But because there are a bunch of pieces flying around, we need to do some error-checking. This error says that pair_1, slice 363's Infernal post-processing step failed to materialize anything.
a few things you can do:
1) Check whether this is a false-positive error. If that sub-segment didn't create anything useful, re-run it manually. If it did finish successfully, see (2).
2) Bypass the error by adding a job marker. The jobs folder contains a bunch of empty files with specific names; we use these to track whether or not a parallel job has been completed.
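Option (2) can be sketched as below. The marker filename comes from the error message earlier in the thread; the directory is a temp stand-in for the real output path, and it is an assumption on my part that an empty file with the expected name satisfies MetaPro's completion check, per the description above:

```shell
# Stand-in for /home/dvan/scratch/project1/output/024-010B/rRNA_filter/data/jobs
jobs_dir=$(mktemp -d)
# Empty file whose name matches the missing job marker from the error message.
touch "$jobs_dir/pair_1_363_infernal_pp"
ls "$jobs_dir"
```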
I re-ran the job and now I'm back to having issues with the MGM license. Two jobs failed after ~80 hours because:
MGM did not produce a report. likely it didn't run
SPADes ran fine, but MGM failed. Check your MetaGeneMark license
I can confirm that I have ".gm_key_64" in my home directory.
$ cat .gm_key_64
AGATCAGACGAATCCACGAGGTACCCTACGTATGTTTTTTTTTTTTTTTTCACAGGCGCCCTTCAGATTCGGACGCCCCC
437719055
Perhaps it is the wrong license? Any support would be greatly appreciated so I can get some samples processed.
Also, is this an indication that the job was nearly complete? Is the MGM license part near the end of the workflow?
It's at the end of the SPAdes run. MGM is used to split the contigs into segments containing single genes.
Did you make sure your Docker bind mount included the home directory where your MGM key should be?
This is from my job submission script:
apptainer exec -B /home:/home $image python3 /pipeline/MetaPro.py --nhost -c $config -1 $read1 -2 $read2 --verbose_mode leave -o $output
However, my home directory is actually /home/dvan, and this is where the MGM license is located. So I will make this change (i.e., to "-B /home/dvan:/home") and hope it works. Thanks.
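A hedged side note (my assumption, not confirmed in the thread): since the GeneMark error references /home/dvan/.gm_key, binding the host directory to the *same* path inside the container keeps that path valid on both sides. $image, $config, $read1, $read2, and $output are the variables from the submission script above; the command is echoed rather than executed here, since apptainer is only available on the cluster:

```shell
# Bind spec with identical host and container paths (host:container),
# so /home/dvan/.gm_key resolves the same inside and outside the container.
bind="/home/dvan:/home/dvan"
# Echoed, not run: apptainer exists only on the cluster.
echo apptainer exec -B "$bind" '$image' python3 /pipeline/MetaPro.py \
    --nhost -c '$config' -1 '$read1' -2 '$read2' --verbose_mode leave -o '$output'
```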
It's important to note that the MGM key has to be labelled ".gm_key".
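Putting that final piece of advice together, a sketch run against a temp directory (on the real system the target directory is /home/dvan, and the key content is a placeholder):

```shell
# GeneMark looks for a file named exactly ".gm_key" in the home directory.
home_dir=$(mktemp -d)                            # stand-in for /home/dvan
printf 'KEY-CONTENT\n' > "$home_dir/gm_key_64"   # the key as downloaded
cp "$home_dir/gm_key_64" "$home_dir/.gm_key"     # rename to the expected dotfile
ls -A "$home_dir"
```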