Open assetdaniyarov opened 2 years ago
(mop2) prom@PC48A067:/path/mop2/MOP2/mop_mod$ /path/mop2/nextflow run /path/mop2/MOP2/mop_mod/mop_mod.nf -with-singularity -profile standard
N E X T F L O W ~ version 22.10.2
Launching `/path/mop2/MOP2/mop_mod/mop_mod.nf` [naughty_bassi] DSL2 - revision: 83320fa996
╔╦╗╔═╗╔═╗ ╔╦╗┌─┐┌┬┐
║║║║ ║╠═╝ ║║║│ │ ││
╩ ╩╚═╝╩ ╩ ╩└─┘─┴┘
====================================================
BIOCORE@CRG Master of Pores 2. Detection of RNA modification - N F ~ version 2.0
====================================================
***************** Input files *******************
input_path : /path/mop2/MOP2/mop_preprocess/output_1_1_fast5/
comparison : /path/mop2/MOP2/mop_mod/comparison.tsv
********** reference has to be the genome *************
reference : /data/PublicData/refSeq_transcripts/GRCh38_latest_rna.fna
output : /path/mop2/MOP2/mop_mod/output_mod
pars_tools : /path/mop2/MOP2/mop_mod/tools_opt.tsv
************************* Flows *******************************
epinano : YES
nanocompore : NO
tombo_lsc : YES
tombo_msc : YES
email :
Skipping the email
executor > local (19995)
[61/36fba8] process > checkRef (Checking GRCh38_latest_rna.fna) [100%] 1 of 1 ✔
[c0/d1417d] process > epinano_flow:splitReference (Splitting of reference.fa) [100%] 1 of 1 ✔
[b1/21ecdc] process > epinano_flow:splitBams (Splitting of wt_s.bam on pieces999.fa) [100%] 13320 of 13320 ✔
[96/675a4a] process > epinano_flow:indexReference (Indexing pieces999.fa) [100%] 6660 of 6660 ✔
[7b/e1b34a] process > epinano_flow:EPINANO_CALC_VAR_FREQUENCIES (wt___pieces1000_s.bam on wt) [ 0%] 0 of 13320
[- ] process > epinano_flow:joinEpinanoRes -
[- ] process > epinano_flow:makeEpinanoPlots_ins -
[- ] process > epinano_flow:makeEpinanoPlots_mis -
[- ] process > epinano_flow:makeEpinanoPlots_del -
[7b/56cd78] process > tombo_common_flow:multiToSingleFast5 (wt___PAI52977_pass_d5246539_0) [100%] 2 of 2 ✔
[5d/6c2b2c] process > tombo_common_flow:TOMBO_RESQUIGGLE_RNA:resquiggle_rna (mod___PAI53910_pass_e5b1... [100%] 2 of 2 ✔
[70/ecd689] process > getChromInfo (reference.fa) [100%] 1 of 1 ✔
[1c/0082b5] process > tombo_msc_flow:TOMBO_GET_MODIFICATION_MSC:getModificationsWithModelSampleCompar... [100%] 1 of 1 ✔
[- ] process > bedGraphToWig_msc [ 0%] 0 of 4
[75/9ce51c] process > tombo_lsc_flow:TOMBO_GET_MODIFICATION_LSC:getModificationsWithLevelSampleCompar... [100%] 1 of 1 ✔
[- ] process > bedGraphToWig_lsc [ 0%] 0 of 4
[- ] process > wigToBigWig [ 0%] 0 of 4
[- ] process > mergeTomboWigsPlus -
[- ] process > mergeTomboWigsMinus -
[01/de5629] process > EPINANO_VER:getVersion [100%] 1 of 1 ✔
[32/7baa16] process > NANOPOLISH_VER:getVersion [100%] 1 of 1 ✔
[d6/512a0f] process > NANOCOMPORE_VER:getVersion [100%] 1 of 1 ✔
[7e/1a182b] process > TOMBO_VER:getVersion [100%] 1 of 1 ✔
executor > local (19995)
[61/36fba8] process > checkRef (Checking GRCh38_latest_rna.fna) [100%] 1 of 1 ✔
[c0/d1417d] process > epinano_flow:splitReference (Splitting of reference.fa) [100%] 1 of 1 ✔
[b1/21ecdc] process > epinano_flow:splitBams (Splitting of wt_s.bam on pieces999.fa) [100%] 13320 of 13320 ✔
[96/675a4a] process > epinano_flow:indexReference (Indexing pieces999.fa) [100%] 6660 of 6660 ✔
[7b/e1b34a] process > epinano_flow:EPINANO_CALC_VAR_FREQUENCIES (wt___pieces1000_s.bam on wt) [ 0%] 1 of 13319, faile..
[- ] process > epinano_flow:joinEpinanoRes -
[- ] process > epinano_flow:makeEpinanoPlots_ins -
[- ] process > epinano_flow:makeEpinanoPlots_mis -
[- ] process > epinano_flow:makeEpinanoPlots_del -
[7b/56cd78] process > tombo_common_flow:multiToSingleFast5 (wt___PAI52977_pass_d5246539_0) [100%] 2 of 2 ✔
[5d/6c2b2c] process > tombo_common_flow:TOMBO_RESQUIGGLE_RNA:resquiggle_rna (mod___PAI53910_pass_e5b1... [100%] 2 of 2 ✔
[70/ecd689] process > getChromInfo (reference.fa) [100%] 1 of 1 ✔
[1c/0082b5] process > tombo_msc_flow:TOMBO_GET_MODIFICATION_MSC:getModificationsWithModelSampleCompar... [100%] 1 of 1 ✔
[- ] process > bedGraphToWig_msc [ 0%] 0 of 4
[75/9ce51c] process > tombo_lsc_flow:TOMBO_GET_MODIFICATION_LSC:getModificationsWithLevelSampleCompar... [100%] 1 of 1 ✔
[- ] process > bedGraphToWig_lsc [ 0%] 0 of 4
[- ] process > wigToBigWig [ 0%] 0 of 4
[- ] process > mergeTomboWigsPlus -
[- ] process > mergeTomboWigsMinus -
[01/de5629] process > EPINANO_VER:getVersion [100%] 1 of 1 ✔
[32/7baa16] process > NANOPOLISH_VER:getVersion [100%] 1 of 1 ✔
[d6/512a0f] process > NANOCOMPORE_VER:getVersion [100%] 1 of 1 ✔
[7e/1a182b] process > TOMBO_VER:getVersion [100%] 1 of 1 ✔
Error executing process > 'epinano_flow:EPINANO_CALC_VAR_FREQUENCIES (wt___pieces01_s.bam on wt)'
Caused by:
Process exceeded running time limit (1d 6h)
Command executed:
Epinano_Variants.py -n 50 -R pieces01.fa -b wt___pieces01_s.bam -s $SAM2TSV --type t
for i in *.csv; do gzip $i; done
Command exit status:
-
Command output:
(empty)
Command error:
wt___pieces01_s_TMP_ already exists, will overwrite it
Process Process-2:
Traceback (most recent call last):
File "/usr/local/python/versions/3.6.3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/local/python/versions/3.6.3/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/project/Epinano1.2.0/Epinano_Variants.py", line 45, in split_tsv_for_per_site_var_freq
firstline = next (tsv)
StopIteration
Work dir:
/path/mop2/MOP2/mop_mod/work/81/b93824c0eee7cd78fa4376609f1fe4
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
(mop2) prom@PC48A067:/path/mop2/MOP2/mop_mod$
(mop2) prom@PC48A067:/path/mop2/MOP2/mop_mod$ /path/mop2/nextflow run /path/mop2/MOP2/mop_mod/mop_mod.nf -with-singularity -profile local
N E X T F L O W ~ version 22.10.2
Launching `/path/mop2/MOP2/mop_mod/mop_mod.nf` [reverent_stone] DSL2 - revision: 83320fa996
╔╦╗╔═╗╔═╗ ╔╦╗┌─┐┌┬┐
║║║║ ║╠═╝ ║║║│ │ ││
╩ ╩╚═╝╩ ╩ ╩└─┘─┴┘
====================================================
BIOCORE@CRG Master of Pores 2. Detection of RNA modification - N F ~ version 2.0
====================================================
***************** Input files *******************
input_path : /data/adaniyarov/directRNA_DRS/data_Control_P2_EpiNano/mop2/MOP2/mop_preprocess/output_1_1_fast5/
comparison : /data/adaniyarov/directRNA_DRS/data_Control_P2_EpiNano/mop2/MOP2/mop_mod/comparison.tsv
********** reference has to be the genome *************
reference : /data/PublicData/refSeq_transcripts/GRCh38_latest_rna.fna
output : /data/adaniyarov/directRNA_DRS/data_Control_P2_EpiNano/mop2/MOP2/mop_mod/output_mod
pars_tools : /data/adaniyarov/directRNA_DRS/data_Control_P2_EpiNano/mop2/MOP2/mop_mod/tools_opt.tsv
************************* Flows *******************************
epinano : YES
nanocompore : YES
tombo_lsc : YES
tombo_msc : YES
email :
Skipping the email
executor > local (2)
[- ] process > checkRef [ 0%] 0 of 1
[- ] process > epinano_flow:splitReference -
[- ] process > epinano_flow:splitBams -
[- ] process > epinano_flow:indexReference -
[- ] process > epinano_flow:EPINANO_CALC_VAR_FREQUENCIES -
[- ] process > epinano_flow:joinEpinanoRes -
[- ] process > epinano_flow:makeEpinanoPlots_ins -
executor > local (14)
[5d/3b6f07] process > checkRef (Checking GRCh38_latest_rna.fna) [100%] 1 of 1 ✔
[27/434f09] process > epinano_flow:splitReference (Splitting of reference.fa) [100%] 1 of 1 ✔
[- ] process > epinano_flow:splitBams [ 0%] 0 of 13320
[- ] process > epinano_flow:indexReference [ 0%] 0 of 6660
[- ] process > epinano_flow:EPINANO_CALC_VAR_FREQUENCIES -
[- ] process > epinano_flow:joinEpinanoRes -
[- ] process > epinano_flow:makeEpinanoPlots_ins -
[- ] process > epinano_flow:makeEpinanoPlots_mis -
[- ] process > epinano_flow:makeEpinanoPlots_del -
[95/d371da] process > compore_polish_flow:getChromInfo (reference.fa) [100%] 1 of 1 ✔
[1d/bced2a] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:index (mod) [100%] 2 of 2 ✔
[82/021158] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalign (wt--PAI52977_pass_d524653... [ 50%] 1 of 2
[- ] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalignCollapse -
[- ] process > compore_polish_flow:mean_per_pos -
executor > local (14)
[5d/3b6f07] process > checkRef (Checking GRCh38_latest_rna.fna) [100%] 1 of 1 ✔
[27/434f09] process > epinano_flow:splitReference (Splitting of reference.fa) [100%] 1 of 1 ✔
[- ] process > epinano_flow:splitBams [ 0%] 0 of 13320
[- ] process > epinano_flow:indexReference [ 0%] 0 of 6660
[- ] process > epinano_flow:EPINANO_CALC_VAR_FREQUENCIES -
[- ] process > epinano_flow:joinEpinanoRes -
[- ] process > epinano_flow:makeEpinanoPlots_ins -
[- ] process > epinano_flow:makeEpinanoPlots_mis -
[- ] process > epinano_flow:makeEpinanoPlots_del -
[95/d371da] process > compore_polish_flow:getChromInfo (reference.fa) [100%] 1 of 1 ✔
[1d/bced2a] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:index (mod) [100%] 2 of 2 ✔
[82/021158] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalign (wt--PAI52977_pass_d524653... [100%] 1 of 1
[- ] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalignCollapse -
[- ] process > compore_polish_flow:mean_per_pos -
[- ] process > compore_polish_flow:concat_mean_per_pos -
[- ] process > compore_polish_flow:concat_csv_files -
[- ] process > compore_polish_flow:NANOCOMPORE_SAMPLE_COMPARE:sampleCompare -
[d1/8be1dd] process > tombo_common_flow:multiToSingleFast5 (wt___PAI52977_pass_d5246539_0) [100%] 2 of 2 ✔
[- ] process > tombo_common_flow:TOMBO_RESQUIGGLE_RNA:resquiggle_rna [ 0%] 0 of 2
[9e/834933] process > getChromInfo (reference.fa) [100%] 1 of 1 ✔
[- ] process > tombo_msc_flow:TOMBO_GET_MODIFICATION_MSC:getModificationsWithModelSampleCompare -
[- ] process > bedGraphToWig_msc -
[- ] process > tombo_lsc_flow:TOMBO_GET_MODIFICATION_LSC:getModificationsWithLevelSampleCompare -
[- ] process > bedGraphToWig_lsc -
[- ] process > wigToBigWig -
[- ] process > mergeTomboWigsPlus -
[- ] process > mergeTomboWigsMinus -
[ae/8a8e62] process > EPINANO_VER:getVersion [100%] 1 of 1 ✔
[cd/286cbd] process > NANOPOLISH_VER:getVersion [100%] 1 of 1 ✔
[e7/3dacdf] process > NANOCOMPORE_VER:getVersion [100%] 1 of 1 ✔
[77/6cc28e] process > TOMBO_VER:getVersion [100%] 1 of 1 ✔
Pulling Singularity image docker://biocorecrg/mopmod:0.7 [cache /data/adaniyarov/directRNA_DRS/data_Control_P2_EpiNano/mop2/MOP2/mop_mod/../singularity/biocorecrg-mopmod-0.7.img]
Error executing process > 'compore_polish_flow:mean_per_pos (mod)'
Caused by:
Failed to pull singularity image
command: singularity pull --name biocorecrg-mopmod-0.7.img.pulling.1668656146061 docker://biocorecrg/mopmod:0.7 > /dev/null
status : 255
message:
FATAL: While making image from oci registry: error fetching image to cache: failed to get checksum for docker://biocorecrg/mopmod:0.7: error pinging docker registry registry-1.docker.io: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io: Temporary failure in name resolution
(mop2) prom@PC48A067:/data/adaniyarov/directRNA_DRS/data_Control_P2_EpiNano/mop2/MOP2/mop_mod$
params.config.test file:
params {
input_path = "/data/adaniyarov/directRNA_DRS/data_Control_P2_EpiNano/mop2/MOP2/mop_preprocess/output_1_1_fast5/"
comparison = "$baseDir/comparison.tsv"
reference = "/data/PublicData/refSeq_transcripts/GRCh38_latest_rna.fna"
output = "$baseDir/output_mod"
pars_tools = "$baseDir/tools_opt.tsv"
// flows
epinano = "YES"
nanocompore = "YES"
tombo_lsc = "YES"
tombo_msc = "YES"
// epinano plots
epinano_plots = "YES"
email = ""
}
conf/local.config file:
process {
executor = 'local'
cpus = 110
memory = '300GB'
cache='lenient'
container = 'biocorecrg/mopprepr:0.7'
containerOptions = { workflow.containerEngine == "docker" ? '-u $(id -u):$(id -g)': null}
withLabel: big_cpus_ignore {
errorStrategy = 'ignore'
}
withLabel: basecall_gpus {
maxForks = 1
containerOptions = { workflow.containerEngine == "singularity" ? '--nv':
( workflow.containerEngine == "docker" ? '-u $(id -u):$(id -g) --gpus all': null ) }
}
}
Dear @aset8, I see that you are modifying the config. Since your machine has no job scheduling system, I would use the local config. However, I would not assign 110 CPUs to each process, because the pipeline needs to run several tasks in parallel. You can take this config as an example and remove the queue names and other custom settings such as clusterOptions:
https://github.com/biocorecrg/MOP2/blob/main/conf/sge.config
Let me know how it goes.
Luca
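To make the suggestion above concrete, here is a minimal sketch that writes a small override config and prints a command that would load it. The file name `local_small.config` and all numbers are assumptions for illustration, not MOP2 defaults; `executor.cpus` and `executor.memory` cap what the local executor may use in total, while `process.cpus` and `process.memory` apply per task. Whether these values win over the ones in the chosen profile depends on Nextflow's config precedence, so it is worth checking the resolved settings before a long run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name; it is not shipped with MOP2.
cfg="local_small.config"

# Quoted heredoc so nothing inside the config is expanded by the shell.
cat > "$cfg" <<'EOF'
// Sketch only: every value here is an illustrative assumption.
executor {
    cpus   = 100        // total CPUs the local executor may hand out
    memory = '300 GB'   // total memory the local executor may hand out
}

process {
    cpus   = 8          // per task, so several tasks can run side by side
    memory = '30 GB'    // per task
}
EOF

# Print the command rather than running it, so it can be reviewed first.
echo "nextflow -c $cfg run mop_mod.nf -with-singularity -profile local"
```

Keeping the per-task request small is what lets the thousands of `splitBams` and `EPINANO_CALC_VAR_FREQUENCIES` tasks shown in the log run concurrently instead of one at a time.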
@aset8 Not sure if this issue got resolved, but one thing that can mysteriously terminate MOP2 jobs is a memory watchdog running on a shared computing resource. It would kill the jobs without anything being written to the Nextflow logs. I would try running with fewer CPUs and roughly 30 to 60 GB of memory. Also, try running with Epinano only or Tombo only to narrow down the list of suspects.
Yes, with MoP2 you can run just one workflow at a time, so you can narrow down which processes are the most intensive. Let me know how it goes.
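One way to try a single flow without editing params.config is to override the flow switches on the command line. The sketch below only builds and prints such a command. The parameter names (`epinano`, `nanocompore`, `tombo_lsc`, `tombo_msc`) and the YES/NO values are taken from the params.config shown in this thread; the assumption is that the pipeline accepts them as ordinary `--name value` overrides.

```shell
#!/usr/bin/env bash
set -euo pipefail

# The one flow to keep enabled; change to nanocompore, tombo_lsc or tombo_msc.
only="epinano"
flows=(epinano nanocompore tombo_lsc tombo_msc)

# Build "--flow YES" for the chosen flow and "--flow NO" for the rest.
args=()
for f in "${flows[@]}"; do
    if [ "$f" = "$only" ]; then v="YES"; else v="NO"; fi
    args+=("--$f" "$v")
done

# Print the command instead of running it, so it can be checked first.
echo nextflow run mop_mod.nf -with-singularity -profile local "${args[@]}"
```

Running each flow on its own makes it easier to see which one stalls or gets killed.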
I get a similar message when running as a SLURM job on a high-performance computing center (372 GB memory and 56 cores):
Error executing process > 'epinano_flow:EPINANO_CALC_VAR_FREQUENCIES (STM_2uM_fast5___pieces1016_s.bam on STM_2uM_fast5)'
Caused by:
Process exceeded running time limit (6h)
Command executed:
Epinano_Variants.py -n 8 -R pieces1016.fa -b STM_2uM_fast5___pieces1016_s.bam -s $SAM2TSV --type t
for i in *.csv; do gzip $i; done
Command exit status:
-
Command output:
(empty)
Command error:
STM_2uM_fast5___pieces1016_s_TMP_ already exists, will overwrite it
Process Process-2:
Traceback (most recent call last):
File "/usr/local/python/versions/3.6.3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/local/python/versions/3.6.3/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/project/Epinano1.2.0/Epinano_Variants.py", line 45, in split_tsv_for_per_site_var_freq
firstline = next (tsv)
StopIteration
Work dir:
/staging/biology/andreachi77/MOP2/mop_mod/work/7f/cd174ae48d7358f207700acb952a3d
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
How do I continue?
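A first step could be to look inside the work dir named in the error, as the tip in the log suggests. The sketch below prints the tail of the standard files Nextflow leaves in each task directory; the helper name `inspect_workdir` is made up for this example. The `StopIteration` raised at `firstline = next (tsv)` means the iterator was already empty when its first line was requested, which hints (but does not prove) that the task had no alignments to work on. If samtools is available, `samtools view -c` on the BAM piece would show whether it contains any reads.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: show the last lines of the per-task files
# that Nextflow writes into a work directory.
inspect_workdir() {
    local dir="$1"
    local f
    for f in .command.sh .command.err .command.out .exitcode; do
        if [ -s "$dir/$f" ]; then
            echo "== $f =="
            tail -n 20 "$dir/$f"
        else
            echo "== $f: missing or empty =="
        fi
    done
}

# Usage: pass the "Work dir" path from the error; defaults to the current dir.
inspect_workdir "${1:-.}"
```

After changing the configuration, relaunching with `-resume` lets Nextflow reuse the tasks that already finished instead of repeating them.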
Which profile are you using?
Hi,
Are these the profiles you mentioned?
tools_opt.tsv:
#flows tool extrapars
epinano epinano ""
nanocompore nanopolish ""
nanocompore nanocompore "--sequence_context 2 --downsample_high_coverage 10000"
tombo_resquiggling tombo ""
tombo_msc tombo ""
tombo_lsc tombo ""
params.config:
params {
input_path = "$baseDir/../mop_preprocess/output/"
comparison = "$baseDir/comparison.tsv"
reference = "$baseDir/../anno/Human.v41CRCh38.p13.transcripts.fa"
output = "$baseDir/output_mod"
pars_tools = "$baseDir/tools_opt.tsv"
// flows
epinano = "YES"
nanocompore = "YES"
tombo_lsc = "YES"
tombo_msc = "YES"
// epinano plots
epinano_plots = "YES"
email = "andrea_chi@hotmail.com"
}
@AndreaYCT No, there are config files in the conf subdirectory, such as slurm.config, local.config and so on. If you use the slurm profile, for example, each process is assigned a label from those listed in slurm.config, and you can set the number of CPUs and the memory allocated to a process in those config files. It seems you are using the local profile, so you may want to modify the labels listed in local.config.
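As an illustration of adjusting a single process, the sketch below writes an override that raises the `time` limit for the Epinano step that timed out in this thread. The file name, the selector pattern and the 48h value are assumptions; the label names actually used by MOP2 are in the files under conf and are not reproduced here. Note that if the task is stuck on an empty input, a longer limit alone may not help.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name; it is not shipped with MOP2.
cfg="epinano_time.config"

# Quoted heredoc so nothing inside the config is expanded by the shell.
cat > "$cfg" <<'EOF'
// Sketch only: the selector pattern and the 48h value are assumptions.
process {
    withName: '.*EPINANO_CALC_VAR_FREQUENCIES' {
        time = '48h'    // the runs in this thread hit limits of 6h and 1d 6h
    }
}
EOF

# Print the command rather than running it, so it can be reviewed first.
echo "nextflow -c $cfg run mop_mod.nf -with-singularity -profile local -resume"
```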
Salutations,
First of all, thank you very much for your great and important work. I have a few questions. I am running Master of Pores 2 on a Nanopore PromethION workstation (370 GB RAM, 112 cores). The first stage, mop_preprocess.nf, finished quite quickly. The same cannot be said of the second stage, mop_mod.nf.
The modification-detection stage (mop_mod.nf) has been running for almost 24 hours with no visible signs of work. htop shows no load on the server.
I tried it with -profile standard and with -profile local. No change.
Could you please advise how to set up the configuration so that the work finishes as quickly as possible?
I also have a few more questions about using references (genome, transcriptome) and about running Master of Pores 2 on another machine. I will open another discussion for those. Thanks.