Closed Philippemetagenomica closed 1 year ago
For changing resource allocations please have a look at https://nf-co.re/mag/2.1.1/usage#resource-requests
Meaning, add -c gtdb.config to your command line, where gtdb.config contains:
process {
    withName: GTDBTK_DB_PREPARATION {
        cpus = 12
        memory = 64.GB
        time = 300.h
    }
}
and modify the cpus/memory/time to your needs.
The memory limit is a common warning with docker (iirc). This very rarely actually causes an error.
The actual problem is the gtdb tar file appears to be corrupt. I would suggest downloading again if this is your own file, or if you were running the test profile delete the work directory and retry.
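A truncated or corrupt download can be checked before re-running the pipeline. The sketch below wraps `gzip -t`, which reads a gzip file end to end and exits non-zero if the stream is damaged. `check_archive` is a hypothetical helper name, not part of nf-core/mag.

```shell
# check_archive FILE: report whether FILE is a complete, readable gzip stream.
# gzip -t tests the compressed data without writing any output files.
check_archive() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK: $1"
    else
        echo "CORRUPT: $1"
        return 1
    fi
}
```

For example, `check_archive gtdbtk_r202_data.tar.gz` should print OK for a complete download. A CORRUPT result suggests downloading the file again.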
Thank you very much! I appreciate your support! It worked! However, another error has appeared:
[e3/61244c] NOTE: Process NFCORE_MAG:MAG:MEGAHIT (be30by)
terminated with an error exit status (250) -- Execution is retried (1)
[02/d484ff] NOTE: Process NFCORE_MAG:MAG:MEGAHIT (be30by)
terminated with an error exit status (250) -- Execution is retried (2)
[ac/18d2a2] NOTE: Process NFCORE_MAG:MAG:MEGAHIT (be30by)
terminated with an error exit status (250) -- Execution is retried (3)
Error executing process > 'NFCORE_MAG:MAG:MEGAHIT (be30by)'
Caused by:
Process NFCORE_MAG:MAG:MEGAHIT (be30by)
terminated with an error exit status (250)
Command executed:
megahit -t "16" -m 137438953472 -1 "be30by.phix_removed.unmapped_1.fastq.gz" -2 "be30by.phix_removed.unmapped_2.fastq.gz" -o MEGAHIT --out-prefix "be30by"
gzip -c "MEGAHIT/be30by.contigs.fa" > "MEGAHIT/be30by.contigs.fa.gz"
megahit --version | sed "s/MEGAHIT v//" > megahit.version.txt
Command exit status: 250
Command output: (empty)
Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
2022-05-27 21:52:23 - MEGAHIT v1.2.9
I think the solution will be similar, but how should I proceed from here?
By the way... I was wondering whether this error could recur in other modules of the pipeline. So, how can I adjust the parameters to my machine's capacity?
The memory limit is a common warning with docker (iirc). This very rarely actually causes an error.
The actual problem is the gtdb tar file appears to be corrupt. I would suggest downloading again if this is your own file, or if you were running the test profile delete the work directory and retry.
Yeah, I tried, but unfortunately it didn't work.
By the way... I was wondering whether this error could recur in other modules of the pipeline. So, how can I adjust the parameters to my machine's capacity?
The default settings of the pipeline allow a typical analysis to run smoothly. Nextflow pipelines are made for parallel processing, i.e. processes should not be allocated much more computational resources than they need, so that several of them can run in parallel. If you just set all resource settings to the maximum you have, it will generally make the pipeline much slower (only one process ever runs at a time).
Having said that, here is how you can do an untargeted change of settings: add -c process.config, where process.config contains:
process {
    cpus = 12
    memory = 64.GB
    time = 300.h
}
and adjust the values to your needs. Not sure that will help, but you can certainly try.
Hi! I tried this one, but the same error appeared at the beginning.
[85/c328bc] NOTE: Process NFCORE_MAG:MAG:MEGAHIT (be30by)
terminated with an error exit status (250) -- Execution is retried (1)
[d7/2f80a0] NOTE: Process NFCORE_MAG:MAG:MEGAHIT (be30by)
terminated with an error exit status (250) -- Execution is retried (2)
[f8/c2c701] NOTE: Process NFCORE_MAG:MAG:MEGAHIT (be30by)
terminated with an error exit status (250) -- Execution is retried (3)
Error executing process > 'NFCORE_MAG:MAG:MEGAHIT (be30by)'
Caused by:
Process NFCORE_MAG:MAG:MEGAHIT (be30by)
terminated with an error exit status (250)
Command executed:
megahit -t "16" -m 137438953472 -1 "be30by.phix_removed.unmapped_1.fastq.gz" -2 "be30by.phix_removed.unmapped_2.fastq.gz" -o MEGAHIT --out-prefix "be30by"
gzip -c "MEGAHIT/be30by.contigs.fa" > "MEGAHIT/be30by.contigs.fa.gz"
megahit --version | sed "s/MEGAHIT v//" > megahit.version.txt
Command exit status: 250
Command output: (empty)
Command error:
2022-05-31 04:47:18 - MEGAHIT v1.2.9
2022-05-31 04:47:18 - Using megahit_core with POPCNT support
2022-05-31 04:47:18 - Convert reads to binary library
gzip: invalid magic
So, I tried to combine the two config files into one, like this:
process {
    cpus = 12
    memory = 64.GB
    time = 300.h
    withName: GTDBTK_DB_PREPARATION {
        cpus = 12
        memory = 64.GB
        time = 300.h
    }
}
And again, the same error. I don't know what's happening. It's supposed to be a simple task. I don't know what is wrong: the download of the pipeline, Docker, or the files. Am I not worthy?
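One detail in the logs above may be worth checking: the MEGAHIT command still shows -t "16" and -m 137438953472 after process.config was added. A quick conversion shows what memory value the task actually received. This is only a sketch, and `bytes_to_gib` is a hypothetical helper name.

```shell
# bytes_to_gib N: print N bytes as whole GiB (integer division by 1024^3).
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}
```

`bytes_to_gib 137438953472` prints 128, i.e. 128 GiB rather than the 64.GB set in the config. One possible reading, stated here as an assumption rather than something confirmed in this thread, is that the pipeline's own per-process settings take precedence over a generic process block, so a targeted selector (for example withName: MEGAHIT, by analogy with the GTDBTK_DB_PREPARATION example) may be needed to change them.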
Hi @Philippemetagenomica
Looking at both the exit status (here 250):
[f8/c2c701] NOTE: Process NFCORE_MAG:MAG:MEGAHIT (be30by) terminated with an error exit status (250) -- Execution is retried (3)
and also the command error:
2022-05-31 04:47:18 - MEGAHIT v1.2.9
2022-05-31 04:47:18 - Using megahit_core with POPCNT support
2022-05-31 04:47:18 - Convert reads to binary library
gzip: invalid magic
Docker exit code 250 means 'no such file or directory'
And 'invalid magic' means the file is corrupted
Could you maybe try going into the working directory (something like: cd work/f8/c2c701<...>) and running cat MEGAHIT/be30by.contigs.fa? Maybe you can see what the error is.
I am still not completely convinced this is an error with CPUs/Time/Memory, as I don't see why this would result in corrupted files. The tool would normally die with a different error without trying to write files if you run out of memory.
Also, I note you've not posted your actual command. This would also be helpful, as well as the .nextflow.log files of the runs.
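Assuming the 'invalid magic' message refers to the gzip header, one quick check is whether each input file really starts with the two gzip magic bytes (1f 8b). A minimal sketch, with `has_gzip_magic` as a hypothetical helper name:

```shell
# has_gzip_magic FILE: succeed only if FILE starts with the gzip magic bytes 1f 8b.
has_gzip_magic() {
    # Read the first two bytes and print them as hex without spaces or newlines.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

`has_gzip_magic be30by.phix_removed.unmapped_1.fastq.gz && echo gzip` prints gzip only when the header matches. This checks the header only, so a file can pass it and still be truncated further on.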
Hi @jfy133, so I did what you recommended:
(genomics) philippe@proliant[c2c7012ccf6d33f184460ac4df19e1] ls [ 8:31PM]
MEGAHIT
be30by.phix_removed.unmapped_1.fastq.gz
be30by.phix_removed.unmapped_2.fastq.gz
(genomics) philippe@proliant[c2c7012ccf6d33f184460ac4df19e1] cd MEGAHIT
(genomics) philippe@proliant[MEGAHIT] ls [ 8:31PM]
be30by.log checkpoints.txt intermediate_contigs options.json tmp
(genomics) philippe@proliant[MEGAHIT] cd intermediate_contigs [ 8:31PM]
(genomics) philippe@proliant[intermediate_contigs] ls [ 8:33PM]
(genomics) philippe@proliant[intermediate_contigs] cd .. [ 8:33PM]
(genomics) philippe@proliant[MEGAHIT] ls [ 8:33PM]
be30by.log checkpoints.txt intermediate_contigs options.json tmp
(genomics) philippe@proliant[MEGAHIT] cat be30by.log [ 8:33PM]
2022-05-31 04:47:09 - MEGAHIT v1.2.9
2022-05-31 04:47:09 - Using megahit_core with POPCNT support
2022-05-31 04:47:09 - Convert reads to binary library
2022-05-31 04:47:09 - command /usr/local/bin/megahit_core_popcnt buildlib /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib
2022-05-31 04:47:09 - b"terminate called after throwing an instance of 'std::invalid_argument'"
2022-05-31 04:47:09 - b' what(): Cannot open file eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/inpipe.pe1.0'
2022-05-31 04:47:10 - Error occurs, please refer to /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/be30by.log for detail
2022-05-31 04:47:10 - Command: /usr/local/bin/megahit_core_popcnt buildlib /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib; Exit code -6
There is no file called be30by.contigs.fa:
(genomics) philippe@proliant[MEGAHIT] which be30by.contigs.fa [ 8:37PM]
be30by.contigs.fa not found
I totally agree with you. It was not supposed to be a problem of CPU/memory. I was wondering if it was something to do with the download of the pipeline. Well, I just followed the instructions on nf-core/mag. I still don't understand what's happening. This is what I found about the run:
(genomics) philippe@proliant[pipeline_info] ls [ 8:50PM]
execution_report_2022-05-31_00-32-06.html
execution_timeline_2022-05-31_00-32-06.html
execution_trace_2022-05-31_00-32-06.txt
pipeline_dag_2022-05-31_00-32-06.svg
Do you want to see any file?
Also I note you've not posted your actual command, this would also be helpful, as well as your
.nextflow.log
files of the runs
Oh, sorry, the command that I used was:
nextflow run nf-core/mag -profile docker -c process.config --input '*_R{1,2}.fastq.gz'
I have used it from the beginning. The only thing that I changed a couple of times was the .config file, as suggested in a couple of the comments above. That's it.
Sorry @Philippemetagenomica, I had to edit your previous post because it's very difficult to read outside code blocks. Please use markdown (triple backticks around the code/error logs) or use the toolbar on the comment box. Sorry and thank you!
So it seems again there is a problem with MEGAHIT crashing
2022-05-31 04:47:09 - b"terminate called after throwing an instance of 'std::invalid_argument'"
MEGAHIT suggests:
2022-05-31 04:47:10 - Error occurs, please refer to /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/be30by.log for detail
Are you running with multiple samples? You could try excluding that one sample and see if the rest of the dataset runs properly.
OMG! Sorry! I didn't know that it was helpful to write the code in a box. Sorry. So, that's the point: it's one single sample with R1 and R2 reads. I'm testing the pipeline so I can use it on all of the samples.
The be30by.log is:
2022-05-31 04:47:09 - MEGAHIT v1.2.9
2022-05-31 04:47:09 - Using megahit_core with POPCNT support
2022-05-31 04:47:09 - Convert reads to binary library
2022-05-31 04:47:09 - command /usr/local/bin/megahit_core_popcnt buildlib /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib
2022-05-31 04:47:09 - b"terminate called after throwing an instance of 'std::invalid_argument'"
2022-05-31 04:47:09 - b' what(): Cannot open file eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/inpipe.pe1.0'
2022-05-31 04:47:10 - Error occurs, please refer to /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/be30by.log for detail
2022-05-31 04:47:10 - Command: /usr/local/bin/megahit_core_popcnt buildlib /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib /mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib;
Exit code -6
Could it be that the path
/mnt/philippe/Sequenciamento eDNA/Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]/Raw_Data/work/f8/c2c7012ccf6d33f184460ac4df19e1/MEGAHIT/tmp/reads.lib
causes problems? I have never seen square brackets, i.e. [], in paths... It might be that MEGAHIT is upset by Characterizing_lignin-adapted_microbial_communities__eDNA_3rd_pass_30_C_BE-Lig_BY_[eDN3rd30CBELigBY_FD]?
Could you process the data in a folder/path that does not include brackets?
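A quick way to test this idea before moving any data is to check the run directory for characters that command-line tools often mishandle. The sketch below flags spaces as well as square brackets. That is an addition to the suggestion above, made because the log line 'Cannot open file eDNA/Characterizing_...' starts right after the space in 'Sequenciamento eDNA', which hints that the space may also be splitting the path. `path_is_safe` is a hypothetical helper name.

```shell
# path_is_safe PATH: succeed only if PATH contains no spaces or square brackets.
path_is_safe() {
    case "$1" in
        *' '*|*'['*|*']'*) return 1 ;;  # space, '[' or ']' found
        *) return 0 ;;
    esac
}
```

Run from the directory where the pipeline is launched, `path_is_safe "$PWD" || echo "rename or move this directory"` gives a warning only when the current path contains one of those characters.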
Hi! Your suggestion was very helpful. The analysis does indeed get further now. Thank you! But, unfortunately, we got another error.
Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.
The full error message was:
Error executing process > 'NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-be30by.3.fa)'
Caused by:
Process `NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-be30by.3.fa)` terminated with an error exit status (1)
(genomics) philippe@proliant[teste_pipeline] cd results/GenomeBinning/QC/BUSCO
(genomics) philippe@proliant[BUSCO] cat MEGAHIT-be30by.3.fa [11:47AM]
cat: MEGAHIT-be30by.3.fa: No such file or directory
Also, there is no directory with the taxonomy results. We are close! Just a few errors to correct.
The command that I used was
nextflow run nf-core/mag -profile docker -c process.config --input '*_R{1,2}.fastq.gz'
@Philippemetagenomica it would be helpful if you uploaded the .nextflow.log files here OR pasted the full error message as you were doing earlier.
This has the error report but not the error. Sometimes Nextflow crashes too quickly to print the whole error message to the screen, in which case the .nextflow.log file (wherever you ran the command) will record it.
(but thank you for the code blocks, it makes this much easier :pray:)
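If the log is too long to paste in full, the relevant part can usually be pulled out first. A small sketch, assuming the log marks failures with the word ERROR (`last_errors` is a hypothetical helper name):

```shell
# last_errors LOGFILE [N]: print the last N (default 20) lines that mention
# ERROR, each prefixed with its line number in the log.
last_errors() {
    grep -n 'ERROR' "$1" | tail -n "${2:-20}"
}
```

`last_errors .nextflow.log 5`, run from the directory where the pipeline was launched, prints the last five matching lines with their line numbers.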
Oh! It's because the error is so big, haha. I had to convert the HTML file to a PDF file. I believe it will be more helpful.
Nextflow Workflow Report.pdf
@jfy133 Here, the PDF file report.
@Philippemetagenomica please send the .nextflow.log file; the file is literally called that. However, if you read the end of the error in the PDF, you can see that it tells you what to do.
This is likely an issue with BUSCO now rather than with nf-core/mag itself, though.
OMG! How about now? What can we do?
It says so in the rest of the error message: change into the working directory it says it was working in when it failed, and read the error message there ;)
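In practice that means looking at the hidden files Nextflow writes into each task's work directory, such as .command.sh (the script that was run), .command.err (its stderr), .command.out (its stdout) and .exitcode. A small sketch that prints whichever of them exist; `show_task_files` is a hypothetical helper name, and the directory is whatever path the error message reports.

```shell
# show_task_files WORKDIR: print the task files Nextflow leaves in WORKDIR,
# each under a small header, skipping any that are missing.
show_task_files() {
    for f in .command.sh .command.err .command.out .exitcode; do
        if [ -f "$1/$f" ]; then
            echo "==> $f <=="
            cat "$1/$f"
        fi
    done
}
```

Calling it with the work directory named in the error message puts the executed script, its error output and its exit code on one screen, which is usually enough to see why a tool such as BUSCO stopped.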
Hi, I will close this issue since the original problem was solved. If the BUSCO
error still occurs, try with the most recent nf-core/mag version or open a new issue.
Hello everybody! I have some issues with this pipeline. It seems that my machine doesn't have enough memory to complete the workflow. Could I change the number of threads that the pipeline uses? Please be gentle with the explanation; I'm not exceptional with Linux.
[nf-core/mag] Pipeline completed with errors-
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Error executing process > 'NFCORE_MAG:MAG:GTDBTK:GTDBTK_DB_PREPARATION (gtdbtk_r202_data.tar.gz)'
Caused by:
Process NFCORE_MAG:MAG:GTDBTK:GTDBTK_DB_PREPARATION (gtdbtk_r202_data.tar.gz) terminated with an error exit status (2)
Command executed:
mkdir database
tar -xzf gtdbtk_r202_data.tar.gz -C database --strip 1
Command exit status: 2
Command output: (empty)
Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now