Open FutureFellwalker opened 10 months ago
Hey,
I have successfully run the pipeline using --aligner bismark
I am having the same issue reported here when I am trying to repeat the analysis using bwameth
instead of bismark
:
The following test works fine
nextflow run nf-core/methylseq --aligner bwameth --outdir $RefDir -profile test,conda
.fasta
genome--bwa_meth_index
generated with bwameth.py index Genome.fasta
*.fastq.gz
files--fasta_index
, nor do I know how to feed multiple sample indexes into the pipeline. I assume this is something that the pipeline does by itself using the --input $SampleSheet.csv
, but am I wrong?N E X T F L O W ~ version 23.10.0 nf-core/methylseq v2.5.0-g66c6138 Conda
nextflow run nf-core/methylseq \
--input $SampleSheet.csv \
--fasta $Genome.fasta \
--outdir $RefDir \
--bwa_meth_index $DirContainingRefFastaGenome \
--email user@email.com \
--aligner bwameth \
--comprehensive \
--max_cpus 40 \
--max_memory 400.GB \
-profile conda
Error:
-[nf-core/methylseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_METHYLSEQ:METHYLSEQ:PREPARE_GENOME:SAMTOOLS_FAIDX'
Caused by:
Not a valid path value type: groovyx.gpars.dataflow.DataflowVariable (DataflowVariable(value=/local/storage/Projects/ppar_emseq/data/005_ppar_emseq_muscle/sub/P_parae_wMito+lambda+puc19.fasta))
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
-- Check '.nextflow.log' file for details
I have tried including and excluding the following flags, but no version has worked:
--save_align_intermeds \
--ignore_flags \
--save_reference \
I appreciate any help! :)
Maximiliano
Mmm, I guess I found the problem.
This is NOT a bug. It is the result of insufficient input to the pipeline with bwameth
You need to provide the 2 indexed versions of the genome
--bwa_meth_index $DirContainingRefGenome versions \
Install bwameth and use
bwameth.py --reference $RefGenome.fasta
This will produce a series of indexed converted genomes in a container folder (Later, you will use that container folder path as the argument in the--bwa_meth_index
flag)
--fasta_index $RefGenome.fai \
Use SamTools
samtools faidx $RefGenome.fasta
It will produce a$RefGenome.fai
This is an index of the non-converted genome See a good explanation here
This is NOT a bug. It is the result of insufficient input to the pipeline with
bwameth
I disagree. You've found your way around the bug by providing more inputs, meaning that the pipeline did not need to create them. However, the pipeline should automatically generate the index files itself if only given a Fasta file.
This is what we need to fix:
Not a valid path value type: groovyx.gpars.dataflow.DataflowVariable (DataflowVariable(value=/local/storage/Projects/ppar_emseq/data/005_ppar_emseq_muscle/sub/P_parae_wMito+lambda+puc19.fasta))
Looks like there's something wrong with how the channels are being built / supplied.
Hi,
I also encountered the same issue when trying to run bwameth
without supplying the indexes. But in my case, supplying --fasta_index
was sufficient to work around the bug (my run succeeded), which narrowed down and suggested that the issue arises somewhere during the process of building index using samtools faidx
.
Has this been fixed? I'm still having the same error when trying to use bwameth as the aligner
Has this been fixed? I'm still having the same error when trying to use bwameth as the aligner
@drothen15 Not sure if a fix has been included in a release yet, but I was able to fix this locally by modifying the SAMTOOLS_FAIDX
process inputs.
In the prepare_genome.nf
file, when calling the SAMTOOLS_FAIDX
process if the params.fasta_index
is not specified, there is an issue when trying to pass a value channel in as part of the tuple. I think it would work if you either 1) removed the tuple, or 2) switched the ch_fasta
input to be just the file(params.fasta)
instead of Channel.value(file(params.fasta))
.
I resolved it by removing the tuple altogether, and then updating the main.nf
containing the SAMTOOLS_FAIDX
process to expect a val and path input, instead of a tuple with a val and path. If you just change the original process input as specified in prepare_genome.nf
to be a file instead of the value channel, there should be no need to update the SAMTOOLS_FAIDX:main.nf
.
One solution (that I used):
// inside prepare_genome.nf
SAMTOOLS_FAIDX([:], ch_fasta)
instead of
SAMTOOLS_FAIDX([[:], ch_fasta])
And then modifying SAMTOOLS_FAIDX:main.nf
:
input:
val(meta)
path(fasta)
instead of
input:
tuple val(meta), path(fasta)
Mmm, I guess I found the problem. This is NOT a bug. It is the result of insufficient input to the pipeline with
bwameth
You need to provide the 2 indexed versions of the genome
--bwa_meth_index $DirContainingRefGenome versions \
Install bwameth and use
bwameth.py --reference $RefGenome.fasta
This will produce a series of indexed converted genomes in a container folder (Later, you will use that container folder path as the argument in the--bwa_meth_index
flag)
--fasta_index $RefGenome.fai \
Use SamTools
samtools faidx $RefGenome.fasta
It will produce a$RefGenome.fai
This is an index of the non-converted genome See a good explanation here
Sorry, I have a doubt: is the input to --bwa_meth_index
, i.e. $DirContainingRefGenome, the output of the bwameth.py --reference $RefGenome.fasta
as mentioned in this reply or the output of bwameth.py index $RefGenome.fasta
?
Thank you!
Mmm, I guess I found the problem. This is NOT a bug. It is the result of insufficient input to the pipeline with
bwameth
You need to provide the 2 indexed versions of the genome
--bwa_meth_index $DirContainingRefGenome versions \
Install bwameth and use
bwameth.py --reference $RefGenome.fasta
This will produce a series of indexed converted genomes in a container folder (Later, you will use that container folder path as the argument in the--bwa_meth_index
flag)
--fasta_index $RefGenome.fai \
Use SamTools
samtools faidx $RefGenome.fasta
It will produce a$RefGenome.fai
This is an index of the non-converted genome See a good explanation hereSorry, I have a doubt: is the input to
--bwa_meth_index
, i.e. $DirContainingRefGenome, the output of thebwameth.py --reference $RefGenome.fasta
as mentioned in this reply or the output ofbwameth.py index $RefGenome.fasta
?Thank you!
I think the later, for me that's how I got it to work
Description of the bug
Hi.
I'm encountering an error with the methylsig pipeline when using bwameth aligner. The pipeline runs fine with Bismark, but my mapping is low (~40%) and I want to compare against bwameth.
The error seems to be in preparing the reference genome index. I tried saving the reference genome locally and running offline, but encounter the same issue. Tried with both methylsig 2.4.0 and 2.5.0.
Command used and terminal output
Relevant files
nextflow.log
System information
Nextflow version 23.04.3 HPC, LSF executor nf-core/methylseq 2.4.0 and 2.5.0 Singularity container