Putnam-Lab / Lab_Management


nf-core methylseq input fastq.gz options update in new version #60

Closed daniellembecker closed 1 year ago

daniellembecker commented 1 year ago

I was recently following an old script of mine that uses the nf-core methylseq pipeline and started running into a new error that was not an issue before. I used to pass the input files to the pipeline with '--reads <path to files>/*R{1,2}.fastq.gz', as in the script below, but with the newest nf-core methylseq version this no longer works. You must now follow the new instructions here, use the '--input' option, and create a csv file that lists all your samples and the path to each one.

Notebook post for full scripts and steps here.

Old script:

nextflow run nf-core/methylseq -profile singularity \
    --aligner bismark \
    --fasta /data/putnamlab/dbecks/Becker_E5/Pmean_WGBS_compare/refs/.fasta \
    --save_reference \
    --reads '/data/putnamlab/dbecks/Becker_E5/Pmean_WGBS_compare/data/raw/*_R{1,2}_$' \
    --clip_r1 10 \
    --clip_r2 10 \
    --three_prime_clip_r1 10 --three_prime_clip_r2 10 \
    --non_directional \
    --cytosine_report \
    --relax_mismatches \
    --unmapped \
    --outdir Pmean_WGBS \
    -name Pmean_WGBS_methyl

Output error:

ERROR: Validation of pipeline parameters failed!
--input: string [/data/putnamlab/dbecks/Becker_E5/Pmean_WGBS_compare/data/raw/*_R{1,2}_001.fastq.gz] does not match pattern ^\S+\.csv$ (/data/putnamlab/dbecks/Becker_E5/Pmean_WGBS_compare/data/raw/*_R{1,2}_001.fastq.gz)
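To see why the validation rejects the old glob, the pattern from the error message can be tried out on its own. This is only a small sketch: it uses Python's re module as a stand-in for whatever regex engine the pipeline uses, and the name samplesheet.csv is a made-up example, not a file from this thread.

```python
import re

# Pattern copied from the error message: no whitespace, must end in ".csv".
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_input(value: str) -> bool:
    """Return True if the value would satisfy the --input pattern."""
    return SAMPLESHEET_PATTERN.match(value) is not None


# The old fastq glob, as it appears in the error message.
old_style = "/data/putnamlab/dbecks/Becker_E5/Pmean_WGBS_compare/data/raw/*_R{1,2}_001.fastq.gz"
# Hypothetical samplesheet name, for illustration only.
new_style = "samplesheet.csv"

print(is_valid_input(old_style))  # False: does not end in .csv
print(is_valid_input(new_style))  # True
```

So any value that is not a single whitespace-free path ending in .csv fails before the pipeline even starts.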

Once you have made the new csv file, replace the '--reads' option in your script with '--input <path to samplesheet file>'.
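If you have a lot of samples, the csv file can be generated instead of typed by hand. Below is a minimal sketch, not the official way to do it. It assumes the samplesheet columns are sample, fastq_1 and fastq_2 (check the pipeline's usage instructions for the version you are running) and that the paired files are named <sample>_R1_001.fastq.gz and <sample>_R2_001.fastq.gz, as in the glob from the error message. The function names are made up for this example.

```python
import csv
from pathlib import Path

# Assumed file naming, taken from the glob in the error message.
R1_SUFFIX = "_R1_001.fastq.gz"
R2_SUFFIX = "_R2_001.fastq.gz"

# Assumed samplesheet columns; verify against the pipeline's usage docs.
COLUMNS = ["sample", "fastq_1", "fastq_2"]


def build_rows(raw_dir):
    """Pair each R1 file in raw_dir with its R2 mate, one row per sample."""
    rows = []
    for r1 in sorted(Path(raw_dir).glob(f"*{R1_SUFFIX}")):
        sample = r1.name[: -len(R1_SUFFIX)]
        r2 = r1.with_name(sample + R2_SUFFIX)
        if not r2.exists():
            raise FileNotFoundError(f"missing R2 mate for {r1}")
        rows.append(
            {
                "sample": sample,
                # absolute() keeps the path as given and does not follow symlinks
                "fastq_1": str(r1.absolute()),
                "fastq_2": str(r2.absolute()),
            }
        )
    return rows


def write_samplesheet(raw_dir, out_csv):
    """Write the samplesheet and return the number of samples written."""
    rows = build_rows(raw_dir)
    with open(out_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling write_samplesheet("<path to raw data>", "samplesheet.csv") would then produce the file to pass to '--input'. The script stops with an error if any R1 file has no matching R2 file, so a half-transferred dataset does not silently end up in the sheet.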

daniellembecker commented 1 year ago

Example .csv file

[Screenshot: example samplesheet .csv file]

daniellembecker commented 1 year ago

Another new update to the script:

There is an updated module version that you also need to load in your script:

module load Nextflow/22.10.1

daniellembecker commented 1 year ago

I was still getting an error with this version of nf-core and had to change the Singularity settings in an interactive session:

So I think I had the right idea with the singularity bind directories, but the wrong way to set them. Try opening ~/.nextflow/assets/nf-core/methylseq/nextflow.config with nano (in an interactive session), setting singularity.autoMounts = false, and then adding a line right after it: singularity.runOptions = '-B /glfs'. It should look like this:

singularity {
    singularity.enabled    = true
    singularity.autoMounts = false
    singularity.runOptions = '-B /glfs'
    docker.enabled         = false
    podman.enabled         = false
    shifter.enabled        = false
    charliecloud.enabled   = false
}

autoMounts only mounts the work directory, but the reference file has been symlinked in from outside that directory, so now the container can’t see it. It should really bind the mount point that the work directory is on (which is what the config changes above do). I think it’s a bug in nextflow, or maybe the pipeline (I don’t know where one begins and the other ends for this stuff).
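If you want to check whether you are hitting the same problem, a rough way is to look for symlinks in the work directory that point outside the directories the container has bound. The sketch below only mirrors the explanation above. It does not ask Singularity what is actually mounted, and the function name and arguments are made up for this example.

```python
import os


def symlinks_outside(work_dir, bound_roots):
    """Return (link, target) pairs under work_dir whose resolved target
    does not sit under any of the bound_roots directories."""
    roots = [os.path.realpath(root) for root in bound_roots]
    escaped = []
    # os.walk does not follow symlinks by default, so linked directories
    # show up in dirnames and linked files in filenames.
    for dirpath, dirnames, filenames in os.walk(work_dir):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            inside = any(
                os.path.commonpath([target, root]) == root for root in roots
            )
            if not inside:
                escaped.append((path, target))
    return escaped
```

Calling symlinks_outside(work_dir, [work_dir]) lists the links that would be invisible if only the work directory were mounted. Adding "/glfs" to the list should make the reference file drop out of the result if the '-B /glfs' bind covers it.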