CDCgov / mycosnp-nf

Apache License 2.0

Minimum read length parameter #76

Closed. gaworj closed this issue 2 years ago

gaworj commented 2 years ago

Is your feature request related to a problem? Please describe.

I tried to run mycosnp on some of the clade-specific reference Candida auris SRA datasets plus my own Candida auris sample, which is a paired-end Illumina dataset with atypical read lengths: 120 bp for Read 1 and 30 bp for Read 2.

Describe the solution you'd like

Please add a minimum read length parameter for the read preprocessing step. I checked the pipeline's output files, and my atypical dataset is trimmed so heavily by FaQCs that no paired reads remain after the trimming step.

Describe alternatives you've considered

I cannot find a minimum read length parameter in the Nextflow pipeline modules, so I cannot suggest any alternatives.

Additional context

Please let me know how to configure the trimming step in the pipeline, or add the suggested parameter.

Best regards, Jan

mciprianoCDC commented 2 years ago

Hello @gaworj, the correct way to do this currently is to supply a custom config that passes additional args to FaQCs, or simply to edit the conf/modules.config file to include whatever custom settings you need.

Example (in conf/modules.config): find this section:

    withName: 'FAQCS' {
        ext.args         = { "--debug" }
        ext.when         = {  }
        publishDir       = [
            enabled: "${params.save_alignment}",
            mode: "${params.publish_dir_mode}",
            path: { "${params.outdir}/samples/${meta.id}/faqcs" },
            pattern: "*.{fastq.gz,txt}"
        ]
    }

Then edit this line: ext.args = { "--debug" } to include additional arguments to support your input sequences (leave the debug statement, it is needed by the pipeline).

From the faqcs manual: --min_L Trimmed read should have to be at least this minimum length (default:50)

Example change: ext.args = { "--debug --min_L 15" }
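
For the custom-config route mentioned above, a minimal sketch might look like the following. The file name custom.config is hypothetical, and the sketch assumes the process selector is 'FAQCS' exactly as it appears in conf/modules.config. Because this setting would replace the pipeline's own ext.args for that process rather than extend it, the --debug flag is repeated here.

```groovy
// custom.config (hypothetical file name)
// Overrides only the FaQCs arguments; other settings for the process
// (publishDir, etc.) are left as defined in conf/modules.config.
process {
    withName: 'FAQCS' {
        // Keep --debug (needed by the pipeline) and lower the minimum
        // trimmed read length from the FaQCs default of 50.
        ext.args = { "--debug --min_L 15" }
    }
}
```

Such a file would be passed with Nextflow's -c option (for example, adding -c custom.config to the usual nextflow run command), which avoids editing files inside the pipeline itself.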

Please let me know if this makes sense.

-Michael

gaworj commented 2 years ago

Hello Michael,

Thank you very much for your help and fast reply.

In the meantime I also checked the FaQCs manual and found the proper argument for read length.

What I did on my side:

I added the minimum length parameter directly in the modules/faqcs/main.nf module file:

    [ ! -f ${prefix}.fastq.gz ] && ln -s $reads ${prefix}.fastq.gz
    FaQCs \\
        -d . \\
        --min_L 25 \\
        -u ${prefix}.fastq.gz \\
        --prefix ${prefix} \\
        -t $task.cpus \\
        $args \\
        2> ${prefix}.fastp.log

Would it also work?

Bests, Jan

gaworj commented 2 years ago

Hello again,

The modules.config file was changed according to your instructions and now it works!

Thank you!

Jan