replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

update guppy #204

Open replikation opened 2 years ago

replikation commented 2 years ago

The new Guppy version 6 no longer requires arrangement files for demultiplexing and has some other changes.

Demultiplexing could look like this:

process collect_fastq {
        label 'demultiplex'
    input:
        tuple val(name), val(technology), path(dir)
    output:
        tuple val(name), path("*.fastq.gz"), emit: reads
    script:
        if (params.single)
        """
        find -L ${dir} -name '*.fastq' -exec cat {} +  | gzip > ${name}.fastq.gz
        find -L ${dir} -name '*.fastq.gz' -exec zcat {} + | gzip >> ${name}.fastq.gz
        """
        else if (!params.single)
        """
        BARCODE_DIRS=\$(find -L ${dir} -name "barcode??" -type d)

        if [ -z "\${BARCODE_DIRS}" ]; then 
            guppy_barcoder -t ${task.cpus} -r -i ${dir} -s fastq_outbreak  \
                    --detect_mid_strand_barcodes \
                    --min_score_barcode_mid 50 \
                    --trim_adapters \
                    --trim_barcodes \
                    --disable_pings  

            for barcodes in fastq_outbreak/barcode??; do
                find -L \${barcodes} -name '*.fastq' -exec cat {} + | gzip >> \${barcodes##*/}.fastq.gz
            done
        else
            for barcodes in \${BARCODE_DIRS}; do
                find -L \${barcodes} -name '*.fastq' -exec cat {} + | gzip >> \${barcodes##*/}.fastq.gz
                find -L \${barcodes} -name '*.fastq.gz' -exec zcat {} + | gzip >> \${barcodes##*/}.fastq.gz
            done
        fi
        """
    stub:
        """
        touch ${name}.fastq.gz
        """
}
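
The per-barcode merge step in the process above can also be tried outside Nextflow. The following is only a sketch under assumptions: `merge_barcodes` and its two arguments are hypothetical names, not part of poreCov, and already-gzipped files are appended with `cat` instead of `zcat | gzip`, which relies on concatenated gzip members forming a valid gzip file.

```shell
# Sketch only: merge all reads below each barcodeNN directory into one
# <barcode>.fastq.gz. merge_barcodes is a hypothetical helper name.
merge_barcodes() {
    local in_dir="$1" out_dir="$2" bc out
    mkdir -p "$out_dir"
    # -L follows symlinks, as in the process above
    find -L "$in_dir" -type d -name 'barcode??' | sort | while read -r bc; do
        out="$out_dir/${bc##*/}.fastq.gz"
        : > "$out"    # start from an empty file so reruns do not append twice
        # plain FASTQ files are compressed on the fly
        find -L "$bc" -name '*.fastq' -exec cat {} + | gzip >> "$out"
        # gzipped FASTQ files are appended as additional gzip members
        find -L "$bc" -name '*.fastq.gz' -exec cat {} + >> "$out"
    done
}

# Example call (paths are placeholders):
# merge_barcodes fastq_outbreak merged_reads
```

Truncating the output file first avoids the duplicated reads that a plain `>>` would produce if the step is run twice in the same directory.
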

replikation commented 2 years ago

The workflow also needs to be updated to the most recent Medaka/ARTIC version and model.

hoelzer commented 1 year ago

I want to bump this issue because we now have a Medaka update in the ARTIC process. It would be good to have an updated Guppy so people can use the R1041 models for basecalling and the matching Medaka models.

@replikation this should be relatively straightforward, right? We should just do proper testing, which we can also assist with.

replikation commented 1 year ago

They changed some of the adapter-trimming flags in the new Guppy, so it will not be a drop-in replacement.
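
One way to see which of the currently used options a new release still accepts is to compare them against the tool's help text. This is a rough sketch, assuming `guppy_barcoder --help` lists its long options; `missing_flags` is a hypothetical helper name, and the flag list is taken from the process earlier in this issue.

```shell
# Sketch only: print every flag given as an argument that does not appear
# as a whole word in the help text read from stdin.
# missing_flags is a hypothetical helper, not part of Guppy or poreCov.
missing_flags() {
    local help_text flag
    help_text="$(cat)"
    for flag in "$@"; do
        # -F fixed string, -w whole-word match, -e allows a pattern starting with "--"
        if ! printf '%s\n' "$help_text" | grep -qwF -e "$flag"; then
            printf '%s\n' "$flag"
        fi
    done
}

# Example call (assumes guppy_barcoder is on PATH):
# guppy_barcoder --help | missing_flags --detect_mid_strand_barcodes \
#     --min_score_barcode_mid --trim_adapters --trim_barcodes --disable_pings
```

Any flag printed by this check would need a replacement looked up in the release notes before the process is updated.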

hoelzer commented 1 year ago

I see, so it would need proper adjustment and testing for all the different supported settings. But

docker pull nanozoo/guppy_gpu:6.4.6-1--2c17584

seems recent enough for testing.