Calling SV/SNP when using a bam file generated with a remora_cfg

Ask away!

Hello!

I've been generating my BAM files with the command below. (I understand this route is deprecated as of today and that I should use wf-basecalling instead?)

First I create the BAMs, which is why I include --cnv (because that saves the files as .bam):

    BASECALL_MODEL="dna_r10.4.1_e8.2_400bps_hac@v4.2.0"
    MOD_MODEL="dorado-models/dna_r10.4.1_e8.2_400bps_hac@v4.2.0_5mCG_5hmCG@v2/"

    ./nextflow run epi2me-labs/wf-human-variation \
        -w ${OUTPUT}/workspace \
        -profile standard \
        --sample_name ${SAMPLE} \
        --mod --cnv \
        --dorado_ext pod5 \
        --fast5_dir ${POD5_DIR}/ \
        --basecaller_cfg ${BASECALL_MODEL}  \
        --remora_cfg 'custom' \
        --remora_model_path ${MOD_MODEL} \
        --ref references/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
        --bam_min_coverage 0 \
        --threads 16 \
        --out_dir ${OUTPUT} \
        -resume

Then I run a similar command:

    BASECALL_MODEL="dna_r10.4.1_e8.2_400bps_hac@v4.2.0"

    ./nextflow run epi2me-labs/wf-human-variation \
        -w ${OUTPUT}/workspace \
        -profile standard \
        --sample_name ${SAMPLE} \
        --snp --sv \
        --basecaller_cfg ${BASECALL_MODEL}  \
        --ref references/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
        --bam_min_coverage 0 \
        --threads 16 \
        --out_dir ${OUTPUT} \
        -resume

I do this in two parts to save cost: a GPU cloud instance for basecalling and a CPU instance for everything else. My questions are:

Is it correct to pass both a basecaller_cfg and remora_model_path to the basecaller?
When calling --snp and --sv, since I used a remora_model, how can I confirm that I'm using the best possible model for Clair3?
In the initial nextflow call, I don't actually need --mod to generate the modified base calls in the BAM, right? Is that option only there to run modkit on the BAM file later?
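
For the last question, here is a rough sketch of how the BAM could be checked for modified-base information directly. It assumes the calls are stored in the standard MM/ML (or older Mm/Ml) SAM tags, and the function name count_mod_tags and the file name sample.bam are placeholders of mine, not anything the workflow provides:

```shell
# Count SAM records that carry both a modified-base tag (MM or Mm)
# and a modification-probability tag (ML or Ml).
# Reads SAM text on stdin and prints "<tagged> <total>".
count_mod_tags() {
    awk -F'\t' '
        /^@/ { next }                      # skip header lines
        {
            total++
            has_mm = 0; has_ml = 0
            # optional tags start at column 12 of a SAM record
            for (i = 12; i <= NF; i++) {
                if ($i ~ /^(MM|Mm):Z:/)  has_mm = 1
                if ($i ~ /^(ML|Ml):B:C/) has_ml = 1
            }
            if (has_mm && has_ml) tagged++
        }
        END { printf "%d %d\n", tagged + 0, total + 0 }
    '
}

# Usage (assumes samtools is on PATH; sample.bam is a placeholder path):
#   samtools view sample.bam | head -n 1000 | count_mod_tags
```

If the first number is non-zero for a BAM produced without --mod, that would suggest the tags come from the basecalling step and not from --mod itself.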

Thank you!

Fidi

epi2me-labs / wf-human-variation

Calling SV/SNP when using a bam file generated with a remora_cfg #152