epi2me-labs / wf-human-variation

specified genome fasta is not used for alignment #177

Closed flokraft85 closed 1 month ago

flokraft85 commented 1 month ago

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version

2.1.0

Workflow Execution

Command line (Local)

Other workflow execution

No response

EPI2ME Version

No response

CLI command run

nextflow run wf-human-variation --bam /in.bam --sample_name name --out_dir /out --ref /data/ref/GRCh38_no_alt_analysis_set_dup_masked.fa --snp --sv --cnv --str --mod --annotation true --phased --tr_bed /human_GRCh38_no_alt_analysis_set.trf.bed --GVCF -profile standard -w /work/ -c increase_memory.config --sex female

Workflow Execution - CLI Execution Profile

standard (default)

What happened?

wf-human-variation uses a different genome build for alignment than the one specified with the --ref option. I'm interested in methylation and CNV analysis of the H19 locus. However, the alignment files produced by the wf-human-variation workflow do not allow analysis of this region. Why? Because the alignment step uses an hg38/GRCh38 genome build in which falsely duplicated regions are not masked (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-02863-7). The H19 locus is falsely duplicated on the chr11_KI270721v1_random contig, which, if not masked, leaves many reads in this region with mapping quality 0.

The wf-human-variation result shows exactly this behavior. [image] However, when I mapped the same reads with minimap2 (minimap2 -ax lr:hq --MD /GRCh38_no_alt_analysis_set_dup_masked.fa in.fastq.gz) against the genome specified with --ref (a GRCh38 version with the false duplications masked), I got this result. [image]
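One way to put a number on the difference between the two alignments is to compute the fraction of mapped reads with mapping quality 0 in the region of interest. The following is a minimal sketch, assuming the records are supplied as SAM text lines (for example, the output of samtools view for a region); the function name mapq_zero_fraction is hypothetical and not part of the workflow.

```python
def mapq_zero_fraction(sam_lines):
    """Return the fraction of mapped SAM records whose MAPQ is 0.

    sam_lines: iterable of SAM text lines; header lines (starting with '@')
    and blank lines are skipped. Returns 0.0 if there are no mapped records.
    """
    total = 0
    zero = 0
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        if flag & 0x4:          # 0x4 = read is unmapped, so MAPQ is not meaningful
            continue
        total += 1
        if int(fields[4]) == 0:  # column 5: MAPQ
            zero += 1
    return zero / total if total else 0.0
```

Running this on the same region of both BAM files gives two fractions that can be compared directly; a much higher value for one file is consistent with reads being ambiguously placed between the locus and its duplicate.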

So why does wf-human-variation not use the specified reference genome FASTA file? And why is a version of hg38/GRCh38 without masked false duplications used instead? Thanks in advance for clarifying the issue.

Best, Florian

Relevant log output

n/a

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

flokraft85 commented 1 month ago

OK, I found the issue. wf-human-variation does not re-align the reads if an aligned BAM is used as input. So it is my fault. Sorry. The issue can be closed.
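To check up front whether an input BAM already carries alignments, and which command produced them, the header can be inspected. The sketch below assumes the header text has been obtained separately (for example with samtools view -H in.bam) and that the aligner recorded its command line in a @PG record, which is common but not guaranteed. The helper names and the example header are hypothetical.

```python
def parse_pg_records(header_text):
    """Return one dict of tag -> value for each @PG line in a SAM header."""
    records = []
    for line in header_text.splitlines():
        if not line.startswith("@PG"):
            continue
        fields = line.split("\t")[1:]
        # Each field looks like "ID:minimap2"; split on the first colon only,
        # because the CL (command line) value may itself contain colons.
        records.append(dict(f.split(":", 1) for f in fields if ":" in f))
    return records


def has_reference_sequences(header_text):
    """True if the header lists reference sequences (@SQ lines).

    Unaligned BAMs typically have no @SQ lines; treat this as a heuristic.
    """
    return any(line.startswith("@SQ") for line in header_text.splitlines())


# Made-up header for illustration only.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr11\tLN:135086622\n"
    "@PG\tID:minimap2\tPN:minimap2\tCL:minimap2 -ax lr:hq ref.fa in.fastq.gz\n"
)

for record in parse_pg_records(example_header):
    print(record.get("ID"), "->", record.get("CL"))
```

Note that comparing contig names and lengths alone may not distinguish a masked reference from an unmasked one, since masking usually replaces bases without changing sequence names or lengths; the recorded command line, when present, is the more direct clue.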