CNV and coordinates mapping issue?

Hello @freeseek

Is it a known issue that coordinates for CNV loci may be presented incorrectly or differently when using the SourceSeq mapping workflow?

Since I have a large set of chips and samples to analyze, I tried to estimate the accuracy of coordinate inference via SourceSeq mapping. I selected a couple of chips for which I have hg19-based manifests with both a RefStrand column and a SourceSeq column. So I could either generate the VCF in the standard fashion and lift it over to hg38 ("the liftover approach"), or use the SourceSeq column to update the manifests to hg38 so that the resulting VCF is based on hg38 directly ("the freeseek/gtc2vcf plugin approach"). I prefer updating the manifests to hg38, since it also works in the absence of a RefStrand column and is a somewhat more straightforward solution.

Here are the results:

I note that the majority of the inconsistencies between the liftover approach and the plugin approach were associated with CNV loci. Is this a known issue? Do you have any thoughts or suggestions for making the CNV coordinates more consistent?
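
In case it helps to reproduce the comparison, here is a minimal sketch of how the per-marker tally could be computed from the two hg38 VCFs. It is not part of gtc2vcf; the function names (`read_positions`, `is_cnv`, `compare`) are hypothetical, and it assumes that CNV probes can be recognized by a marker name starting with "cnv", which may not hold for every manifest.

```python
from collections import Counter


def read_positions(vcf_lines):
    """Map marker ID -> (chrom, pos) from an iterable of VCF text lines."""
    positions = {}
    for line in vcf_lines:
        # Skip header lines and blank lines
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, marker_id = line.split("\t")[:3]
        positions[marker_id] = (chrom, int(pos))
    return positions


def is_cnv(marker_id):
    # Assumption: CNV probes have names starting with "cnv";
    # adjust this rule to match the manifest being analyzed.
    return marker_id.lower().startswith("cnv")


def compare(liftover_lines, plugin_lines):
    """Tally concordant and discordant coordinates by marker type."""
    lifted = read_positions(liftover_lines)
    remapped = read_positions(plugin_lines)
    tally = Counter()
    # Markers present in both VCFs: compare chromosome and position
    for marker_id in lifted.keys() & remapped.keys():
        kind = "cnv" if is_cnv(marker_id) else "other"
        same = lifted[marker_id] == remapped[marker_id]
        tally[(kind, "same" if same else "different")] += 1
    # Markers that only one of the two approaches could place
    tally[("any", "liftover_only")] = len(lifted.keys() - remapped.keys())
    tally[("any", "plugin_only")] = len(remapped.keys() - lifted.keys())
    return tally
```

With uncompressed VCFs this could be called as `compare(open("liftover.vcf"), open("plugin.vcf"))`, where both file names are placeholders. Matching is done on the ID column, so it assumes both VCFs keep the original marker names.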

Thanks.
