Closed ggstatgen closed 5 years ago
I have no GRCh37 version of the SV calls. It would not surprise me if many of the SV hotspots were changed between GRCh37 and GRCh38, and so liftovers probably won't work there.
I assume you are losing more deletions in a liftover than you are insertions because insertions fall on one point, and deletions span many bases. You might be able to break deletions into smaller regions and lift those, but the results may not be fully correct if the reference scaffolds changed significantly.
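As a rough illustration of the "break deletions into smaller regions" idea, here is a minimal Python sketch that splits one BED interval into fixed-size pieces before lifting. The function name `split_interval`, the `_partN` naming scheme, and the 1000-base chunk size are all assumptions made for this example, not part of SMRT-SV or any liftover tool. After lifting, the pieces would still need to be checked (same chromosome, same order, roughly contiguous) before being merged back into a single call.

```python
from typing import Iterator, Tuple

# One BED record: chrom, start, end, name (0-based, half-open coordinates).
Interval = Tuple[str, int, int, str]


def split_interval(chrom: str, start: int, end: int, name: str,
                   chunk_size: int = 1000) -> Iterator[Interval]:
    """Yield consecutive sub-intervals of at most chunk_size bases.

    chunk_size is an arbitrary illustrative default, not a recommended value.
    Intervals no longer than chunk_size are yielded unchanged.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    # Short (or point-like) records need no splitting.
    if end - start <= chunk_size:
        yield (chrom, start, end, name)
        return

    pos = start
    index = 0
    while pos < end:
        chunk_end = min(pos + chunk_size, end)
        # Tag each piece so it can be traced back to its parent call.
        yield (chrom, pos, chunk_end, f"{name}_part{index}")
        pos = chunk_end
        index += 1


if __name__ == "__main__":
    # Hypothetical 2.5 kb deletion, printed as tab-separated BED lines.
    for record in split_interval("chr1", 100, 2600, "DEL1"):
        print("\t".join(str(field) for field in record))
```

The tagged names make it possible to see which pieces of a deletion were lost or moved in the lift, which is where the "may not be fully correct" caveat above applies.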
The VCF was generated from the BED after merging and analysis. I don't know if that helps.
You could try lifting the BED with UCSC liftOver and see if that recovers more of them. Other than that, I don't have much good advice for this.
Hi Peter
Bit of a long shot (and not a bug in SMRTsv2, so by all means feel free to move this where pertinent), but I was wondering if you had a GRCh37/hg19 version of the callset from your recent paper available (99k SMRTsv/2 calls from the 15 samples).
It appears the GATK liftover has significant problems with the remapping, and we're losing too many calls to consider that option feasible.
I've also been thinking of lifting over your BED file here:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/hgsv_sv_discovery/working/20181025_EEE_SV-Pop_1/VariantCalls_EEE_SV-Pop_1/EEE_SV-Pop_1.ALL.sites.20181204.bed.gz
But I don't know if your VCF calls were converted to minimal form before the VCF-to-BED conversion.
Could you tell me if this was the case?
Best wishes,