twolinin / longphase

GNU General Public License v3.0

[Question] Co-phase SNP from Illumina and SV from ONT #21

Closed tgong1 closed 1 year ago

tgong1 commented 2 years ago

Hi longphase team,

Thank you for this helpful software. I have recently started using longphase in our cohort.

We have SNPs/indels called from Illumina short-read NGS data, so I used an Illumina SNP VCF + ONT SV VCF + ONT BAM as input for longphase, and it ran well. However, I was wondering about the accuracy (or switch error rate) of the SNP and SV phasing.

I saw that you included a comprehensive benchmark on GIAB data in your paper. Have you tested the performance of longphase using SNPs from short-read NGS data? In your experience, would using SNPs called from ONT give better phasing performance?

Thank you, Tingting

ythuang0522 commented 2 years ago

No, we didn't test Illumina SNPs. It also depends on how you called your SNPs for Illumina and ONT. You may check the evaluations from the PrecisionFDA challenges. If Dragen was used for Illumina SNP calling, the Illumina accuracy should be better than ONT. But if they were not called by Dragen, I am not sure which one would be better. I suspect only marginal differences, so you may stay with the Illumina SNPs.

tgong1 commented 2 years ago

Thank you for the quick reply. I don't think our SNPs were called by Dragen. We are considering running a quick comparison/benchmark. Would you mind sharing the links to the GIAB phased benchmark SNPs and SVs for HG002, HG003 and HG004 that you mentioned in your paper?

Thank you, Tingting

agolicz commented 1 year ago

We're trying this out too, using SNPs from bcftools. So far it looks good.

twolinin commented 1 year ago

Hi @tgong1, this is the GIAB benchmark mentioned in the paper.

URLs of GIAB v4.2.1 benchmark for HG002/HG003/HG004
HG002 https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh37/HG002_GRCh37_1_22_v4.2.1_benchmark.vcf.gz
HG003 https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/AshkenazimTrio/HG003_NA24149_father/NISTv3.3.2/GRCh37/HG003_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X_CHROM1-22_v.3.3.2_highconf.vcf.gz
HG004 https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/AshkenazimTrio/HG004_NA24143_mother/NISTv3.3.2/GRCh37/HG004_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X_CHROM1-22_v.3.3.2_highconf.vcf.gz

URLs of CMRGv1.00 curated SVs
https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/AshkenazimTrio/HG002_NA24385_son/CMRG_v1.00/GRCh37/SupplementaryFiles/HG002v11-align2-GRCh37/HG002v11-align2-GRCh37.dip.vcf.gz
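
For the quick comparison discussed above, the switch error rate between a phased call set and a phased benchmark could be estimated roughly as sketched below. This is only an illustration, not the evaluation used in the paper. It assumes the genotypes have already been extracted from the VCFs for a single chromosome, that the benchmark is treated as one continuous haplotype, and that phase blocks in the calls are identified by the VCF PS value. The function name `switch_error_counts` and the input layout are hypothetical.

```python
def switch_error_counts(truth, calls):
    """Count switch errors between phased calls and a phased benchmark.

    truth: dict mapping position -> phased genotype string, e.g. "0|1"
    calls: list of (position, phased genotype string, phase set id)
    Returns (switches, comparisons) over consecutive shared heterozygous
    sites that fall within the same phase set of the calls.
    """
    switches = 0
    comparisons = 0
    prev_ps = None
    prev_same = None

    for pos, gt, ps in sorted(calls, key=lambda c: c[0]):
        t = truth.get(pos)
        # Only sites phased in both sets can be compared.
        if t is None or "|" not in t or "|" not in gt:
            continue
        t_alleles = t.split("|")
        c_alleles = gt.split("|")
        # Skip homozygous sites and sites where the genotypes disagree.
        if t_alleles[0] == t_alleles[1] or sorted(t_alleles) != sorted(c_alleles):
            continue

        # True if the call has the same haplotype orientation as the benchmark.
        same = t_alleles == c_alleles

        # Compare with the previous usable site only inside one phase block.
        if ps == prev_ps and prev_same is not None:
            comparisons += 1
            if same != prev_same:
                switches += 1

        prev_ps, prev_same = ps, same

    return switches, comparisons


# Toy data (made up for illustration, not real GIAB genotypes).
truth = {100: "0|1", 200: "0|1", 300: "1|0", 400: "0|1", 500: "1|1"}
calls = [
    (100, "0|1", 100),
    (200, "1|0", 100),
    (300, "0|1", 100),
    (400, "0|1", 400),
    (500, "1|1", 400),
]
switches, comparisons = switch_error_counts(truth, calls)
rate = switches / comparisons if comparisons else None
print(switches, comparisons, rate)
```

The orientation is compared only between neighbouring sites in the same phase block, because the haplotype order between separate blocks is arbitrary. Running the same function on the Illumina-SNP and ONT-SNP call sets would give a like-for-like comparison against the same benchmark.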

tgong1 commented 1 year ago

Many thanks for the help!