freeseek / gtc2vcf

Tools to convert Illumina IDAT/BPM/EGT/GTC and Affymetrix CEL/CHP files to VCF
MIT License

bpm with an alternate reference csv_manifest simultaneously? #67

Closed rajwanir closed 3 months ago

rajwanir commented 3 months ago

Hello @freeseek ,

I wanted to check if I can use the bpm_manifest and a csv_manifest (with an alternate reference, prepared as suggested in the documentation) simultaneously without conflict?

Typically, I am fine with a csv_manifest only (with an alternate reference). However, for some downstream analysis I need the normalized intensities imported into the VCF, and for this I see that I must include the bpm_manifest. Since I am using a csv_manifest and a bpm_manifest with different references, do you see this as a conflict, or could it produce incorrect results? Or is it safe because, if both manifests are provided, the plugin primarily uses the csv_manifest for coordinates etc. and uses the bpm_manifest only to obtain normalized intensities?

Thanks

freeseek commented 3 months ago

When you use both bpm and csv files together, the bpm file is only used for computing normalized intensities. Note that this is the recommended approach: without the bpm file it is impossible to compute BAF and LRR values. The values computed by the BCFtools/gtc2vcf plugin are also superior to those computed by the Illumina/gencall tool, because the BAF values from BCFtools/gtc2vcf are not truncated between 0 and 1, which slightly improves downstream performance.
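
To illustrate why truncation can matter, here is a minimal Python sketch. It is not taken from gtc2vcf or gencall: the BAF values are made-up numbers and `truncate` is a hypothetical helper. It only shows that clamping to [0, 1] collapses distinct values onto the boundaries, so information about how far a probe lies outside the interval is lost.

```python
# Hypothetical, untruncated BAF values for a handful of probes
# (illustrative numbers only, not real array data).
raw_baf = [-0.03, 0.00, 0.02, 0.98, 1.01, 1.04]


def truncate(values, lo=0.0, hi=1.0):
    """Clamp each value into [lo, hi], as a truncating tool would."""
    return [min(max(v, lo), hi) for v in values]


truncated_baf = truncate(raw_baf)

# Count how many distinct values survive in each representation.
distinct_raw = len(set(raw_baf))
distinct_truncated = len(set(truncated_baf))

print(truncated_baf)
print(distinct_raw, distinct_truncated)
```

After clamping, values just below 0 become indistinguishable from 0 and values just above 1 become indistinguishable from 1, which is the loss that the untruncated BAF values avoid.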

Note also that you can input the bpm, the csv, and a sam/bam file at the same time; if you do so, coordinates will be computed from the sam/bam file. You do not need to create additional csv files to run the BCFtools/gtc2vcf plugin.
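
The combination described above can be sketched as a small Python helper that only assembles the command line and does not run it. This is a sketch under assumptions: the option names (`--bpm`, `--csv`, `--egt`, `--gtcs`, `--fasta-ref`, `--sam-flank`) are written as recalled from the gtc2vcf README and should be checked against `bcftools +gtc2vcf --help`; the function name and all file paths are hypothetical.

```python
import shlex


def build_gtc2vcf_command(bpm, csv, gtcs, fasta_ref, egt=None, sam=None):
    """Assemble (but do not run) a bcftools +gtc2vcf command line.

    Option names are assumptions based on the gtc2vcf README;
    verify them with `bcftools +gtc2vcf --help` before use.
    """
    cmd = [
        "bcftools", "+gtc2vcf",
        "--bpm", bpm,              # used for normalized intensities
        "--csv", csv,              # manifest with probe information
        "--gtcs", gtcs,            # folder with GTC files
        "--fasta-ref", fasta_ref,  # reference genome
    ]
    if egt is not None:
        cmd += ["--egt", egt]      # optional cluster file
    if sam is not None:
        # When supplied, coordinates come from the flank alignments.
        cmd += ["--sam-flank", sam]
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_gtc2vcf_command(
    bpm="array.bpm",
    csv="array.csv",
    gtcs="gtc_folder",
    fasta_ref="alt_reference.fa",
    sam="flank_alignments.bam",
)
print(shlex.join(cmd))
```

Building the argument list separately makes it easy to inspect exactly which inputs are being combined before passing the list to `subprocess.run`.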

rajwanir commented 3 months ago

Thank you so much.