Illumina / BeadArrayFiles

Python library to parse file formats related to Illumina bead arrays

Small difference observed in the normalized X and Y signals between GenomeStudio vs Illumina:BeadArrayFiles #14

Closed Gerbenvandervries closed 5 years ago

Gerbenvandervries commented 5 years ago

I have a question about a small difference I observe in the normalized X and Y signals between final reports generated with GenomeStudio and those generated with the Illumina AutoConvert tool plus the Python package IlluminaBeadArrayFiles (GenotypeCalls). I analyzed 96 samples on a GSA array (GSAMD-24v1-0_20011747_A5). For 99.9% of the SNPs, the X and Y signals are exactly the same. For about 100 SNPs, however, the normalized X and Y signals differ even though the X_RAW and Y_RAW numbers are the same.

Genome Studio results:

| SNP | X_RAW | Y_RAW | X | Y |
| -- | -- | -- | -- | -- |
| GSA-21:9871186 | 7248 | 2236 | 1.173 | 0.326 |
| GSA-21:9952707 | 6435 | 2797 | 1.036 | 0.421 |
| GSA-rs1006435 | 5119 | 2310 | 0.813 | 0.338 |

Illumina:BeadArrayFiles tool results for the same snps:

| SNP | X_RAW | Y_RAW | X | Y |
| -- | -- | -- | -- | -- |
| GSA-21:9871186 | 7248 | 2236 | 1.385 | 0.433 |
| GSA-21:9952707 | 6435 | 2797 | 1.230 | 0.542 |
| GSA-rs1006435 | 5119 | 2310 | 0.978 | 0.448 |

As you can see, the X_RAW and Y_RAW numbers are the same, but the normalized X and Y differ. Do you have an explanation for why we see this, and only for a small number of SNPs?
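For anyone wanting to reproduce this kind of check, here is a minimal sketch of how the affected SNPs could be picked out once both reports are loaded into memory. It uses plain Python only and does not call the BeadArrayFiles API. The `Signal` type, the `find_normalization_mismatches` function, and the `tol` threshold are hypothetical names introduced for illustration. The values are the three SNPs quoted above.

```python
from typing import Dict, List, NamedTuple


class Signal(NamedTuple):
    """One report row: raw and normalized intensities for a SNP (hypothetical type)."""
    x_raw: int
    y_raw: int
    x: float
    y: float


def find_normalization_mismatches(
    report_a: Dict[str, Signal],
    report_b: Dict[str, Signal],
    tol: float = 1e-3,  # assumed tolerance; reports round to three decimals
) -> List[str]:
    """Return SNP names whose raw values agree but whose normalized values differ."""
    mismatches = []
    # Only SNPs present in both reports can be compared.
    for snp in sorted(report_a.keys() & report_b.keys()):
        a, b = report_a[snp], report_b[snp]
        raw_same = (a.x_raw, a.y_raw) == (b.x_raw, b.y_raw)
        norm_differs = abs(a.x - b.x) > tol or abs(a.y - b.y) > tol
        if raw_same and norm_differs:
            mismatches.append(snp)
    return mismatches


# Values copied from the two tables in this issue.
genomestudio = {
    "GSA-21:9871186": Signal(7248, 2236, 1.173, 0.326),
    "GSA-21:9952707": Signal(6435, 2797, 1.036, 0.421),
    "GSA-rs1006435": Signal(5119, 2310, 0.813, 0.338),
}
beadarrayfiles = {
    "GSA-21:9871186": Signal(7248, 2236, 1.385, 0.433),
    "GSA-21:9952707": Signal(6435, 2797, 1.230, 0.542),
    "GSA-rs1006435": Signal(5119, 2310, 0.978, 0.448),
}

flagged = find_normalization_mismatches(genomestudio, beadarrayfiles)
print(flagged)
```

Requiring the raw values to match before flagging a SNP separates normalization differences from cases where the two reports were built from different intensity data.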

KelleyRyanM commented 5 years ago

@Gerbenvandervries, the expectation is that these should be identical. Can you confirm which versions of GenomeStudio and AutoConvert were used?

Gerbenvandervries commented 5 years ago

BeadArrayFiles/1.3.1 and GenomeStudio 2011.1

KelleyRyanM commented 5 years ago

Do you know which version of AutoConvert or AutoCall was used to create the GTC files?

Gerbenvandervries commented 5 years ago

version 2.0.1.179

KelleyRyanM commented 5 years ago

@Gerbenvandervries, by default, AutoConvert/AutoCall 2.0.1.179 uses an updated GenTrain algorithm (v3) that will not produce results identical to those produced by GenomeStudio 2011.1. Are you able to use the updated version of GenomeStudio (2.0) in this scenario?