Illumina / GTCtoVCF

Script to convert GTC/BPM files to VCF
Apache License 2.0

Illumina Generated Manifest has no RefStrand column #49

Open tbrunetti opened 4 years ago

tbrunetti commented 4 years ago

I have no problems running this with a BPM. However, an Illumina-provided manifest (.csv) is missing the RefStrand header. Can you tell me which header maps to RefStrand? I want to say it is likely SourceStrand, but I would like confirmation before changing a header name. If it is SourceStrand, should I update all headers that embed SourceStrand to use RefStrand instead? These are the headers that were provided:

IlmnID, Name, IlmnStrand, SNP, AddressA_ID, AlleleA_ProbeSeq, AddressB_ID, AlleleB_ProbeSeq, GenomeBuild, Chr, MapInfo, Ploidy, Species, Source, SourceVersion, SourceStrand, SourceSeq, TopGenomicSeq, BeadSetID
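For anyone checking their own manifest, here is a minimal sketch that reports whether the RefStrand column is present. It assumes the usual CSV manifest layout, where an `[Assay]` line is followed by the column-name row; the function names `assay_header` and `has_refstrand` and the sample data are hypothetical and are not part of GTCtoVCF.

```python
import csv
import io


def assay_header(manifest_text):
    """Return the column names from the [Assay] section of a CSV manifest.

    Assumption: the row right after the "[Assay]" marker holds the column names.
    """
    rows = csv.reader(io.StringIO(manifest_text))
    for row in rows:
        if row and row[0].strip() == "[Assay]":
            return [name.strip() for name in next(rows, [])]
    return []


def has_refstrand(manifest_text):
    """True if the [Assay] header row contains a RefStrand column."""
    return "RefStrand" in assay_header(manifest_text)


# Made-up manifest fragment, for illustration only.
example = """Illumina, Inc.
[Heading]
Descriptor File Name,example.bpm
[Assay]
IlmnID,Name,IlmnStrand,SNP,SourceStrand,SourceSeq
rs1-0_T_F_1,rs1,TOP,[A/G],TOP,ACGT[A/G]ACGT
[Controls]
"""

print(assay_header(example))
print(has_refstrand(example))
```

For a real file, read its text with `open(path, newline="").read()` and pass that to `has_refstrand`.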

jjzieve commented 4 years ago

Which CSV manifest are you using? It should have a RefStrand column if it was created relatively recently. It is not the same as the SourceStrand.

danilovkiri commented 3 years ago

@jjzieve

Hi, I have exactly the same problem. The manifest is old (dated 2014), though I have to use it now to process old genotyping data. Is there a way (algorithmically) to assign RefStrand values based on whatever sequences can be found in SourceSeq/TopGenomicSeq/etc?

jjzieve commented 3 years ago

@danilovkiri You could try comparing the SourceSeq to the ProbeSeq values if you want, but I wouldn't recommend it. The way it's done internally validates against the reference genome in case the SourceSeq isn't accurate (this can occur on custom designs). I would email techsupport@illumina.com and ask to get the RefStrand mapped on your manifest.
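For illustration only, the SourceSeq/ProbeSeq comparison could be sketched roughly as below. This is not the internal method and it does not produce RefStrand: it only reports whether a probe lies on the same strand as the SourceSeq or on the opposite one, with no validation against a reference genome. It assumes SourceSeq has the form `FLANK[A/B]FLANK`, and it drops the probe's last base before matching on the assumption that this base may sit on the variant itself. The names `reverse_complement` and `probe_orientation` are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_orientation(source_seq, probe_seq):
    """Return "same", "reverse", or None (no match or ambiguous).

    Orientation is relative to SourceSeq only, not to the reference genome.
    """
    match = re.fullmatch(r"([ACGTN]*)\[[^\]]+\]([ACGTN]*)",
                         source_seq.strip().upper())
    if match is None:
        return None
    left, right = match.group(1), match.group(2)
    # Drop the last probe base (assumption: it may overlap the variant).
    core = probe_seq.strip().upper()[:-1]
    if not core:
        return None
    same = core in left
    reverse = reverse_complement(core) in right
    if same == reverse:
        return None  # neither flank matched, or both did
    return "same" if same else "reverse"


# Made-up sequences, for illustration only.
source = "AACCGGTTAC[A/G]TTGACCATGG"
print(probe_orientation(source, "CGGTTAC"))
print(probe_orientation(source, "TGGTCAA"))
```

Turning this orientation into a RefStrand value would still require knowing how the SourceSeq itself aligns to the reference, which is why asking tech support for a remapped manifest is the safer route.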

danilovkiri commented 3 years ago

Thank you very much @jjzieve

jjzieve commented 3 years ago

@danilovkiri No problem! Let me know if that works out for you. If so, I'll close this issue.