timknut / geno_imputation

Documentation and code base for the Geno/Roslin imputation project

Nordic_54k_2012_ed1_markerlist #7

Closed Unoqualsiasi closed 7 years ago

Unoqualsiasi commented 7 years ago

The marker names in this file have some problems: someone edited them at some point.

Example : marker original name: ARS-USMARC-Parent-AY761135-RS29003723

Nordic_54k_2012_ed1 name: ARS-USMARC-PARENT-AY761135-RS29003723

Because of this, only 34732 of the 52259 markers are compatible with the original names. I will try to fix it.

UPDATE: I was able to recover the information.

The changes I made in Nordic_54k_2012_ed1 are:

Parent instead of PARENT
-no instead of -NO
-rs instead of -RS
_Contig instead of _CONTIG
Hapmap instead of HAPMAP
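
For anyone who needs to repeat the fix, the substitutions listed above can be sketched in a few lines of Python. This is only an illustration, not the script actually used here: the function name `restore_case` and the sample marker names are made up, and plain substring replacement could over-match if a marker name happens to contain one of these strings for another reason.

```python
# Case fixes listed in the comment above, as (edited, restored) pairs.
CASE_FIXES = [
    ("PARENT", "Parent"),
    ("-NO", "-no"),
    ("-RS", "-rs"),
    ("_CONTIG", "_Contig"),
    ("HAPMAP", "Hapmap"),
]


def restore_case(name: str) -> str:
    """Apply every listed substring fix to one marker name."""
    for edited, restored in CASE_FIXES:
        name = name.replace(edited, restored)
    return name


# Hypothetical marker names, for illustration only.
for edited_name in ["BTA-12345-NO-RS", "HAPMAP12345-BTA-67890"]:
    print(edited_name, "->", restore_case(edited_name))
```

Names that already use the original spelling pass through unchanged, so the function can be run over the whole list.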

You can close the issue 🎱

Paolo

timknut commented 7 years ago

That is stupid... Good catch! Maybe I can make the pattern matching case-insensitive. Will keep this open till I check. @haraldgrove
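
One way case-insensitive matching could look, as a rough sketch rather than the project's actual code: index the reference names by their case-folded form and look up each incoming name the same way. The names `build_index` and `match` are hypothetical, and the collision check is an added assumption (two reference names differing only in case would make the lookup ambiguous).

```python
def build_index(reference_names):
    """Map case-folded name -> reference spelling; fail on case-only collisions."""
    index = {}
    for ref in reference_names:
        key = ref.casefold()
        if key in index and index[key] != ref:
            raise ValueError(f"{index[key]!r} and {ref!r} differ only in case")
        index[key] = ref
    return index


def match(name, index):
    """Return the reference spelling for name, or None if nothing matches."""
    return index.get(name.casefold())


# The example pair from this issue: reference spelling vs. edited spelling.
index = build_index(["ARS-USMARC-Parent-AY761135-RS29003723"])
print(match("ARS-USMARC-PARENT-AY761135-RS29003723", index))
```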

haraldgrove commented 7 years ago

For what it's worth, I would be careful about adding automatic conversion to any translation script. Either make it part of an initial QC, or at least give the user some feedback that the format is not 100% identical (even if capitalization is a fairly minor issue). The main point is that any difference might indicate that someone has edited the file at some point, and there might be more changes.
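
A QC step along these lines might classify every marker name before any translation is attempted and report how many differ from the reference only by case. The sketch below is an assumption about how that could be done, not an existing part of the pipeline; `qc_marker_names` and the sample names are hypothetical.

```python
from collections import Counter


def qc_marker_names(names, reference_names):
    """Classify each name as 'exact', 'case_only', or 'missing'."""
    exact = set(reference_names)
    folded = {ref.casefold() for ref in reference_names}
    status = {}
    for name in names:
        if name in exact:
            status[name] = "exact"
        elif name.casefold() in folded:
            status[name] = "case_only"
        else:
            status[name] = "missing"
    return status


# Hypothetical inputs, for illustration only.
reference = ["Hapmap1-BTA-2", "BTA-3-no-rs"]
status = qc_marker_names(["Hapmap1-BTA-2", "BTA-3-NO-RS", "BTA-9"], reference)
counts = Counter(status.values())
print(dict(counts))
if counts["case_only"] or counts["missing"]:
    print("WARNING: marker names are not identical to the reference; "
          "the file may have been edited.")
```

Reporting the counts instead of silently converting keeps the user aware that the file deviates from the reference, which is the concern raised above.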

argju commented 7 years ago

We'll take our chances on this ;-)