Closed by sammyjava 1 year ago
Just with respect to the "TT_" pre-prefix: that is legacy from ~7-8 years ago, when we were trying to distinguish tetraploid, A-genome diploid, and B-genome diploid maps. I favor dropping that. I am also OK with the more extreme shortening of LG names to e.g. "LGS1" ... as long as we have a way of uniquely identifying the maps themselves. Unfortunately, it is not uncommon for multiple maps to be generated from the same parents -- hence the _a and _b suffixes.
Yup, the genetic map files follow the genotypes.map.author1_author2_year naming (with a, b suffixes for maps that share the same genotypes, authors, and year, typically from the same publication).
So, for example, the BAT93_x_JALOEEP558.map.Caldas_Blair_2009.lg.tsv file (for which I need to track down markers) will have linkage groups named:
#linkage_group length
B04 94.35
B06 74.79
B07 67.82
B08 104.23
B10 79.23
which differ from, say, the time-honored BAT93_x_JALOEEP558.map.Freyre_Skroch_1998.lg.tsv:
#linkage_group length
B01 107
B02 175
B03 132
B04 95
B05 72
B06 113
B07 109
B08 133
B09 105
B10 89
B11 100
So we always consider a linkage group along with the genetic map from which it came, which is in its filename.
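As a rough illustration of that pairing, here is a minimal Python sketch that derives the map name from an `.lg.tsv` filename and keys each linkage group on (map name, LG name). The function names are hypothetical, and the sketch assumes the two-column whitespace- or tab-separated layout shown in the excerpts above; it is not the datastore's actual loader.

```python
def map_name_from_filename(filename):
    """Recover the genetic map name by stripping the trailing '.lg.tsv'."""
    suffix = ".lg.tsv"
    if not filename.endswith(suffix):
        raise ValueError(f"not a linkage group file: {filename}")
    return filename[: -len(suffix)]


def read_linkage_groups(filename, text):
    """Return {(map_name, lg_name): length} for one linkage group file.

    `text` is the file content; lines starting with '#' are treated as headers.
    """
    map_name = map_name_from_filename(filename)
    lengths = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # split() with no argument handles both tabs and runs of spaces
        lg_name, length = line.split()[:2]
        lengths[(map_name, lg_name)] = float(length)
    return lengths


caldas = read_linkage_groups(
    "BAT93_x_JALOEEP558.map.Caldas_Blair_2009.lg.tsv",
    "#linkage_group\tlength\nB04\t94.35\nB06\t74.79\n",
)
freyre = read_linkage_groups(
    "BAT93_x_JALOEEP558.map.Freyre_Skroch_1998.lg.tsv",
    "#linkage_group\tlength\nB04\t95\nB06\t113\n",
)
# Both maps have a "B04", but the keys stay distinct once merged.
all_lgs = {**caldas, **freyre}
```

Merging the two dictionaries keeps four entries, because "B04" from Caldas_Blair_2009 and "B04" from Freyre_Skroch_1998 never collide.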
This has been implemented.
I think I am responsible for the rather yucky linkage group names like TT_SunOleic97R_x_NC94022_a-LGS1, which have turned out to be self-inconsistent and are also inconsistent with how we're naming maps these days. They came from exporting from chado years ago, when I wanted unique LG identifiers.
I have now done a mine model update which keys LGs on (identifier,geneticMap) so "LGS1" can be used for different LGs from different maps. I think this is good because there tends to be a general consensus on what "LGS1" or "B01" means amongst the genetics community (typically corresponding to chromosome 1, but not always, of course).
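To make the (identifier, geneticMap) keying concrete, here is a small Python sketch of the idea. It is only an analogy: the class and method names are hypothetical and this is not how the mine model itself is defined. The point it shows is that the same short identifier may appear in many maps, while a repeat within one map is rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkageGroupKey:
    """Composite key: the short LG name plus the map it belongs to."""
    identifier: str
    genetic_map: str


class LinkageGroupStore:
    """Toy in-memory store that enforces uniqueness on the composite key."""

    def __init__(self):
        self._lengths = {}

    def add(self, identifier, genetic_map, length):
        key = LinkageGroupKey(identifier, genetic_map)
        if key in self._lengths:
            raise KeyError(f"{identifier} already exists in {genetic_map}")
        self._lengths[key] = length

    def maps_with(self, identifier):
        """List every map that has a linkage group with this short name."""
        return sorted(
            k.genetic_map for k in self._lengths if k.identifier == identifier
        )


store = LinkageGroupStore()
store.add("B04", "BAT93_x_JALOEEP558.map.Caldas_Blair_2009", 94.35)
store.add("B04", "BAT93_x_JALOEEP558.map.Freyre_Skroch_1998", 95.0)
```

A lookup by short name alone, such as `store.maps_with("B04")`, then returns both maps, which matches how a geneticist would search for "B04" across studies.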
So, in reference to https://github.com/legumeinfo/datastore-issues/issues/119 I'd like to rename the LGs throughout the DS to the short names used in the publications, given that I'm keeping them distinct in the mines via their geneticMap reference. I think this will make the mines more genetic-user friendly.
Objections, @cann0010 @adf-ncgr @svengato ?
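For what the rename might look like mechanically, here is a hedged Python sketch that splits a legacy name such as TT_SunOleic97R_x_NC94022_a-LGS1 into a map part and a short LG name. It assumes the short name follows the last hyphen and that "TT_" is the only legacy pre-prefix; both are assumptions drawn from the single example in this thread, and the function name is hypothetical.

```python
def split_legacy_lg_name(legacy, pre_prefix="TT_"):
    """Split 'TT_<map part>-<short LG>' into (map part, short LG).

    Assumes the short LG name is everything after the LAST hyphen, so a
    short name that itself contains a hyphen would be split incorrectly.
    """
    map_part, sep, short_name = legacy.rpartition("-")
    if not sep or not map_part or not short_name:
        raise ValueError(f"unexpected legacy LG name: {legacy}")
    if map_part.startswith(pre_prefix):
        map_part = map_part[len(pre_prefix):]
    return map_part, short_name


map_part, short_name = split_legacy_lg_name("TT_SunOleic97R_x_NC94022_a-LGS1")
```

The map part would still need to be matched to the current map naming by hand or by lookup; the sketch only separates the pieces.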