Closed by sammyjava 1 month ago
Yes, I will update the specs. These are just maps from the original ids to the LIS ids. They may often be trivial (just adding the full yuck prefixing), but I figured we might decide that having Lcu.2RBY.1g000010 -> lencu.CDC_Redberry.gnm2.ann1.1g000010 would be better than Lcu.2RBY.1g000010 -> lencu.CDC_Redberry.gnm2.ann1.Lcu.2RBY.1g000010, although I did not actually do so in this case (having meant to RFO it and then getting distracted). More and more groups are starting to add their own flavors of yuck, and the id_maps can be updated and used to re-transform the files programmatically if we decide to start making such aesthetic decisions.
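To make the two mapping styles concrete, here is a rough Python sketch. It is not the code used to build the existing maps: `make_lis_id` and `remap_column` are hypothetical helpers, and the assumption that ids sit in a single tab-delimited column (as in GFF column 1) is only for illustration.

```python
def make_lis_id(original_id, lis_prefix, strip_prefix=""):
    """Build an LIS-style id by prepending lis_prefix to the original id.

    If strip_prefix is given and the original id starts with it, that part
    is dropped first (the shorter, "aesthetic" form). Hypothetical helper.
    """
    local = original_id
    if strip_prefix and original_id.startswith(strip_prefix):
        local = original_id[len(strip_prefix):]
    return f"{lis_prefix}.{local}"


def remap_column(lines, id_map, column=0):
    """Replace ids in one tab-delimited column using id_map.

    Comment lines and ids missing from the map are passed through unchanged.
    The single-column layout is an assumption for this sketch.
    """
    out = []
    for line in lines:
        if line.startswith("#"):
            out.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if column < len(fields):
            fields[column] = id_map.get(fields[column], fields[column])
        out.append("\t".join(fields))
    return out


prefix = "lencu.CDC_Redberry.gnm2.ann1"
trivial = make_lis_id("Lcu.2RBY.1g000010", prefix)
shorter = make_lis_id("Lcu.2RBY.1g000010", prefix, strip_prefix="Lcu.2RBY.")
```

Here `trivial` keeps the original id whole after the prefix, while `shorter` drops the `Lcu.2RBY.` part. Either way, regenerating the map and re-running something like `remap_column` is all a later change of convention would require.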
I think I support dots in feature names, but we'll find out with this build. I just looked at the method that extracts the secondaryIdentifier from a primaryIdentifier, and it does handle them.
I've added specifications for featid_map.tsv and seqid_map.tsv, in
Genus/species/annotations
and
Genus/species/genomes
Currently, we have examples here:
Vigna/radiata/annotations/VC1973A.gnm7.ann1.RWBG/vigra.VC1973A.gnm7.ann1.RWBG.featid_map.tsv.gz
Vigna/radiata/genomes/VC1973A.gnm7.SB53/vigra.VC1973A.gnm7.SB53.seqid_map.tsv.gz
Glycine/max/annotations/Wm82_ISU01.gnm2.ann1.FGFB/glyma.Wm82_ISU01.gnm2.ann1.FGFB.featid_map.tsv.gz
Glycine/max/genomes/Wm82_ISU01.gnm2.JFPQ/glyma.Wm82_ISU01.gnm2.JFPQ.seqid_map.tsv.gz
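For anyone who wants to consume these files, a minimal Python sketch of a loader follows. It assumes a two-column, tab-delimited layout (original id, then LIS id) and that lines starting with `#` are comments. Both are assumptions here, so check the specification for the actual layout. `load_id_map` is a hypothetical name, not an existing datastore tool.

```python
import csv
import gzip


def load_id_map(path):
    """Load a gzipped id map into a dict of original id -> LIS id.

    Assumes two tab-delimited columns; extra columns are ignored and
    short or comment lines are skipped (assumed layout, see the spec).
    """
    id_map = {}
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2 or row[0].startswith("#"):
                continue
            original_id, lis_id = row[0], row[1]
            # An original id mapped to two different LIS ids would make
            # re-transforming files ambiguous, so fail loudly.
            if original_id in id_map and id_map[original_id] != lis_id:
                raise ValueError(f"conflicting mappings for {original_id}")
            id_map[original_id] = lis_id
    return id_map
```

It could be called as `load_id_map("vigra.VC1973A.gnm7.ann1.RWBG.featid_map.tsv.gz")` on a local copy of one of the example files above.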
Whenever a new file type appears in the DS I'll post up an issue, since it likely is not in the specification. This has happened with .id_map files:
Do these need to be in the datastore?