legumeinfo / datastore-issues

mostly for issues pertaining to the content of the legumeinfo datastore; may also relate to characteristics of its user interface or managing the mirroring process to the legfed instance

Rename Wm82_ISU01 #181

Closed StevenCannon-USDA closed 9 months ago

StevenCannon-USDA commented 1 year ago

It seems that the assembly and annotation formerly known as "Wm82_ISU01.gnm2.ann1" (Data Store) or "ISU-01 v2.1" (Phytozome) will be renamed to gnm6 / v2. The leads on this project have decided that the original name was a bad choice.

Question for those of you who may have opinions about how to handle this in the Data Store: Should we mint new KEY4s for these collections? The main intended purpose of the KEY4 is to provide a way to track the data to the legumeinfo/soybase/peanutbase Data Store and the associated metadata.

My inclination is to leave the two KEY4s unchanged, but I'd like to hear if any of you think of unintended consequences. @adf-ncgr @sammyjava @maxglycine @jd-campbell

sammyjava commented 1 year ago

I don't use the KEY4 for anything. :)

adf-ncgr commented 1 year ago

I think leaving the KEY4 the same in this case is arguably the right thing to do (to indicate that it's the same dataset even though other aspects of the full yuck have changed shape). I'm guessing from what you say that the new full yuck is going to be glyma.Wm82.gnm6 and not glyma.Wm82_ISU01.gnm6? I foresee many regexp substitutes in our future...
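For illustration, one of those substitutions might look like the sketch below. It assumes the rename guessed at above (`glyma.Wm82_ISU01.gnm2` becoming `glyma.Wm82.gnm6`) and that only the collection-name prefix changes; the thread does not say whether the annotation suffix carries over. The names `OLD_PREFIX`, `NEW_PREFIX`, and `rename_prefix` are hypothetical and not part of any Data Store tooling.

```python
import re

# Hypothetical pattern: matches the old prefix only when it is followed by
# "." or the end of the string, so a longer name such as "...gnm21" is not
# altered by accident.
OLD_PREFIX = re.compile(r"\bglyma\.Wm82_ISU01\.gnm2(?=\.|$)")

# Assumed new prefix, taken from the guess in this comment.
NEW_PREFIX = "glyma.Wm82.gnm6"


def rename_prefix(name: str) -> str:
    """Swap the old collection prefix for the new one; leave other names as-is."""
    return OLD_PREFIX.sub(NEW_PREFIX, name)


if __name__ == "__main__":
    for name in ("glyma.Wm82_ISU01.gnm2.ann1", "glyma.Wm82.gnm4.ann1"):
        print(name, "->", rename_prefix(name))
```

Anchoring the pattern with a lookahead keeps the substitution from touching unrelated collections that merely share a leading substring.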

sammyjava commented 1 year ago

ACTUALLY that's how we can indicate that a collection has had a name change! The KEY4 is the same!

(License plate analogy: moving the plate to your new car, indicating that it's the same owner inside!)

maxglycine commented 1 year ago

I do not use the KEY4 at all. So do what is most expedient.

StevenCannon-USDA commented 9 months ago

This renamed collection is finally available at Phytozome: https://phytozome.jgi.doe.gov/info/Gmax_Wm82_a6_v1

I decided to go ahead with new keys, since the new gene IDs are fundamentally different from the previous ones (there is no simple transformation between the two). The genome key is S97D and the annotation key is PKSW.

I'll close this issue and start a new data-preparation issue for the "new" assembly+annotations.