virusseq / metadata-schemas

Repo to host Virus-seq metadata schemas defined using JSON Schema
GNU Affero General Public License v3.0

Change metadata schema for sample collection date #17

Open nithujohn opened 3 years ago

nithujohn commented 3 years ago

Is it possible to change the sample collection date format in the metadata schema for samples that do not have a day, but only a month and year?

scottcain commented 2 years ago

Here's a little more info on this request: some of the older data only has a month and year for their collection date. While these data are loaded into the data store with a fake day of the month (i.e., the 1st of the month), this should be fixed since it conflicts with GISAID (which allows just month and year), in addition to the fact that the fake date is just, you know, wrong. So, what needs to happen to allow this? Presumably, we'd have to make a change to the schema and then, with a list of affected sequences, delete the fake first-of-the-month day entries. Anything I'm missing?
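
To make the schema change concrete, here is a minimal sketch of a check that accepts either a month-precision or a day-precision date string. The pattern and the function name `is_valid_collection_date` are assumptions for illustration; the constraint currently in the schema is not shown in this thread.

```python
import re

# Hypothetical pattern, not taken from the actual schema:
# accepts "YYYY-MM" or "YYYY-MM-DD".
# It only checks the shape of the string; it does not verify that the
# day exists in the given month (e.g. "2021-02-31" would still pass).
COLLECTION_DATE_RE = re.compile(
    r"[0-9]{4}-(0[1-9]|1[0-2])(-(0[1-9]|[12][0-9]|3[01]))?"
)


def is_valid_collection_date(value: str) -> bool:
    """Return True if value looks like a year-month or year-month-day date."""
    return COLLECTION_DATE_RE.fullmatch(value) is not None
```

The same regular expression could in principle be used as a JSON Schema `pattern` (anchored with `^` and `$`), but whether that fits the existing schema is something the maintainers would need to confirm.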

b-f-chan commented 2 years ago

This is a breaking change. We will need to analyze and implement it as part of the new work packages; it is not considered a bug.

scottcain commented 2 years ago

After an email exchange with Fiona, here is my suggested approach, which should not be too onerous. Every sequence with a missing day in its collection date will have a day of 1 (first of the month) added, along with sample_collection_date_precision (which is already part of the metadata schema as an enum) set to month. Before these old data with missing days are added, the other entries in the database would have to be updated so that sample_collection_date_precision is set to day. That means a small amount of work for OICR (updating the sample_collection_date_precision entries for existing data) and some work for curators (adding the fake day and the sample_collection_date_precision entries, as well as adding those entries for data going forward).
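
The approach described in this comment could be sketched roughly as follows. This is only an illustration: the function name `normalize_collection_date` and the field name `sample_collection_date` are assumptions, while `sample_collection_date_precision` and its values `month` and `day` are the ones mentioned above.

```python
from datetime import date


def normalize_collection_date(raw: str) -> dict:
    """Pad a month-only date to the 1st and record the true precision.

    Accepts "YYYY-MM" or "YYYY-MM-DD". The output field name
    "sample_collection_date" is assumed, not confirmed by the schema.
    """
    parts = raw.split("-")
    if len(parts) == 2:
        # Month precision: add the placeholder day of 1.
        year, month = int(parts[0]), int(parts[1])
        day, precision = 1, "month"
    elif len(parts) == 3:
        # Full date: keep the day and mark it as day precision.
        year, month, day = (int(p) for p in parts)
        precision = "day"
    else:
        raise ValueError(f"expected YYYY-MM or YYYY-MM-DD, got {raw!r}")

    # date() raises ValueError for impossible dates such as month 13.
    normalized = date(year, month, day)
    return {
        "sample_collection_date": normalized.isoformat(),
        "sample_collection_date_precision": precision,
    }
```

With this shape, a consumer that needs GISAID-style output can drop the placeholder day whenever the precision is month, so the fake day never has to be interpreted as real.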

b-f-chan commented 2 years ago

@sifavahora - Any update on the priority and solution for this ticket? Not urgent, but just helping plan Bioinfo work and this ticket mysteriously showed up on the board. Thanks!

ghost commented 2 years ago

No updates so far. I couldn't locate this on my priority list, but I will check and get back to you soon.