translate csv data to map to current model and its relevant attributes

brain-bican / models

BICAN data models

https://brain-bican.github.io/models/


translate csv data to map to current model and its relevant attributes #3

Closed sooyounga closed 1 year ago

sooyounga commented 1 year ago

based on the small data set provided on 20230412, the columns map to these attributes in our model:

genome_annotation_label --> synonym
gene_identifier_prefix --> id prefix (stylized)
gene_local_unique_identifier --> id suffix
symbol --> symbol
name --> name
gene_biotype --> NOT mapped

translator.py is a first draft of how we can take such data in csv format and convert it to yaml so that it can be loaded into our model class
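For readers following along, the column mapping above could be sketched roughly like this. This is not the actual translator.py. The function names, the `prefix:suffix` form of the id, the flat yaml layout, and the sample row are all assumptions made for illustration, and yaml is written by hand here only to keep the sketch stdlib-only.

```python
import csv
import io
import json


def row_to_gene(row):
    """Map one csv row to model attributes, following the mapping above.

    gene_biotype is intentionally skipped because it is not mapped yet.
    """
    prefix = row["gene_identifier_prefix"].strip()
    suffix = row["gene_local_unique_identifier"].strip()
    return {
        # Assumption: id is built as "prefix:suffix"; the real styling may differ.
        "id": f"{prefix}:{suffix}",
        "symbol": row["symbol"],
        "name": row["name"],
        "synonym": row["genome_annotation_label"],
    }


def genes_to_yaml(genes):
    """Write flat records as a yaml list.

    json.dumps is used to quote values, since a JSON string is also a
    valid double-quoted yaml scalar.
    """
    lines = []
    for gene in genes:
        for i, (key, value) in enumerate(gene.items()):
            lead = "- " if i == 0 else "  "
            lines.append(f"{lead}{key}: {json.dumps(value)}")
    return "\n".join(lines) + "\n"


def translate(csv_text):
    """Read csv text and return the yaml text for all rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return genes_to_yaml([row_to_gene(row) for row in reader])


# Made-up sample row, not taken from the 20230412 data set.
SAMPLE = (
    "genome_annotation_label,gene_identifier_prefix,"
    "gene_local_unique_identifier,symbol,name,gene_biotype\n"
    "EXAMPLE-annotation-1,EXAMPLE,0001,GENE1,example gene 1,protein_coding\n"
)

if __name__ == "__main__":
    print(translate(SAMPLE))
```

A real version would more likely hand the dicts to a yaml library or directly to the model class; the hand-written emitter only covers flat string fields.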

satra commented 1 year ago

let's add a slot in our model for biotype called molecular type, and let's make the range of that slot an enum with two values: protein-coding and noncoding.
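A rough Python sketch of what that two-value enum might look like on the translator side. The class and function names are hypothetical, and the rule for collapsing gene_biotype values (exactly `protein_coding` maps to protein-coding, everything else to noncoding) is an assumption, not something settled in this thread.

```python
from enum import Enum


class MolecularType(Enum):
    """The two values proposed for the molecular type slot."""

    PROTEIN_CODING = "protein-coding"
    NONCODING = "noncoding"


def molecular_type_from_biotype(biotype):
    """Collapse a gene_biotype string into one of the two enum values.

    Assumption: the csv spells the coding biotype as "protein_coding"
    (or "protein-coding") and every other biotype counts as noncoding.
    """
    normalized = biotype.strip().lower().replace("-", "_")
    if normalized == "protein_coding":
        return MolecularType.PROTEIN_CODING
    return MolecularType.NONCODING
```

If some biotypes should not be treated as noncoding, the fallback branch is the place to raise an error or return nothing instead.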

also once the other csv mapping is done, let's use that to add values to the in_taxon and referenced in slots.

for the other csvs, we want to collate and map that information into GenomeAnnotation, whose id would then be used as the value of referenced in.

satra commented 1 year ago

is this rebased on current master? seems like a lot of unrelated files have been updated.