dcppc / crosscut-metadata


Example doesn't zero-pad UBERON IDs #5

Closed cmungall closed 6 years ago

cmungall commented 6 years ago

https://github.com/dcppc/crosscut-metadata/blob/master/dats-json-examples/GTEx-Material-pattern-stub.json#L63

CURIE should be zero-padded to 7 digits

s/UBERON:1234/UBERON:0001234/
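As a rough illustration of the padding rule, here is a minimal Python sketch that rewrites any UBERON CURIE in a string so its numeric part has 7 digits. The function name `pad_uberon_curies` and the `width` parameter are hypothetical, not part of this repository, and the sketch assumes UBERON local IDs are purely numeric.

```python
import re

# Assumption: a UBERON CURIE is "UBERON:" followed only by digits.
_UBERON_CURIE = re.compile(r"\bUBERON:(\d+)\b")


def pad_uberon_curies(text: str, width: int = 7) -> str:
    """Zero-pad the numeric part of every UBERON CURIE found in text.

    IDs that already have `width` or more digits are left unchanged.
    """
    return _UBERON_CURIE.sub(
        lambda m: "UBERON:" + m.group(1).zfill(width), text
    )


if __name__ == "__main__":
    print(pad_uberon_curies('"identifier": "UBERON:1234"'))
```

Already-padded IDs such as `UBERON:0000178` pass through untouched, so the function can be run repeatedly over the same file without changing it further.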

jonathancrabtree commented 6 years ago

FWIW, I noticed this and corrected it in the actual transformation/instance (i.e., the DATS JSON files that appear in https://github.com/dcppc/crosscut-metadata/tree/master/dats-json):

dats-json jcrabtree$ gzcat TOPMed_phs000946_wgs_public.json.gz gtex_v7_rnaseq_public.json.gz | perl -ne 'if (/(UBERON:\d+)/) { print "$1\n"; }' | sort | uniq
UBERON:0000007
UBERON:0000178
UBERON:0000458
UBERON:0000473
UBERON:0000945
UBERON:0000992
UBERON:0000995
UBERON:0000996
UBERON:0001114
UBERON:0001150
UBERON:0001157
UBERON:0001159
UBERON:0001211
UBERON:0001225
UBERON:0001255
UBERON:0001323
UBERON:0001496
UBERON:0001621
UBERON:0001870
UBERON:0001873
UBERON:0001874
UBERON:0001876
UBERON:0001882
UBERON:0001898
UBERON:0001954
UBERON:0002037
UBERON:0002038
UBERON:0002046
UBERON:0002106
UBERON:0002190
UBERON:0002367
UBERON:0002369
UBERON:0003889
UBERON:0004264
UBERON:0004550
UBERON:0004648
UBERON:0006330
UBERON:0006469
UBERON:0006566
UBERON:0006631
UBERON:0006920
UBERON:0007610
UBERON:0008367
UBERON:0008952
UBERON:0009834
UBERON:0009835
UBERON:0010414
UBERON:0011907
UBERON:0012249
UBERON:0013756
UBERON:0036149
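The same check can be automated so that unpadded IDs are flagged rather than spotted by eye. The following is a sketch only: `audit_uberon_ids` and `audit_gz_files` are hypothetical helper names, and it assumes the files are gzip-compressed UTF-8 text. Unlike the perl one-liner, which prints only the first match on each line, it collects every match.

```python
import gzip
import re

UBERON_ANY = re.compile(r"UBERON:\d+")
UBERON_PADDED = re.compile(r"UBERON:\d{7}")


def audit_uberon_ids(text):
    """Return (sorted unique UBERON IDs, those not exactly 7 digits)."""
    ids = sorted(set(UBERON_ANY.findall(text)))
    bad = [i for i in ids if not UBERON_PADDED.fullmatch(i)]
    return ids, bad


def audit_gz_files(paths):
    """Run the audit over the combined contents of gzip-compressed files."""
    chunks = []
    for path in paths:
        # "rt" decompresses and decodes to str in one step.
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            chunks.append(handle.read())
    return audit_uberon_ids("\n".join(chunks))
```

Calling `audit_gz_files` with the two file names from the command above would return the unique IDs plus a list of any that are not zero-padded; an empty second list means every ID has 7 digits.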

The transform is based on the dats-json-example files but may differ slightly from them in cases like this, where I noticed something that didn't look right or was inconsistent with other examples. I'll leave the issue open in case @proccaserra or @agbeltran wants to update the example files. I'm not sure whether they should be considered merely an early work product or an evolving, up-to-date (partial) specification of the metadata model itself (versus the instance).