Separate OMOP Vocabulary data model?

chop-dbhi / data-models

Collection of various biomedical data models in parseable formats.

https://data-models-service.research.chop.edu

Separate OMOP Vocabulary data model? #135

Open gracebrownecodes opened 8 years ago

gracebrownecodes commented 8 years ago

@bruth @murphyke @burrowse Do you think we should try to pull the vocabulary model out of the OMOP and PEDSnet models into its own directory? This would reduce duplication and also allow our automated tools to handle each differently, without requiring a significant change to the format (as implementing "tags" would).

One potential problem is that there are foreign key relations between the data and vocabulary tables, and I'm not sure how the downstream JSON service, DDL service, etc. would handle them. I could certainly give it a shot in a branch and see what happens when the tests are run.

bruth commented 8 years ago

If all that is required is to combine multiple models into one logical model, then a method would need to be added to the API that takes a list of models and merges them together. Common problems like name collisions could be solved by prefixing the colliding names with the model name.
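A minimal sketch of that merge idea, not the actual service API. The function name `merge_models` and the dict layout (`name`, `tables`) are hypothetical, chosen only for illustration. Table names that appear in more than one model get the owning model's name as a prefix; unique names are left alone.

```python
def merge_models(models):
    """Merge several models into one logical set of tables.

    Assumed (hypothetical) input shape for each model:
        {"name": "omop", "tables": {"person": ["person_id", ...]}}
    """
    # Count how many models define each table name.
    counts = {}
    for model in models:
        for table in model["tables"]:
            counts[table] = counts.get(table, 0) + 1

    merged = {}
    for model in models:
        for table, fields in model["tables"].items():
            # Prefix only the names that collide across models.
            key = f"{model['name']}_{table}" if counts[table] > 1 else table
            merged[key] = list(fields)
    return merged


omop = {"name": "omop", "tables": {"person": ["person_id"], "concept": ["concept_id"]}}
vocabulary = {"name": "vocabulary", "tables": {"concept": ["concept_id", "concept_name"]}}
merged = merge_models([omop, vocabulary])
```

This sketch does not guard against a prefixed name colliding with a table that already has that name; a real implementation would need to handle that case.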

If a model depends on another model, then that would need to be expressed in the files. Right now the references.csv file assumes the ref_table and ref_field exist on the same model. Adding two more columns, ref_model and ref_version, would allow a reference to point at a separate model. This would be an implicit dependency, and the referenced model would be included automatically.
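As a rough illustration of the two extra columns, here is a sketch that reads a references.csv-style file and collects the implicit dependencies. The sample rows, the column order, the version string, and the helper name `external_dependencies` are assumptions made for the example; only ref_table, ref_field, ref_model, and ref_version come from the discussion above. An empty ref_model or ref_version is treated as "same model", which keeps existing files valid.

```python
import csv
import io

# Hypothetical sample; real references.csv files may have other columns.
SAMPLE = """table,field,ref_table,ref_field,ref_model,ref_version
person,gender_concept_id,concept,concept_id,vocabulary,5.0.0
visit_occurrence,person_id,person,person_id,,
"""


def external_dependencies(csv_text, model, version):
    """Return the set of (model, version) pairs referenced outside this model."""
    deps = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Blank ref_model / ref_version fall back to the current model.
        ref_model = row.get("ref_model") or model
        ref_version = row.get("ref_version") or version
        if (ref_model, ref_version) != (model, version):
            deps.add((ref_model, ref_version))
    return deps


deps = external_dependencies(SAMPLE, "omop", "5.0.0")
```

A downstream service could use the resulting set to decide which other models to load before generating DDL or JSON for the requested one.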

gracebrownecodes commented 8 years ago

I was imagining the second solution. The goal is to separate related but orthogonal pieces of a model and let the downstream packages handle the dependency. Here's the JSON Table Schema spec on the matter: http://dataprotocols.org/json-table-schema/index.html#foreign-keys