Open kevinschaper opened 1 year ago
Here is an example of the output that monarch-py currently produces. What's particularly unfortunate is that we end up wrapping an empty string in a list (see `knowledge_source` and `qualifiers`). The nulls below come from columns that are defined in the data model but don't exist in the database, which seems like a different (but also important) problem.
{
  "aggregator_knowledge_source": [
    "infores:monarchinitiative"
  ],
  "id": "uuid:6c5acfe3-9a46-11ed-bf1e-791522c88a3d",
  "subject": "MONDO:0012933",
  "original_subject": "OMIM:612555",
  "subject_namespace": null,
  "subject_category": [],
  "subject_closure": [],
  "subject_label": null,
  "subject_closure_label": [],
  "predicate": "biolink:has_phenotype",
  "object": "HP:0100615",
  "original_object": null,
  "object_namespace": null,
  "object_category": [],
  "object_closure": [],
  "object_label": null,
  "object_closure_label": [],
  "knowledge_source": [
    ""
  ],
  "primary_knowledge_source": [
    "infores:hpoa"
  ],
  "category": [
    "biolink:DiseaseToPhenotypicFeatureAssociation"
  ],
  "negated": null,
  "provided_by": "hpoa_disease_phenotype_edges",
  "publications": [
    "OMIM:612555"
  ],
  "qualifiers": [
    ""
  ],
  "frequency_qualifier": null,
  "has_evidence": "ECO:0000501",
  "onset_qualifier": null,
  "sex_qualifier": null,
  "source": null,
  "stage_qualifier": null,
  "pathway": null,
  "relation": null
}
For consistency with Solr, to match monarch-py output across different sources, and as general good database practice, we need to replace our empty strings with null.
This needs to happen for every column in every table. Ideally, there should be a way to do it in bulk.
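
One possible shape for the bulk replacement, as a minimal sketch rather than a proposed implementation: it assumes the database is a SQLite file reachable through Python's `sqlite3` module, which this issue does not state. The function name `empty_strings_to_null`, the helper `quote_ident`, and the toy `edges` table in the demo are all hypothetical. The idea is to list every user table, list its columns, and run one `UPDATE ... SET col = NULL WHERE col = ''` per column.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Quote a table or column name for safe use in generated SQL."""
    return '"' + name.replace('"', '""') + '"'


def empty_strings_to_null(conn: sqlite3.Connection) -> dict:
    """Replace '' with NULL in every column of every user table.

    Hypothetical helper, assuming a SQLite database. Returns the number
    of updated cells per table.
    """
    changed = {}
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = [
            row[1]
            for row in conn.execute(f"PRAGMA table_info({quote_ident(table)})")
        ]
        total = 0
        for column in columns:
            cur = conn.execute(
                f"UPDATE {quote_ident(table)} "
                f"SET {quote_ident(column)} = NULL "
                f"WHERE {quote_ident(column)} = ''"
            )
            total += cur.rowcount
        changed[table] = total
    conn.commit()
    return changed


def demo():
    """Run the helper against a toy in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edges "
        "(id TEXT, subject TEXT, knowledge_source TEXT, qualifiers TEXT)"
    )
    conn.execute("INSERT INTO edges VALUES ('uuid:1', 'MONDO:0012933', '', '')")
    conn.execute(
        "INSERT INTO edges VALUES ('uuid:2', 'MONDO:0000001', 'infores:hpoa', '')"
    )
    counts = empty_strings_to_null(conn)
    rows = conn.execute(
        "SELECT id, knowledge_source, qualifiers FROM edges ORDER BY id"
    ).fetchall()
    return counts, rows
```

Two caveats on this sketch: a column declared `NOT NULL` would make the `UPDATE` fail with an integrity error, and a different database engine would need a different way to enumerate tables and columns (for example a query against `information_schema.columns`, where the engine provides it).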