Open kevinschaper opened 1 year ago
I want to add a note that I tried this out and found a lot of false positives where linkml-validate complained about types: nodes whose name is a number failed for not being a string, and single values in multivalued fields were flagged for not being lists. We probably want to run the validator as a module rather than from the CLI, so that we can swallow some categories of errors, or else validate against a file with more explicit type information.
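As a rough sketch of what "swallowing some categories of errors" could look like once the validation messages are available in Python: the snippet below only partitions a list of message strings and does not call linkml itself. The message wording and the `IGNORED_PATTERNS` list are assumptions for illustration, not the actual output of linkml-validate.

```python
import re

# Hypothetical patterns for the two noisy categories described above.
# The real wording of linkml-validate messages may differ.
IGNORED_PATTERNS = [
    re.compile(r"is not of type 'string'"),  # numeric names
    re.compile(r"is not of type 'array'"),   # single values in multivalued fields
]


def split_errors(messages, ignored=IGNORED_PATTERNS):
    """Partition messages into (kept, swallowed) by matching ignore patterns."""
    kept, swallowed = [], []
    for msg in messages:
        if any(pattern.search(msg) for pattern in ignored):
            swallowed.append(msg)
        else:
            kept.append(msg)
    return kept, swallowed


# Made-up example messages, for illustration only.
messages = [
    "1234 is not of type 'string' in /name",
    "'HP:0000118' is not of type 'array' in /has_phenotype",
    "'id' is a required property",
]
kept, swallowed = split_errors(messages)
```

Keeping the swallowed messages in a separate list, rather than dropping them, would make it easy to log how many errors each category is hiding.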
The `iri` column is coming in from kg-phenio, through monarch-ingest. It's not yet defined in the schema, so Solr represents it as a multivalued column, which isn't what we want.

For the moment, #474 is going out of its way to trim the `iri` field out of Solr documents to avoid problems when creating pydantic instances, and this issue is so that we don't lose track of that hack.

On the monarch-ingest / linkml-solr side, we probably want to avoid passing extra fields from the tsv file to Solr. It would have probably been better to get an index-time error.
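To make the two options concrete, here is a minimal sketch, not the code from #474: one helper silently trims fields that are not in the schema before a model is built, and the other fails loudly, which is closer to the index-time error suggested above. `ALLOWED_FIELDS`, the helper names, and the sample document are all hypothetical.

```python
# Hypothetical set of fields defined in the schema.
ALLOWED_FIELDS = {"id", "name", "category"}


def trim_unknown_fields(doc, allowed_fields=ALLOWED_FIELDS):
    """Return a copy of a Solr document with undefined fields dropped."""
    return {key: value for key, value in doc.items() if key in allowed_fields}


def check_row_fields(row, allowed_fields=ALLOWED_FIELDS):
    """Raise instead of silently passing extra fields through to the index."""
    extra = sorted(set(row) - set(allowed_fields))
    if extra:
        raise ValueError(f"fields not defined in schema: {extra}")
    return row


# Sample document where Solr has guessed that iri is multivalued.
doc = {
    "id": "HP:0000118",
    "name": "Phenotypic abnormality",
    "iri": ["http://purl.obolibrary.org/obo/HP_0000118"],
}
trimmed = trim_unknown_fields(doc)

try:
    check_row_fields(doc)
    strict_error = None
except ValueError as err:
    strict_error = str(err)
```

The trimming version hides the mismatch from downstream code, while the strict version surfaces it at load time, so the schema gap gets noticed rather than worked around.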
As for `iri` itself, right now we handle that expansion via CURIEs in the app, so if we include it, it would only be for phenio. We could also choose to populate it for other entities, or we could leave it out of our kg-phenio ingest and stick with handling CURIE expansion only at the code level.
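For reference, code-level CURIE expansion can be as small as a prefix lookup. This is an illustrative sketch rather than the app's actual implementation; `PREFIX_MAP` and `expand_curie` are hypothetical names, and the map holds only two OBO prefixes as examples.

```python
# Hypothetical prefix map; a real one would be loaded from a shared source.
PREFIX_MAP = {
    "HP": "http://purl.obolibrary.org/obo/HP_",
    "MONDO": "http://purl.obolibrary.org/obo/MONDO_",
}


def expand_curie(curie, prefix_map=PREFIX_MAP):
    """Expand a CURIE such as 'HP:0000118' to an IRI, or return None if unknown."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        return None
    return prefix_map[prefix] + local_id


iri = expand_curie("HP:0000118")
```

With a lookup like this, an IRI can be derived for any entity whose prefix is known, which is one argument for not storing `iri` in the index at all.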