gbv / jskos-server

Web service to access JSKOS data
https://coli-conc.gbv.de/api/
MIT License
6 stars 4 forks source link

Validate mappings during import #3

Closed nichtich closed 5 years ago

nichtich commented 6 years ago

Mappings should be validated with a JSON Schema on import. MongoDB supports schema validation for this purpose. A more complete JSON Scheme for JSKOS is required anyway (https://github.com/gbv/jskos/issues/15) anyway but a subset for mapping data should be enough.

nichtich commented 6 years ago

An additional constraint is to have only one single concept in the from member set. memberList should neither be supported.

stefandesu commented 6 years ago

For this, we should use jskos-validation (or rather a future jskos-tools package that integrates this). MongoDB's schema validation is different from JSON Schema and there's no easy way to convert between the two, so I'd suggest to only use JSON Schema (via jskos-validation).

nichtich commented 6 years ago

jskos-tools is at https://github.com/gbv/wikidata-jskos/tree/master/lib/jskos and at https://github.com/gbv/cocoda/tree/dev/src/util, so this should be published as independent npm module first. jskos-validation would depend on jskos-tools so applications can chose to include validation (large dependencies) or not.

stefandesu commented 6 years ago

I would suggest not only to publish jskos-tools as an npm module, but also to move those tools to a separate repository (maybe that was implied by your comment). jskos-validation has really small dependencies actually, and it works separately from jskos-tools, that's why my original thought was to include it in jskos-tools (so that clients only have to include jskos-tools and not both packages). But I'd be fine with them being separate packages as well.

nichtich commented 6 years ago

Now we have https://github.com/gbv/jskos-tools, can this be closed?

stefandesu commented 6 years ago

Now we have https://github.com/gbv/jskos-tools, can this be closed?

Mappings are not actually validated during import yet. We also haven't decided what should happen when a mapping (or concept, ...) fails validation during import. For example, a lot of the mappings still use a string instead of a set for type and will therefore fail validation. Right now, validation only tells us that something is wrong with the mapping, but not where exactly the error is.

We could deal with the most common errors (like the type thing) before validation and then discard all mappings that fail validation (and log the error of course).

nichtich commented 6 years ago

It should be enough i import emits error messages with input line number of imported NDJSON file so we can inspect the failing records by hand.

stefandesu commented 5 years ago

This is still not implemented. Should we require this for version 1.0.0, or could we move this to a later point in time?

nichtich commented 5 years ago

This does not need to be implemented in the next release -- Jakob Voß via Android

stefandesu commented 5 years ago

jskos-validate is already used for import validation (for all types, not only mappings): https://github.com/gbv/jskos-server/blob/eb49e44056d58d0ba129b32f6392d1dd94d834c8/bin/import.js#L310-L314

Validation errors are logged to console as warnings, the item is skipped, and the import will continue.

I must have forgotten about this, I guess I added it when rewriting the import script.