projectEndings / staticSearch

A codebase to support a pure JSON search engine requiring no backend for any XHTML5 document collection
https://endings.uvic.ca/staticSearch/docs/index.html
Mozilla Public License 2.0

Should we split out the diagnostics and the report? #212

Open joeytakeda opened 2 years ago

joeytakeda commented 2 years ago

Since we are hoping to get a more robust validation system in place for documents and the configuration file per #206, I think we should consider having better diagnostics for the document collection, run before tokenizing etc. as part of document validation. While Schematron will catch most issues, we will need diagnostics to catch more complex relationships between the config and the documents. Per @martindholmes:

For instance, if I create a search-in filter with an XPath that doesn't actually match anything in the document collection (as I already have, by accident, once), it would be good to know about it. Ditto if I decide to exclude a meta tag that doesn't actually exist in any of the documents. The report will catch some stuff like this, but you have to run the full build to get the complete report, so it might be better to do it in the preliminary step.

So I think it's worth keeping these two things distinct: diagnostics that confirm your document collection and your configuration are working together as you'd expect, and a report that tells you things you can learn only after running staticSearch (i.e., the concordance, etc.).
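
To make the idea concrete, here is a rough sketch of the kind of preliminary check I mean, written in Python purely for illustration (it is not how staticSearch itself is built). It assumes the config can be reduced to a list of meta tag names; the function `unmatched_meta_names` and the sample document are hypothetical and do not reflect the actual config format. It flags any configured name that never occurs in the collection, without running a build.

```python
import xml.etree.ElementTree as ET

# Namespace map for XHTML5 documents serialized as XML.
XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}


def unmatched_meta_names(configured_names, documents):
    """Return the configured meta names that occur in none of the documents.

    configured_names: names pulled from the config (hypothetical shape).
    documents: iterable of XHTML source strings.
    """
    seen = set()
    for source in documents:
        root = ET.fromstring(source)
        # Collect the name of every <meta> element that has a name attribute.
        for meta in root.iterfind(".//h:meta[@name]", XHTML_NS):
            seen.add(meta.get("name"))
    # Preserve config order so warnings line up with the config file.
    return [name for name in configured_names if name not in seen]


# A tiny stand-in for a document collection.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Sample</title>
    <meta name="Genre" content="Poem"/>
  </head>
  <body><p>Text</p></body>
</html>"""

print(unmatched_meta_names(["Genre", "Publisher"], [DOC]))  # ['Publisher']
```

A real diagnostic would need to evaluate arbitrary XPaths from the config rather than just meta names, but the shape is the same: cheap lookups over the collection that warn before tokenizing starts.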