Since we are hoping to get a more robust validation system in place for documents and the configuration file per #206, I think we should consider adding better diagnostics for the document collection, run before tokenizing etc. as part of document validation. While Schematron will catch most issues, we will need diagnostics to catch more complex relationships between the config and the documents. Per @martindholmes:
For instance, if I create a search-in filter with an XPath that doesn't actually match
anything in the document collection (as I already have, by accident,
once), it would be good to know about it. Ditto if I decide to exclude a
meta tag that doesn't actually exist in any of the documents. The report
will catch some stuff like this, but you have to run the full build to
get the complete report, so it might be better to do it in the
preliminary step.
So I think it's worth keeping these two things distinct: the diagnostics should confirm that your document collection and your configuration are working together as you'd expect, and the report should tell you things that you can learn only after running staticSearch (the concordance, etc.).
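To make the idea concrete, here is a rough sketch of the two checks Martin describes: flagging a filter XPath that matches nothing in the collection, and flagging an excluded meta tag that no document contains. This is only an illustration in Python, not a proposal for how staticSearch itself should implement it. The function names `unmatched_xpaths` and `missing_meta_names` and the sample documents are hypothetical, and the config values are passed in as plain lists rather than read from a real config file. Note also that Python's `xml.etree.ElementTree` supports only a limited subset of XPath, so a real check would need a full XPath engine.

```python
import xml.etree.ElementTree as ET

# Prefix mapping for XHTML documents (the "h" prefix is our own choice).
XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}


def unmatched_xpaths(documents, xpaths):
    """Return the XPaths that match no element in any document."""
    roots = [ET.fromstring(doc) for doc in documents]
    return [
        xp for xp in xpaths
        if not any(root.findall(xp, XHTML_NS) for root in roots)
    ]


def missing_meta_names(documents, excluded_names):
    """Return excluded meta names that occur in none of the documents."""
    roots = [ET.fromstring(doc) for doc in documents]
    present = {
        meta.get("name")
        for root in roots
        for meta in root.findall(".//h:meta", XHTML_NS)
    }
    return [name for name in excluded_names if name not in present]


# Hypothetical two-document collection, inlined as strings for the sketch.
docs = [
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>A</title>'
    '<meta name="Genre" content="Poem"/></head>'
    '<body><div class="poem">text</div></body></html>',
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>B</title>'
    '<meta name="Genre" content="Letter"/></head>'
    '<body><div class="letter">text</div></body></html>',
]

# Hypothetical config values: one filter XPath is fine, one matches nothing.
filters = [".//h:div[@class='poem']", ".//h:div[@class='prose']"]
excludes = ["Genre", "Date"]

for xp in unmatched_xpaths(docs, filters):
    print(f"WARNING: XPath matches nothing in the collection: {xp}")
for name in missing_meta_names(docs, excludes):
    print(f"WARNING: excluded meta tag not found in any document: {name}")
```

Both checks only need the parsed documents and the config, so they could run in the preliminary step without waiting for tokenizing or the full build.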