keeps / dbptk-ui

DBPTK base UI for both Desktop and Enterprise
https://database-preservation.com
GNU Lesser General Public License v3.0

solr crashing because IllegalArgumentException is not handled #309

Closed Laurira closed 2 years ago

Laurira commented 2 years ago

When a user needs to index a large SIARD file, it is a serious problem if Solr crashes after 20 hours of indexing because a single date is not in the expected format:

RESTException: Remote exeption cause by GenericException: Could not convert the database. caused by IllegalArgumentException: Invalid format: "2019%02-15T07:27:21.858000Z" is malformed at "%02-15T07:27:21.858000Z".

2022-02-15 21:44:18.970 ERROR 1 --- [nio-8080-exec-4] c.d.m.siard.in.content.SAXErrorHandler : line: 2; column: 1348776609; cvc-pattern-valid: Value '2019%02-15T07:27:21.858000Z' is not facet-valid with respect to pattern '\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(.\d*)Z?' for type 'dateTimeType'.

So I think it would be reasonable to report such cases but handle the exception, so that the whole indexing process does not crash.

For example, in such a case the date could be replaced with a dummy date such as 1900-01-01T00:00:00.000000Z, or some other value that is valid and suitable for Solr.
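To illustrate the idea, here is a minimal sketch, not DBPTK's actual code. It assumes the date is parsed with `java.time.Instant` (the real converter may use a different library), and the class name `LenientDateParser`, the method `parseOrFallback`, and the `warnings` list are hypothetical. A malformed value is recorded as a warning and replaced with the dummy date instead of aborting the run.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses a timestamp and falls back to a dummy date
// instead of letting the exception end the whole indexing run.
class LenientDateParser {
    // Dummy date suggested in this issue (an assumption, not a DBPTK constant).
    public static final Instant FALLBACK = Instant.parse("1900-01-01T00:00:00Z");

    // Problems collected during the run so they can be reported afterwards.
    public final List<String> warnings = new ArrayList<>();

    public Instant parseOrFallback(String raw) {
        try {
            return Instant.parse(raw);
        } catch (DateTimeParseException e) {
            warnings.add("Invalid date '" + raw + "' replaced with " + FALLBACK);
            return FALLBACK;
        }
    }

    public static void main(String[] args) {
        LenientDateParser parser = new LenientDateParser();
        System.out.println(parser.parseOrFallback("2019-02-15T07:27:21.858000Z"));
        System.out.println(parser.parseOrFallback("2019%02-15T07:27:21.858000Z"));
        System.out.println(parser.warnings);
    }
}
```

The trade-off is that the index then contains a placeholder rather than the real value, so the collected warnings would need to be shown to the user at the end of the run.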

Alternatively, this should be a separate validator requirement. At the moment the validator does not report that some of the dates are incorrect.
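A validator check along these lines could look roughly like the sketch below. It is only an illustration: the class `DateTimeValidator` and the method `findInvalid` are hypothetical names, and the pattern is copied as it appears in the log message above, which may not be exactly what the SIARD schema defines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical pre-check: collects every value that does not match the
// dateTimeType pattern, so all problems are reported before indexing starts.
class DateTimeValidator {
    // Pattern as printed in the SAXErrorHandler log line (an assumption).
    public static final Pattern DATE_TIME =
        Pattern.compile("\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}(.\\d*)Z?");

    public static List<String> findInvalid(List<String> values) {
        List<String> invalid = new ArrayList<>();
        for (String value : values) {
            // matches() requires the whole string to fit the pattern.
            if (!DATE_TIME.matcher(value).matches()) {
                invalid.add(value);
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
            "2019-02-15T07:27:21.858000Z",
            "2019%02-15T07:27:21.858000Z");
        System.out.println(findInvalid(values));
    }
}
```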

Laurira commented 2 years ago

It turned out that the input was correct and there were no illegal arguments; see #310.