Closed by mjordan 7 years ago
I tracked the problem down (an unnecessary check of preg_match()'s $matches inside a loop) and fixed it.
In the process of debugging this, I discovered another, unrelated problem: some metadata manipulators throw an exception when the XML snippet they are processing is empty, which writes the exception message "DOMDocument::loadXML(): Empty string supplied as input" to mik.log. By checking the length of their input and returning the input unchanged if the length is 0, we avoid those mik.log entries. With this check in place, the output MODS XML is complete and validates, so I would consider those lines more annoyances than anything else.
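For anyone reading along, here is a minimal sketch of the kind of length check described above. It is not the code from the PR: the function name manipulateSnippet() is hypothetical, and the body only parses and re-serializes the snippet where a real manipulator would do its work.

```php
<?php
// Hypothetical stand-in for a metadata manipulator's entry point;
// the name and the round-trip body are illustrative only.
function manipulateSnippet(string $input): string
{
    // Guard: an empty snippet cannot be parsed, so return it unchanged
    // instead of handing it to DOMDocument::loadXML().
    if (strlen($input) === 0) {
        return $input;
    }

    $dom = new DOMDocument();
    $dom->loadXML($input);

    // A real manipulator would modify nodes here; this sketch just
    // serializes the root element back to a string.
    return $dom->saveXML($dom->documentElement);
}
```

With the guard in place, an empty string passes straight through and loadXML() is never called with empty input, so the "Empty string supplied as input" message should not be generated for that case.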
@MarcusBarnes OK to add the fix for the second problem to the same PR that fixes this first bug?
@mjordan Please add the two fixes together. Thank you in advance for this work.
Great, just running another job on the 11k objects. So nice to have a clean mik.log and valid MODS for each one! If my QA on the MODS finds no issues I'll open a PR.
Closed.
Addressed in pull request https://github.com/MarcusBarnes/mik/pull/328 (committed with https://github.com/MarcusBarnes/mik/commit/7ce0163c80d2d5b7f9143a5cba7ddb2d97254051).
Doing some pre-ingest tests with a large collection (11k CSV objects) and I'm finding that the SplitRepeatedValues metadata manipulator is reporting a failed regex about 50% of the time. Eyeballing the failed values doesn't reveal any obvious errors to me. Will investigate.