crism closed this 9 years ago
Got it. Let me have a look at your rewrite a little later on and I'll see if anything seems obviously wrong to me ... glad we're so close, though!
Based on a small sample so far, documents that go through the reference extraction pathway hang at merge. Documents that fail reference extraction do not.
Huh! OK, wild guess: when is ParsCit firing? Immediately after meTypeset? Do we want to try punting that to after merge?
If we needed to, could we force a fail on the reference extraction to get all the documents through and see the front matter parse results? Not saying we should do that yet; obviously this warrants some exploration of what might turn out to be an easy fix.
The queues are now completely linear. ParsCit looks at the NLM XML, and makes its own output. Its success or failure ($job->referenceParsingSuccess) is used as a flag in the queue manager to make path decisions later on. I am pretty sure the problem comes when MergeXMLOutputs tries to find the appropriate NLM XML output, as modified by BibtexreferencesConversion or not, but I would have expected an Exception and failure, if I’d gotten that wrong. I’ll look at this more after dinner.
The results, @jalperin, are fine when it succeeds; combining two XML documents in this way is extremely straightforward.
Oh. If it’d been a snake, it’d bit me.
$meTypesetDocument = $job->getStageDocument(JOB_CONVERSION_STAGE_NLMXML);
That should have a conditional for the reference extraction success; that document’s stage is changed when the references are updated, and is no longer accessible by that handle.
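For anyone following along, the conditional described above might look roughly like this. It is only a sketch, not the actual fix: the Job class here is a stand-in stub, JOB_CONVERSION_STAGE_REFERENCES_UPDATED and findNlmXmlForMerge() are hypothetical names made up for illustration, and the real getStageDocument() presumably returns a document object rather than a path string. Only JOB_CONVERSION_STAGE_NLMXML, getStageDocument() and referenceParsingSuccess come from the thread.

```php
<?php
// Stage handles. Only JOB_CONVERSION_STAGE_NLMXML is named in the thread;
// the second constant is a hypothetical name for the stage the document
// moves to once its references have been updated.
const JOB_CONVERSION_STAGE_NLMXML = 1;
const JOB_CONVERSION_STAGE_REFERENCES_UPDATED = 2; // hypothetical

// Minimal stand-in for the real job entity (assumption, for illustration).
class Job
{
    public $referenceParsingSuccess = false;
    private $documents = [];

    public function setStageDocument(int $stage, string $path): void
    {
        $this->documents[$stage] = $path;
    }

    public function getStageDocument(int $stage): ?string
    {
        return $this->documents[$stage] ?? null;
    }
}

// Pick the NLM XML to merge: the reference-updated document when reference
// extraction succeeded, otherwise the original meTypeset output.
function findNlmXmlForMerge(Job $job): string
{
    $stage = $job->referenceParsingSuccess
        ? JOB_CONVERSION_STAGE_REFERENCES_UPDATED
        : JOB_CONVERSION_STAGE_NLMXML;

    $document = $job->getStageDocument($stage);
    if ($document === null) {
        // Fail loudly instead of letting the merge step hang on a missing input.
        throw new RuntimeException("No NLM XML document found for stage $stage");
    }

    return $document;
}
```

Throwing when the lookup comes back empty is a design choice in this sketch: it turns the silent hang into the Exception and failure expected earlier in the thread.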
This seems to be fixed.
Fantastic. I've been having pretty bad insomnia this week (like right now) but very much looking forward to testing in the morning.
Integrate https://github.com/CeON/CERMINE as a module.