Closed by dannylamb 3 years ago
Here's the blueprint XML that's failing:
<?xml version="1.0" encoding="UTF-8"?>
<!-- managed by ansible -->
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:cm="http://aries.apache.org/blueprint/xmlns/blueprint-cm/v1.1.0"
xsi:schemaLocation="
http://aries.apache.org/blueprint/xmlns/blueprint-cm/v1.1.0 http://aries.apache.org/schemas/blueprint-cm/blueprint-cm-1.1.0.xsd
http://www.osgi.org/xmlns/blueprint/v1.0.0 http://www.osgi.org/xmlns/blueprint/v1.0.0/blueprint.xsd
http://camel.apache.org/schema/blueprint http://camel.apache.org/schema/blueprint/camel-blueprint.xsd">
<cm:property-placeholder id="properties" persistent-id="ca.islandora.alpaca.connector.ocr" update-strategy="reload" >
<cm:default-properties>
<cm:property name="error.maxRedeliveries" value="http://hypercube:8000"/>
<cm:property name="in.stream" value="broker:queue:islandora-connector-ocr"/>
<cm:property name="derivative.service.url" value="http://hypercube:8000"/>
</cm:default-properties>
</cm:property-placeholder>
<reference id="broker" interface="org.apache.camel.Component" filter="(osgi.jndi.service.name=fcrepo/Broker)"/>
<bean id="http" class="org.apache.camel.component.http4.HttpComponent"/>
<bean id="https" class="org.apache.camel.component.http4.HttpComponent"/>
<camelContext id="IslandoraConnectorOCR" xmlns="http://camel.apache.org/schema/blueprint">
<package>ca.islandora.alpaca.connector.derivative</package>
</camelContext>
</blueprint>
Key part is this: <cm:property name="error.maxRedeliveries" value="http://hypercube:8000"/>.
Looks like maxRedeliveries is being set to the microservice URL! :man_facepalming: Well at least that one's easy to fix.
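For anyone patching the deployed file by hand, a corrected default-properties block might look like the sketch below. The value 10 is an assumption for illustration (any integer retry count should parse); the other two properties are copied unchanged from the blueprint above.

```xml
<cm:default-properties>
  <!-- Assumption: 10 is an illustrative retry count, not a confirmed default.
       The point is that this value must be an integer, not the service URL. -->
  <cm:property name="error.maxRedeliveries" value="10"/>
  <cm:property name="in.stream" value="broker:queue:islandora-connector-ocr"/>
  <!-- The microservice URL belongs only on this property. -->
  <cm:property name="derivative.service.url" value="http://hypercube:8000"/>
</cm:default-properties>
```

If the file is managed by ansible (as the comment at the top of the blueprint says), a manual edit will likely be overwritten on the next run, so the template that generates it would need the same change.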
Now I'm getting 403s because the nginx user can't read syn-settings.xml. Spiraling out a bit, but this still feels manageable.
Not sure if this is the same problem, but I can't get the OCR to work with 1.0.0-alpha-10:
alpaca_1 | 2021-11-10 15:41:08,521 | ERROR | a-connector-ocr] | DefaultErrorHandler | 56 - org.apache.camel.camel-core - 2.20.4 | Failed delivery for (MessageId: ID-6b10eef716fa-1636558559020-2-22 on ExchangeId: ID-6b10eef716fa-1636558559020-2-12). Exhausted after delivery attempt: 11 caught: org.apache.camel.http.common.HttpOperationFailedException: HTTP operation failed invoking https://dc.library.txstate.edu/node/4257/media/extracted_text/1 with statusCode: 500. Processed by failure processor: FatalFallbackErrorHandler[Channel[Log(ca.islandora.alpaca.connector.derivative.DerivativeConnector)[Error connecting generating derivative with http://hypercube:8000: ${exception.message}
Got reports twice today (!) from two different organizations that Hypercube was not extracting text from images or PDFs. Did some digging and found this in the Alpaca logs:
Not sure why it's expecting an integer instead of a string for the microservice URL, but it is. It looks like an issue in the blueprint XML that gets deployed, but that's just speculation on my part. This warrants further investigation.
Would be good to get confirmation from someone running the playbook. If this is an ISLE-only problem, that'd be a relief to know.