wwelling / fcrepo-camel-toolbox

A collection of ready-to-use messaging applications with fcrepo-camel
Apache License 2.0

Solr indexing default transform not working for ldp:contains #2

Open wwelling opened 2 months ago

wwelling commented 2 months ago

What - description of what you want me to do

Add a custom Camel route processor that replaces all instances of the string literal ldp:contains with ldp:ccontains. Update the org.fcrepo.camel.indexing.solr.SolrRouter "direct:update.solr" Camel route to run the new custom processor before sending to the direct:send.to.solr route.

Why - explain why this is important

This is needed because the XSLT applied to the RDF+XML is not able to select rdf:RDF/rdf:Description/ldp:contains/@rdf:resource.

codeautopilot[bot] commented 2 months ago

Potential solution

The task involves creating a custom processor to modify the message body within a Camel route. The processor will search for instances of the string ldp:contains and replace them with ldp:ccontains. This is necessary because the XSLT transformation is unable to select the desired elements due to the presence of ldp:contains. By replacing it with ldp:ccontains, the XSLT should be able to perform the selection as intended.
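The replacement described above can be tried in isolation before wiring it into a Camel route. The following is a minimal sketch using only the JDK; the class name `LdpContainsRewriter` and the sample RDF/XML are hypothetical and are not part of fcrepo-camel-toolbox. It uses `String.replace`, which matches literally, and returns a null body unchanged; both choices are assumptions about what is wanted here.

```java
/**
 * Hypothetical helper (not part of fcrepo-camel-toolbox) that illustrates
 * the proposed rewrite outside of Camel.
 */
final class LdpContainsRewriter {

    private LdpContainsRewriter() {
    }

    /**
     * Replaces every literal "ldp:contains" with "ldp:ccontains".
     * A null body is returned unchanged.
     */
    static String rewrite(final String body) {
        if (body == null) {
            return null;
        }
        // String.replace treats both arguments as plain text, not as a regex
        return body.replace("ldp:contains", "ldp:ccontains");
    }

    public static void main(final String[] args) {
        // Made-up sample document, used only for illustration
        final String rdf =
            "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
            + " xmlns:ldp=\"http://www.w3.org/ns/ldp#\">"
            + "<rdf:Description rdf:about=\"http://example.org/rest/parent\">"
            + "<ldp:contains rdf:resource=\"http://example.org/rest/parent/child\"/>"
            + "</rdf:Description></rdf:RDF>";
        System.out.println(rewrite(rdf));
    }
}
```

Because `ldp:ccontains` does not itself contain the text `ldp:contains`, applying the rewrite twice gives the same result as applying it once. The namespace declaration `xmlns:ldp` is not affected.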

How to implement

  1. Create the Custom Processor:

    • Create a new Java class named CustomProcessor in the package org.fcrepo.camel.indexing.solr.
    • Implement the org.apache.camel.Processor interface in this class.
    • Write the process method to replace ldp:contains with ldp:ccontains in the message body.
  2. Modify the SolrRouter:

    • In the SolrRouter.java file, locate the route that starts with from("direct:update.solr").
    • Insert the .process(new CustomProcessor()) step before the .to("direct:send.to.solr") step.

Here's the code for the CustomProcessor.java:

package org.fcrepo.camel.indexing.solr;

import org.apache.camel.Exchange;
import org.apache.camel.Message;
import org.apache.camel.Processor;

public class CustomProcessor implements Processor {

    @Override
    public void process(final Exchange exchange) throws Exception {
        // Retrieve the body of the message as a String
        final Message in = exchange.getIn();
        final String body = in.getBody(String.class);

        // Replace all instances of 'ldp:contains' with 'ldp:ccontains'
        final String modifiedBody = body.replaceAll("ldp:contains", "ldp:ccontains");

        // Set the modified body back into the message
        in.setBody(modifiedBody);
    }
}

And the modified section of the SolrRouter.java file:

...
from("direct:update.solr").routeId("FcrepoSolrUpdater")
    .log(LoggingLevel.INFO, logger, "Indexing Solr Object ${header.CamelFcrepoUri}")
    .setHeader(INDEXING_URI).simple("${header.CamelFcrepoUri}")
    .filter().simple("${header.CamelIndexingTransformation} != ${header.CamelIndexingUri}")
    .choice()
        .when(header(INDEXING_TRANSFORMATION).isNotNull())
            .log(LoggingLevel.INFO, logger,
                "Sending RDF for Transform with XSLT from ${header.CamelIndexingTransformation}")
            .toD("xslt:${header.CamelIndexingTransformation}")
            .process(new CustomProcessor()) // Custom processor added here
            .to("direct:send.to.solr")
        .when(or(header(INDEXING_TRANSFORMATION).isNull(), header(INDEXING_TRANSFORMATION).isEqualTo("")))
            .log(LoggingLevel.INFO, logger,"No Transform supplied")
            .process(new CustomProcessor()) // Custom processor added here as well
            .to("direct:send.to.solr")
        .otherwise()
            .log(LoggingLevel.INFO, logger, "Skipping ${header.CamelFcrepoUri}");
...

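CustomProcessor is declared in the same package as SolrRouter (org.fcrepo.camel.indexing.solr), so SolrRouter.java can reference it without an import. The explicit import below is redundant but harmless: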

import org.fcrepo.camel.indexing.solr.CustomProcessor;

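After implementing these changes, the Camel route will replace ldp:contains with ldp:ccontains before sending the message to Solr. Note that in the route shown above the processor runs after the .toD("xslt:...") step in the transform branch, so the replacement is applied to the XSLT output. For the replacement to change what the XSLT can select, the processor would have to run before that step.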
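To check what a selection against the renamed element would return, the path from the issue can be evaluated with the JDK's XPath API. This is only a sketch under stated assumptions: the class `ContainedResourceSelector` and the sample document are hypothetical, the standard RDF and LDP namespace URIs are assumed, and the JDK XPath engine is used in place of the XSLT processor that the Camel route actually invokes. It does not reproduce the failure reported in the issue.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper for illustration; not part of fcrepo-camel-toolbox. */
final class ContainedResourceSelector {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String LDP_NS = "http://www.w3.org/ns/ldp#";

    private ContainedResourceSelector() {
    }

    /** Returns the rdf:resource values of rdf:RDF/rdf:Description/ldp:{localName}. */
    static List<String> select(final String rdfXml, final String localName) {
        try {
            final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for prefixed XPath steps to match
            final Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(rdfXml)));

            final XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(final String prefix) {
                    if ("rdf".equals(prefix)) {
                        return RDF_NS;
                    }
                    if ("ldp".equals(prefix)) {
                        return LDP_NS;
                    }
                    return XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(final String uri) {
                    return null; // not needed for evaluation
                }

                @Override
                public Iterator<String> getPrefixes(final String uri) {
                    return null; // not needed for evaluation
                }
            });

            final String path = "/rdf:RDF/rdf:Description/ldp:" + localName + "/@rdf:resource";
            final NodeList nodes = (NodeList) xpath.evaluate(path, doc, XPathConstants.NODESET);
            final List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (final Exception e) {
            throw new IllegalStateException("Could not evaluate selection", e);
        }
    }

    public static void main(final String[] args) {
        // Made-up sample document, rewritten the same way the processor would
        final String rdf =
            "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns:ldp=\"" + LDP_NS + "\">"
            + "<rdf:Description rdf:about=\"http://example.org/rest/parent\">"
            + "<ldp:contains rdf:resource=\"http://example.org/rest/parent/child\"/>"
            + "</rdf:Description></rdf:RDF>";
        final String rewritten = rdf.replace("ldp:contains", "ldp:ccontains");
        System.out.println(select(rewritten, "ccontains"));
    }
}
```

After the rewrite, a stylesheet would need to select `ldp:ccontains` rather than `ldp:contains`; in this sketch the original element name no longer matches anything in the rewritten document.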


Files used for this task:

Changes on fcrepo-indexing-solr/src/main/java/org/fcrepo/camel/indexing/solr/SolrRouter.java

To address the issue described in the ticket, we need to modify the `SolrRouter.java` file to include a custom processor that replaces all instances of the string literal `ldp:contains` with `ldp:ccontains`. This processor should be added to the route that handles updates to the Solr index, specifically before the `direct:send.to.solr` endpoint.

Here's a practical proposal for how to solve this task:

1. **Create the Custom Processor:**
   - Ensure that the `CustomProcessor.java` file has been created and that it implements the `org.apache.camel.Processor` interface.
   - The `process` method within `CustomProcessor` should perform the string replacement as required.
2. **Modify the SolrRouter:**
   - In the `SolrRouter.java` file, locate the route that starts with `from("direct:update.solr").routeId("FcrepoSolrUpdater")`.
   - Before the `.to("direct:send.to.solr")` step, add a line to process the exchange with the custom processor. This will look something like `.process(new CustomProcessor())`.

Here's the modified section of the `SolrRouter.java` file with the custom processor added:

```java
...
/*
 * Handle update operations
 */
from("direct:update.solr").routeId("FcrepoSolrUpdater")
    .log(LoggingLevel.INFO, logger, "Indexing Solr Object ${header.CamelFcrepoUri}")
    .setHeader(INDEXING_URI).simple("${header.CamelFcrepoUri}")
    // Don't index the transformation itself
    .filter().simple("${header.CamelIndexingTransformation} != ${header.CamelIndexingUri}")
    .choice()
        .when(header(INDEXING_TRANSFORMATION).isNotNull())
            .log(LoggingLevel.INFO, logger,
                "Sending RDF for Transform with XSLT from ${header.CamelIndexingTransformation}")
            .toD("xslt:${header.CamelIndexingTransformation}")
            // Add the custom processor here to replace ldp:contains with ldp:ccontains
            .process(new CustomProcessor())
            .to("direct:send.to.solr")
        .when(or(header(INDEXING_TRANSFORMATION).isNull(), header(INDEXING_TRANSFORMATION).isEqualTo("")))
            .log(LoggingLevel.INFO, logger,"No Transform supplied")
            // Add the custom processor here as well to ensure all paths are covered
            .process(new CustomProcessor())
            .to("direct:send.to.solr")
        .otherwise()
            .log(LoggingLevel.INFO, logger, "Skipping ${header.CamelFcrepoUri}");
...
```

`CustomProcessor` is declared in the same package as `SolrRouter` (`org.fcrepo.camel.indexing.solr`), so no import is required. The explicit import below is redundant but harmless:

```java
import org.fcrepo.camel.indexing.solr.CustomProcessor;
```

After making these changes, the SolrRouter will use the custom processor to replace `ldp:contains` with `ldp:ccontains` before sending the data to Solr. This should resolve the issue described in the ticket.

Changes on fcrepo-indexing-solr/src/main/java/org/fcrepo/camel/indexing/solr/CustomProcessor.java

To solve the task for the `CustomProcessor.java` file, we need to create a new Java class that implements the `org.apache.camel.Processor` interface. This class will contain a `process` method that will be responsible for replacing all instances of the string literal `ldp:contains` with `ldp:ccontains`.

Here's a practical proposal for the implementation:

```java
package org.fcrepo.camel.indexing.solr;

import org.apache.camel.Exchange;
import org.apache.camel.Processor;
import org.apache.camel.Message;

public class CustomProcessor implements Processor {

    @Override
    public void process(Exchange exchange) throws Exception {
        // Retrieve the body of the message as a String
        final Message in = exchange.getIn();
        final String body = in.getBody(String.class);

        // Replace all instances of 'ldp:contains' with 'ldp:ccontains'
        final String modifiedBody = body.replaceAll("ldp:contains", "ldp:ccontains");

        // Set the modified body back into the message
        in.setBody(modifiedBody);
    }
}
```

This class does the following:

1. It implements the `Processor` interface, which requires the `process` method.
2. Inside the `process` method, it retrieves the message body from the `Exchange` object.
3. It performs the string replacement using the `replaceAll` method on the body.
4. It sets the modified body back into the message.

With this class in place, the `SolrRouter` can use an instance of `CustomProcessor` to process messages before they are sent to the `direct:send.to.solr` route.