jcustenborder / kafka-connect-solr

Kafka Connect connector for writing to Solr.
Apache License 2.0

Standard Solr Mode: Failure in updating documents due to malformed update URL #51

Open the-srajan-jain opened 1 year ago

the-srajan-jain commented 1 year ago

I am facing an issue running the connector in Standard Solr mode: ingestion of documents from the topic fails because the Solr update URL that is being created is not correct.

With the connector configuration below:

{
  "name" : "httpSolrSinkConnector1",
  "config" : {
    "connector.class" : "com.github.jcustenborder.kafka.connect.solr.HttpSolrSinkConnector",
    "tasks.max" : "1",
    "topics" : "authors",
    "solr.url" : "http://indexer:8983/",
    "key.converter":"io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url":"http://schema-registry:8081",
    "value.converter":"io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url":"http://schema-registry:8081"
  }
}

I get the following error trace:

pipelinesetup-connect-1          | [2022-10-24 06:55:37,850] WARN Failed to parse error response from http://indexer:8983 due to: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format (org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient)
pipelinesetup-connect-1          | [2022-10-24 06:55:37,850] ERROR error (org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient)
pipelinesetup-connect-1          | org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://indexer:8983: Not Found
pipelinesetup-connect-1          | 
pipelinesetup-connect-1          | 
pipelinesetup-connect-1          | 
pipelinesetup-connect-1          | request: http://indexer:8983/update?wt=javabin&version=2
pipelinesetup-connect-1          |      at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:385)
pipelinesetup-connect-1          |      at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:183)
pipelinesetup-connect-1          |      at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
pipelinesetup-connect-1          |      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
pipelinesetup-connect-1          |      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
pipelinesetup-connect-1          |      at java.lang.Thread.run(Thread.java:748)

As you can see, the Solr URL generated is http://indexer:8983/update, but it should be http://indexer:8983/solr/authors/update.

The connector works if I change the solr.url in the config to

"solr.url" : "http://indexer:8983/solr/authors",

This, however, forces the user to create one connector per topic.
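To make the difference between the three URLs concrete, here is a minimal sketch. The helper `build_update_url` is hypothetical and is not the connector's or SolrJ's actual code. It only assumes that the update handler path `/update` is appended to whatever base URL is configured, which is what the request line in the trace suggests.

```python
from typing import Optional


def build_update_url(solr_url: str, core: Optional[str] = None) -> str:
    """Hypothetical helper: join a base URL, an optional core name, and /update."""
    base = solr_url.rstrip("/")
    if core is not None:
        # Assumption: the core is named after the topic ("authors").
        base = f"{base}/solr/{core}"
    return f"{base}/update"


# What the trace shows when solr.url is "http://indexer:8983/"
observed = build_update_url("http://indexer:8983/")

# What the update URL should look like for the "authors" topic
expected = build_update_url("http://indexer:8983/", core="authors")

# The workaround: put the core path into solr.url itself
workaround = build_update_url("http://indexer:8983/solr/authors")

print(observed)
print(expected)
print(workaround)
```

With the workaround the core name is fixed in solr.url, which is why one connector can then serve only one topic.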

To make further sense of the stack trace, consider the message Invalid version (expected 2, but 60) or the data in not in 'javabin' format. The README attributes this message to a version mismatch, but here it occurs because the request to the malformed URL returns the following HTML 404 page:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/update. Reason:
<pre>    Not Found</pre></p>
</body>
</html>

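Here the first character, <, has the ASCII value 60, which probably explains the "but 60" in the error: the client seems to read the first byte of the HTML page as the javabin version.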
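A small sketch of that reasoning, assuming the client treats the first byte of the response body as the javabin version and compares it with 2 (as the wording of the message suggests). The function `check_javabin_version` is a hypothetical stand-in, not SolrJ code, and `html_404` is just the beginning of the page shown above.

```python
EXPECTED_JAVABIN_VERSION = 2

# Beginning of the HTML 404 page returned for the malformed URL
html_404 = b"<html>\n<head>\n<title>Error 404 Not Found</title>\n"


def check_javabin_version(payload: bytes) -> None:
    """Hypothetical stand-in for a version check on the first response byte."""
    version = payload[0]
    if version != EXPECTED_JAVABIN_VERSION:
        raise RuntimeError(
            f"Invalid version (expected {EXPECTED_JAVABIN_VERSION}, but {version}) "
            "or the data in not in 'javabin' format"
        )


first_byte = html_404[0]  # ASCII code of "<"

try:
    check_javabin_version(html_404)
    message = ""
except RuntimeError as err:
    message = str(err)

print(first_byte)
print(message)
```

Under that assumption, any HTML error page would produce the same "but 60" message, regardless of the Solr and SolrJ versions in use.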

I am using Solr 8.2.0, and my SolrJ version is the same.

didiez commented 1 year ago

see #53