apache / solr-operator

Official Kubernetes operator for Apache Solr
https://solr.apache.org/operator
Apache License 2.0

Indexing using external Zookeeper - SolrJ #528

Closed: panther999 closed this issue 1 year ago

panther999 commented 1 year ago

Hello,

We are planning to move away from our old Solr cluster, and we found this tool. Using the Solr Operator, we were able to deploy a SolrCloud on an AWS EKS cluster with external ZooKeeper nodes. We can see the ZooKeeper state in the Solr Admin panel, and we can perform other operations as well (such as creating collections and aliases). The operator is very lightweight and fast to deploy. I really appreciate this project and the effort behind it.

However, we are facing a problem during indexing. Our indexing process is a Java batch job that uses SolrJ to connect to ZooKeeper and push documents. While trying to push documents, SolrJ throws the error below:

Request to collection [search1] failed due to (0) java.net.UnknownHostException: No such host is known (core-solrcloud-0.core-solrcloud-headless.default)

Essentially, it seems ZooKeeper is returning host names that are not valid from where the job runs, which is causing the trouble.

Can you suggest anything?
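
A `java.net.UnknownHostException` like the one above means the machine running the client could not resolve the host name it was handed. A minimal sketch for checking this from the machine that runs the batch job, using only the JDK (the class name `HostCheck` and the helper `resolves` are hypothetical names chosen for illustration):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostCheck {
    // Returns true if this machine can resolve the given host name via its own DNS/hosts setup.
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the host name from the error message; pass another name as the first argument.
        String host = args.length > 0
                ? args[0]
                : "core-solrcloud-0.core-solrcloud-headless.default";
        System.out.println(host + " resolvable from this machine: " + resolves(host));
    }
}
```

If the name resolves from a pod inside the cluster but not from the indexing host, the client is probably being given cluster-internal addresses that it cannot reach from where it runs.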

janhoy commented 1 year ago

When you are in Kubernetes, your SolrJ client won't need to talk to ZK at all, since the URL for Solr is well defined as the Service URL. So try the other constructor on CloudSolrClient, which takes a URL, and pass the stable service URL there (assuming your SolrJ client is also inside your Kubernetes cluster).

panther999 commented 1 year ago

I just tried it. Instead of the ZooKeeper address I passed the service URL, but it seems the client still tries to connect through the host name internally and throws the error below.

2023-03-06 20:29:47,080 Attempt to fetch cluster state from http://core-solrcloud-0.core-solrcloud-headless.default:8983/solr failed.

HoustonPutman commented 1 year ago

So there are three possible solutions here.

To be clear, you are using an ingress to make at least one endpoint available outside of your EKS cluster, correct?

panther999 commented 1 year ago

Thank you both. Combining both solutions worked for me. I used the solr-common-service base URL with Http2Client. Indexing now runs fine from outside the Kubernetes cluster.
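
For readers landing here later, the idea behind the fix is to send updates to one stable base URL instead of letting the client discover individual node host names. The sketch below is not the code used in this thread. To stay dependency-free it swaps SolrJ's HTTP client for the JDK's `java.net.http.HttpClient` and posts a JSON document to Solr's update handler. The base URL `http://solr.example.com/solr`, the class name `BaseUrlIndexer`, and the sample document are assumptions for illustration; only the collection name `search1` comes from the error message above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BaseUrlIndexer {
    // Builds a POST to <baseUrl>/<collection>/update?commit=true carrying a JSON array of documents.
    static HttpRequest buildUpdateRequest(String baseUrl, String collection, String jsonDocs) {
        // Drop a trailing slash so the path is not doubled up.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        URI uri = URI.create(base + "/" + collection + "/update?commit=true");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonDocs))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical externally reachable URL; replace with your own ingress or service address.
        String baseUrl = args.length > 0 ? args[0] : "http://solr.example.com/solr";
        HttpRequest request = buildUpdateRequest(baseUrl, "search1", "[{\"id\":\"doc-1\"}]");
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Solr responded with HTTP " + response.statusCode());
    }
}
```

Because every request goes to the one address the client was given, the cluster-internal pod host names never need to resolve on the indexing machine.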