docker-solr / docker-solr-examples

Examples for Docker-Solr
Apache License 2.0

Update example to demonstrate using Hostname? #5

Open epugh opened 4 years ago

epugh commented 4 years ago

I tried out the demo at https://github.com/docker-solr/docker-solr-examples/blob/master/docker-compose/docker-compose.yml hoping it would help me with my SOLR_HOST difficulties.

The issue with Docker is that the hosts are all internal to the network, not externally accessible. Which means features like the JDBC access to Solr fail, because the Solr client can't access those internal addresses from an external location.

I was hoping this example could demonstrate how to use a SOLR_HOST property to use the externally addressable address.

epugh commented 4 years ago

[Image: Screenshot at Mar 28 08-22-31]

epugh commented 4 years ago

I confirmed that with SOLR_HOST set as below, the server starts up fine:

  solr1:
    image: solr:8.4
    container_name: solr1
    ports:
     - "8981:8981"
    environment:
      - ZK_HOST=zoo1:2181,zoo2:2181,zoo3:2181
      - SOLR_PORT=8981
      - SOLR_HOST=192.241.154.248
    networks:
      - solr
    depends_on:
      - zoo1
      - zoo2
      - zoo3

However, you get this stack trace when you try to create a collection, or otherwise communicate within the cluster:

Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to 192.241.154.248:8983 [/192.241.154.248] failed: connect timed out
solr3    |  at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151) ~[?:?]
solr3    |  at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373) ~[?:?]
solr3    |  at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:394) ~[?:?]
solr3    |  at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237) ~[?:?]
solr3    |  at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[?:?]
solr3    |  at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[?:?]
solr3    |  at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[?:?]
solr3    |  at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[?:?]

epugh commented 4 years ago

If I jump onto one of the Solr nodes, I can do curl http://www.google.com, but attempting to curl the SOLR_HOST fails:

solr@fbe655d4d522:/opt/solr-8.4.1$ echo $SOLR_HOST
192.241.154.248
solr@fbe655d4d522:/opt/solr-8.4.1$ curl http://192.241.154.248   <-- times out

epugh commented 4 years ago

After much messing around with network_mode: host, I could get a version where the container could use $SOLR_HOST internally, but then I couldn't reach it externally! I'd love some ideas.

makuk66 commented 4 years ago

> The issue with Docker is that the hosts are all internal to the network, not externally accessible. Which means features like the JDBC access to Solr fail, because the Solr client can't access those internal addresses from an external location.

That's what the ports are for. For example, this line https://github.com/docker-solr/docker-solr-examples/blob/master/docker-compose/docker-compose.yml#L16 has:

    ports:
     - "8981:8983"

which opens port 8981 on the host, to forward to port 8983 on the container named "solr1".

So, if you want to run your JDBC tool on the host, just tell it to talk to Solr on http://ip-of-host:8981/.

Or am I missing some other requirements of your JDBC access?

epugh commented 4 years ago

Thanks @makuk66 for weighing in. You are right about the "run your JDBC tool" instruction! However, what I am finding is that with the Docker setup, if I provide a SOLR_HOST=ip-of-host then the Solr container can't run any HTTP commands that access the ip-of-host address! For example, if you try to create a collection, Solr tries to make a request to http://ip-of-host:8981/solr/admin/collections/create?name=mycollection, and due to networking, it can't make the call. If you ssh into the Solr node and try to do the same command: curl http://ip-of-host:8981/solr/admin/collections/create?name=mycollection you get a timeout.

However, you can do it externally!

What I can't quite figure out is how to make ip-of-host an address that is reachable both from outside the host AND from inside the containers on the host! I can get one or the other.

I've thought about trying to modify the CloudSolrClient class to have some sort of mapping file for "internal_ip => external_ip" but that would be a big change. I'd much rather just have one set of IPs that work both inside the Docker host and externally!

Is that clearer?

makuk66 commented 4 years ago

> if I provide a SOLR_HOST=ip-of-host

If I recall correctly, the SOLR_HOST variable is used to determine the address of the node when it registers with ZooKeeper; in which case changing that seems unwise.

> If you ssh into the Solr node and try to do the same command: curl http://ip-of-host:8981/solr/admin/collections/create?name=mycollection you get a timeout.

Are you running on MacOS or Linux? Some details differ. On MacOS you can do:

docker run --name epugh1 -d -p 8988:8983 solr
docker exec -it epugh1 curl http://host.docker.internal:8988/solr/

I'm still not clear what you are actually trying to do. You want to use JDBC -- where is your DB running; on the host or in a container or somewhere else? And how are you using JDBC? Are you wanting to push into Solr from the host (in which case, use the forwarded ports), or do you want to push into Solr from the containers (in which case, use the container name solr1 and port 8983), or do you want to have Solr pull from the DB (in which case see https://github.com/docker-solr/docker-solr-examples/tree/master/dih-postgres)? Or are you using a solrcloud-aware client that talks to ZK and individual Solr servers, in which case run that in a container.
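
For the last case (a solrcloud-aware client run in a container), here is a rough sketch of what that could look like as an extra service in the same compose file. The service name `solr-client`, the image `my-solrcloud-client:latest`, and the idea that the client reads a `ZK_HOST` environment variable are all hypothetical placeholders; only the `solr` network and the ZooKeeper addresses come from the compose example discussed above.

```yaml
  # Hypothetical client service, added under the existing `services:` key.
  solr-client:
    image: my-solrcloud-client:latest   # placeholder image, not a real one
    environment:
      # Assumes the client reads its ZooKeeper ensemble from this variable.
      - ZK_HOST=zoo1:2181,zoo2:2181,zoo3:2181
    networks:
      - solr            # same network as the Solr and ZooKeeper containers
    depends_on:
      - solr1
```

Because the client sits on the same Docker network, the node addresses that Solr registers in ZooKeeper should be reachable from it without changing SOLR_HOST.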

epugh commented 4 years ago

Okay, I think when I say "JDBC" I'm confusing things, my apologies!

So, I am trying to use the Solr JDBC driver (https://lucene.apache.org/solr/guide/8_4/solr-jdbc-apache-zeppelin.html) in Zeppelin with my Solr Cloud setup. I should have said "solrcloud-aware" client ;-)

I figured out how to make the DNS name the same from both inside the container and from outside:

  solr8:
    image: harbor.dev.o19s.com/quepid-docker/solr8
    ports:
      - "8985:8985"
    environment:
      - "SOLR_PORT=8985"
      - "SOLR_HOST=quepid-solr.dev.o19s.com"
    extra_hosts:
      - "quepid-solr.dev.o19s.com:127.0.0.1"

This is some hackery: when the container tries to access quepid-solr.dev.o19s.com, the name resolves to 127.0.0.1, but when my solrcloud-aware client tries to access quepid-solr.dev.o19s.com from outside, DNS resolves it to the host's address, 192.241.154.248.

Not sure if this is something more generalizable...

zhangddjs commented 4 years ago

So smart!

zhangddjs commented 4 years ago

Hello, when I used this config and added a collection, I got a timeout exception (or "connection refused") because the connection to hostname:other_node_port failed. Then I configured it as follows and it succeeded:

  solr4:
    image: solr:7.5
    container_name: solr4
    ports:
     - "30008:30008"
    environment:
      - ZK_HOST=solr-zk1:2181,solr-zk2:2181,solr-zk3:2181
      - SOLR_PORT=30008
      - SOLR_HOST=dev.covfefe.com
    extra_hosts:
      - dev.covfefe.com:10.100.13.173
    networks:
      - solr-cloud
    depends_on:
      - solr-zk1
      - solr-zk2
      - solr-zk3

And now I can connect to SolrCloud with SolrJ. Thanks.
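
A possible way to avoid hard-coding the host IP in `extra_hosts`, offered as an untested sketch rather than something verified in this thread: Docker Engine 20.10 and later accept the special value `host-gateway`, which resolves to the Docker host's gateway address as seen from the container. The hostname `solr.example.com` is a placeholder and must separately resolve to the host's public address for external clients. Whether the published port is reachable through the gateway address depends on the Docker network setup.

```yaml
  solr1:
    image: solr:8.4
    ports:
      - "8981:8981"
    environment:
      - ZK_HOST=zoo1:2181,zoo2:2181,zoo3:2181
      - SOLR_PORT=8981
      - SOLR_HOST=solr.example.com      # placeholder hostname
    extra_hosts:
      # Inside the container the name points at the Docker host, so
      # requests to other nodes' published ports go via the host.
      - "solr.example.com:host-gateway"
    networks:
      - solr
```

Each Solr node would need the same `extra_hosts` entry and its own published port, matching its SOLR_PORT.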