Open gs-offcl opened 4 years ago
Did you solve this issue? Did it block implementation?
Yes, it's resolved.
How was it resolved?
The issue was that the hostname was not getting resolved in the startup script. I used POD_NAME from an env var instead, and it started working.
if [ ! -f $ZOO_DATA_DIR/myid ] ; then
  $(echo $((${HOSTNAME##*-}+1)) > $ZOO_DATA_DIR/myid )
else
  touch /conf/test;
fi && \
$(echo $ZOO_SERVERS | sed \"s/$MY_POD_NAME.zkensemble/0.0.0.0/g\" > /conf/zooservers.txt) && \
....
env:
  - name: MY_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
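For anyone adapting this, here is a minimal sketch of deriving the ZooKeeper myid from the pod name rather than from the `hostname` command. The function name `myid_from_pod_name` is hypothetical and not part of the original script; it assumes StatefulSet-style pod names that end in `-<ordinal>` (for example `zk-2`), which is what the `${HOSTNAME##*-}` expansion above relies on.

```shell
#!/bin/sh
# Hypothetical helper: derive a 1-based ZooKeeper myid from a pod name
# such as "zk-2". Assumes the name ends in "-<ordinal>".
myid_from_pod_name() {
  # Strip everything up to and including the last "-".
  ordinal="${1##*-}"
  case "$ordinal" in
    # Reject empty or non-numeric suffixes instead of writing a bad myid.
    ''|*[!0-9]*) echo "cannot derive ordinal from '$1'" >&2; return 1 ;;
  esac
  # StatefulSet ordinals start at 0, so add 1 to get the myid.
  echo $((ordinal + 1))
}

# Example: a pod named zk-2 gets myid 3.
myid_from_pod_name "zk-2"
```

In the real startup script the argument would be "$MY_POD_NAME" (populated from metadata.name as in the env entry above) and the output would be redirected to $ZOO_DATA_DIR/myid.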
@gs-offcl thanks for the info. Just a question: is there a typo in "s/$MY_POD_NAME.zkensemble/0.0.0.0/g"? Given your comment, I suppose the correct line should be "s/$POD_NAME.zkensemble/0.0.0.0/g", without MY_. Right?
I have encountered the following issues while trying to set up a SolrCloud and ZooKeeper cluster on a multi-node Kubernetes cluster.
These are the scenarios I tried:
Scenario 1 - As is with public docker images (solr, zookeeper) on cluster
Steps:
Issues:
zookeeper.log
java.net.UnknownHostException: zk-2.zkensemble
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
**Solr logs: no errors were thrown, and I could see that Solr is able to connect to the ZooKeeper cluster**
2020-04-28 09:49:17.022 INFO (main) [ ] o.a.s.c.SolrResourceLoader [null] Added 0 libs to classloader, from paths: []
2020-04-28 09:49:17.255 INFO (main) [ ] o.a.s.h.c.HttpShardHandlerFactory Host whitelist initialized: WhitelistHostChecker [whitelistHosts=null, whitelistHostCheckingEnabled=true]
2020-04-28 09:49:17.553 WARN (main) [ ] o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for SslContextFactory@ac20bb4[provider=null,keyStore=null,trustStore=null]
2020-04-28 09:49:17.742 WARN (main) [ ] o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for SslContextFactory@63c12e52[provider=null,keyStore=null,trustStore=null]
2020-04-28 09:49:17.763 INFO (main) [ ] o.a.s.c.ZkContainer Zookeeper client=zk-0.zkensemble:2181,zk-1.zkensemble:2181,zk-2.zkensemble:2181
2020-04-28 09:49:17.825 INFO (zkConnectionManagerCallback-9-thread-1) [ ] o.a.s.c.c.ConnectionManager zkClient has connected
2020-04-28 09:49:20.075 INFO (main) [ ] o.a.s.c.OverseerElectionContext I am going to be the leader solr-0.solrcluster:8983_solr
2020-04-28 09:49:20.108 INFO (main) [ ] o.a.s.c.Overseer Overseer (id=145194837495316480-solr-0.solrcluster:8983_solr-n_0000000000) starting
2020-04-28 09:49:20.272 INFO (zkConnectionManagerCallback-16-thread-1) [ ] o.a.s.c.c.ConnectionManager zkClient has connected
2020-04-28 09:49:20.300 INFO (main) [ ] o.a.s.c.s.i.ZkClientClusterStateProvider Cluster at zk-0.zkensemble:2181,zk-1.zkensemble:2181,zk-2.zkensemble:2181 ready
2020-04-28 09:49:20.434 INFO (main) [ ] o.a.s.c.ZkController Register node as live in ZooKeeper:/live_nodes/solr-0.solrcluster:8983_solr
2020-04-28 09:49:20.440 INFO (OverseerStateUpdate-145194837495316480-solr-0.solrcluster:8983_solr-n_0000000000) [ ] o.a.s.c.Overseer Starting to work on the main queue : solr-0.solrcluster:
Scenario 2 - Have rebuilt docker images (solr, zookeeper) using RHEL as base OS and deployed on K8s cluster
Zookeeper logs :
2020-04-28 10:18:05,913 [myid:] - INFO [main:QuorumPeerConfig@136] - Reading configuration from: /conf/zoo.cfg
2020-04-28 10:18:05,944 [myid:] - WARN [main:QuorumPeer$QuorumServer@191] - Failed to resolve address: zk-20.0.0.0
java.net.UnknownHostException: zk-20.0.0.0: Name or service not known
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getByName(InetAddress.java:1077)
at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:181)
at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.<init>(QuorumPeer.java:153)
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:240)
java.net.UnknownHostException: zk-10.0.0.0: Name or service not known
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getByName(InetAddress.java:1077)
Container logs:
/bin/sh: hostname: command not found
The above error seems to occur while resolving the hostname and replacing the string during pod creation in the StatefulSet.
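A minimal sketch of how a missing `hostname` binary could produce the `zk-20.0.0.0` and `zk-10.0.0.0` addresses in the ZooKeeper log. It assumes the startup script builds its sed pattern from the output of `hostname`, and the `ZOO_SERVERS` value is hypothetical since the real one is not shown here. When the name is empty, the pattern collapses to `.zkensemble`, and because `.` matches any character in a regular expression, every server entry is rewritten instead of only the pod's own.

```shell
#!/bin/sh
# Hypothetical server list; the real ZOO_SERVERS value is not shown in this thread.
ZOO_SERVERS="server.1=zk-0.zkensemble:2888:3888 server.2=zk-1.zkensemble:2888:3888 server.3=zk-2.zkensemble:2888:3888"

# Simulate "hostname: command not found": the substitution yields an empty string,
# so the sed pattern becomes ".zkensemble" and matches every entry.
EMPTY_NAME=""
broken=$(echo "$ZOO_SERVERS" | sed "s/$EMPTY_NAME.zkensemble/0.0.0.0/g")

# With the pod name available (for example from metadata.name),
# only this pod's own entry is rewritten to 0.0.0.0.
MY_POD_NAME="zk-2"
fixed=$(echo "$ZOO_SERVERS" | sed "s/$MY_POD_NAME.zkensemble/0.0.0.0/g")

echo "broken: $broken"
echo "fixed:  $fixed"
```

The broken line contains `zk-00.0.0.0`, `zk-10.0.0.0` and `zk-20.0.0.0`, while the fixed line keeps `zk-0.zkensemble` and `zk-1.zkensemble` intact and replaces only the third entry with `0.0.0.0`.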
Solr logs
Caused by: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper zk-0.zkensemble:2181,zk-1.zkensemble:2181,zk-2.zkensemble:2181 within 30000 ms
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:201)
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:126)
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:121)
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:108)
at org.apache.solr.servlet.SolrDispatchFilter.loadNodeConfig(SolrDispatchFilter.java:273)
... 50 more
Caused by: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper zk-0.zkensemble:2181,zk-1.zkensemble:2181,zk-2.zkensemble:2181 within 30000 ms
at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:250)
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:193)
... 54 more
Looking forward to any help. Do let me know if you need any other details.