jaegertracing / spark-dependencies

Spark job for dependency links
http://jaegertracing.io/
Apache License 2.0

Can not connect to the es node, only connect to the bridge ip address #40

Closed lxkaka closed 5 years ago

lxkaka commented 6 years ago

I executed the command docker run --rm --name spark-dependencies --env STORAGE=elasticsearch --env ES_NODES=http://172.31.60.138:9200 jaegertracing/spark-dependencies and the error log shows:

ERROR NetworkClient: Node [172.18.0.1:9200] failed (Connection refused (Connection refused)); no other nodes left - aborting...
Exception in thread "main" org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[172.18.0.1:9200]]

Is that a bug?

pavolloffay commented 6 years ago

The spark-dependencies container probably cannot see 172.18.0.1:9200. Is ES running as a Docker container on your host?

lxkaka commented 6 years ago

@pavolloffay ES is not running in a container, and the IP of ES is 172.31.60.138.

pavolloffay commented 6 years ago

172.18.0.1:9200 is what appears in the logs from the Spark job. Is it a single-node or multi-node ES?

lxkaka commented 6 years ago

@pavolloffay ES is running as a single node. The IP 172.18.0.1 is actually the Docker bridge IP address.

br-24285fa93a58 Link encap:Ethernet  HWaddr 02:42:ef:0c:77:6e
          inet addr:172.18.0.1  Bcast:172.18.255.255  Mask:255.255.0.0
          inet6 addr: fe80::42:efff:fe0c:776e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1266639 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1276180 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1208248573 (1.2 GB)  TX bytes:828010734 (828.0 MB)

pavolloffay commented 6 years ago

It seems like a networking/configuration issue in your environment.

prabhakm commented 5 years ago

I ran into this problem.

I'm running ES 6.5.4 outside Docker on my local machine, and docker run --env STORAGE=elasticsearch --env ES_NODES=http://localhost:9200 jaegertracing/spark-dependencies gives this error:

19/01/29 20:28:39 ERROR NetworkClient: Node [http://127.0.0.1:9200] failed (Connection refused (Connection refused)); no other nodes left - aborting...
Exception in thread "main" org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[http://127.0.0.1:9200]]
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:150)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:466)
    at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:474)
    at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:570)
    at org.elasticsearch.hadoop.rest.RestClient.indexExists(RestClient.java:565)
    at org.elasticsearch.hadoop.rest.InitializationUtils.checkIndexStatus(InitializationUtils.java:73)
    at org.elasticsearch.hadoop.rest.InitializationUtils.validateSettingsForReading(InitializationUtils.java:271)
    at org.elasticsearch.hadoop.rest.RestService.findPartitions(RestService.java:218)
    at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions$lzycompute(AbstractEsRDD.scala:73)
    at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions(AbstractEsRDD.scala:72)
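
Note: inside a container on the default bridge network, localhost resolves to the container itself, not to the host where ES is listening, which is why 127.0.0.1:9200 is refused. The next comment works around this with host networking; an alternative sketch that keeps the bridge network, assuming Docker Desktop or Docker Engine 20.10+ (host.docker.internal is not mentioned in this thread, just a standard Docker host alias):

# address the host through Docker's host alias instead of localhost;
# on Linux the host-gateway mapping must be added explicitly
docker run --rm \
  --add-host=host.docker.internal:host-gateway \
  --env STORAGE=elasticsearch \
  --env ES_NODES=http://host.docker.internal:9200 \
  jaegertracing/spark-dependencies
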
prabhakm commented 5 years ago

Setting the network with docker run --env STORAGE=elasticsearch --network=host --env ES_NODES=http://localhost:9200 jaegertracing/spark-dependencies gets past the above issue, but now I've hit a Spark issue:

19/01/29 20:57:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:397)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:178)
    at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:168)
    at io.jaegertracing.spark.dependencies.DependenciesSparkJob.run(DependenciesSparkJob.java:48)
    at io.jaegertracing.spark.dependencies.DependenciesSparkJob.main(DependenciesSparkJob.java:38)
Caused by: java.net.UnknownHostException: moby: moby: Name does not resolve
    at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
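
The "moby: Name does not resolve" failure happens because Spark calls InetAddress.getLocalHost() and the container's hostname (moby on Docker for Mac) has no resolvable entry. A possible workaround, not confirmed by this thread: set Spark's SPARK_LOCAL_IP environment variable so the hostname lookup is skipped, for example:

# SPARK_LOCAL_IP tells Spark which local address to use, avoiding the
# InetAddress.getLocalHost() call that fails on the unresolvable hostname
docker run --rm \
  --network=host \
  --env STORAGE=elasticsearch \
  --env ES_NODES=http://localhost:9200 \
  --env SPARK_LOCAL_IP=127.0.0.1 \
  jaegertracing/spark-dependencies
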
prabhakm commented 5 years ago

@yurishkuro - I was able to make this work. Can I submit a help doc?

yurishkuro commented 5 years ago

of course, please do

pavolloffay commented 5 years ago

@prabhakm what was the issue? Could you please post the fix here?

prabhakm commented 5 years ago

Sure. Will shortly share the details

brockoffdev commented 5 years ago

@prabhakm Could you provide information on how you solved this? I'm running into something similar in k8s.

croissong commented 5 years ago

@prabhakm I'm also running into this issue on kubernetes. Can you maybe just give a quick hint on how you solved it?

pavolloffay commented 5 years ago

In case @prabhakm or @brockoffdev does not respond, try setting these two properties (example invocation below):

* `ES_CLIENT_NODE_ONLY`: Set to true to disable Elasticsearch cluster `nodes.discovery` and enable `nodes.client.only`.
                         If your Elasticsearch cluster's data nodes only listen on the loopback IP, set this to true.
                         Defaults to false.
* `ES_NODES_WAN_ONLY`: Set to true to only use the values set in `ES_NODES`, for example if your
                       Elasticsearch cluster is in Docker. If you are using a cloud provider
                       such as AWS Elasticsearch, set this to true. Defaults to false.

And comment back. I will document this somewhere.
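
For reference, a minimal invocation combining the two flags with the image used throughout this thread; the ES_NODES value is illustrative and should point at your own cluster:

# ES_NODES_WAN_ONLY=true restricts the connector to the addresses in
# ES_NODES and skips cluster-node discovery, which is what usually fails
# when the data nodes advertise unreachable (e.g. loopback) addresses
docker run --rm \
  --env STORAGE=elasticsearch \
  --env ES_NODES=http://172.31.60.138:9200 \
  --env ES_NODES_WAN_ONLY=true \
  --env ES_CLIENT_NODE_ONLY=false \
  jaegertracing/spark-dependencies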

croissong commented 5 years ago

@pavolloffay that did the trick :)

ES_CLIENT_NODE_ONLY:  false
ES_NODES_WAN_ONLY:    true

solved it for us.

pavolloffay commented 5 years ago

@Croissong could you please describe your ES deployment? How many nodes, to which node are you connecting?

pavolloffay commented 5 years ago

This should be solved by https://github.com/jaegertracing/spark-dependencies/pull/79

croissong commented 5 years ago

@pavolloffay Single-node ES, deployed outside of the cluster, with ES_NODES set to an Azure internal network IP.

pavolloffay commented 5 years ago

thanks

I have made some changes; the latest image with them should be published. You can try it out.

pavolloffay commented 5 years ago

It basically sets WAN-only mode if ES_NODES is specified.

arj22 commented 4 years ago

@pavolloffay Using https://ELB_URL for Amazon ES attempts connections on port 9200 instead of 443 and fails. I had to use https://ELB_URL:443.

Note that the Jaeger collector works with https://ELB_URL alone.

pavolloffay commented 4 years ago

Yes, we know about this behaviour; it's always better to specify the ports explicitly.

pavolloffay commented 4 years ago

From the README in this repository (a usage example follows the quote):

* `ES_NODES`: A comma separated list of elasticsearch hosts advertising http. Defaults to
              localhost. Add port section if not listening on port 9200. Only one of these hosts
              needs to be available to fetch the remaining nodes in the cluster. It is
              recommended to set this to all the master nodes of the cluster. Use url format for
              SSL. For example, "https://yourhost:8888"
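
For a managed cluster behind an HTTPS load balancer (the Amazon ES case above), that means spelling out the port explicitly; ELB_URL is the thread's placeholder for your own endpoint:

# without an explicit port the connector assumes 9200, so append :443
docker run --rm \
  --env STORAGE=elasticsearch \
  --env ES_NODES=https://ELB_URL:443 \
  --env ES_NODES_WAN_ONLY=true \
  jaegertracing/spark-dependencies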