CSIRT-MU / Stream4Flow

A framework for real-time network traffic analysis based on world-leading technologies for distributed stream processing, network traffic monitoring, and visualization.
https://csirt.muni.cz/?lang=en
MIT License

Elasticsearch Error #88

Closed · zaxeqaz closed this issue 5 years ago

zaxeqaz commented 5 years ago

I have been trying to set up a clustered Stream4Flow setup.

On the consumer I keep getting the following error: Error: Elasticsearch query exception: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7fe24e5f69d0>: Failed to establish a new connection: [Errno 111] Connection refused) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7fe24e5f69d0>: Failed to establish a new connection: [Errno 111] Connection refused)

[screenshot of the error]

tomjirsa commented 5 years ago

Hi, it seems that the web interface cannot connect to the Elasticsearch DB. The issue might be the firewall settings: port 9200 used by Elasticsearch needs to be open. Another possibility is that the Elasticsearch service is down (check with service elasticsearch status).
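
For example, a quick check on the Elasticsearch host could look like this (a sketch assuming an Ubuntu host with ufw; adjust to your firewall):

$ service elasticsearch status
$ curl http://localhost:9200/        # should return the cluster info JSON
$ sudo ufw allow 9200/tcp            # only if the firewall is blocking the port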

zaxeqaz commented 5 years ago

Ah, I see the application protocols_statistics never actually completed. The sparkMaster was unable to reach itself on port 7077. I don't see anything listening on that port. Am I missing something?

tomjirsa commented 5 years ago

Please check if the Spark cluster is running: visit http://(sparkMaster IP address):8080/. Also check in the firewall that port 7077 is open; a blocked port might be the reason Spark did not start.
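
For instance, on sparkMaster (netstat is one common way to check listening ports; ss works as well):

$ curl http://(sparkMaster IP address):8080/
$ sudo netstat -tlnp | grep 7077     # the Spark master should be listening here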

In case Spark is down on sparkMaster, you need to:

  1. kill all Spark workers (can be done with killall java)
  2. start Spark on sparkMaster: <your Spark location, usually /opt/spark>/sbin/start-master.sh
  3. start Spark on the sparkWorkers: <Spark location>/spark-bin/sbin/start-slave.sh <SPARK_MASTER_URL, printed when the Spark master starts> -m <SPARK_WORKER_MEMORY>
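
As a concrete example, with illustrative addresses and paths (note that the master URL must include the spark:// scheme and the port, 7077 by default):

$ killall java
$ /opt/spark/spark-bin/sbin/start-master.sh
$ /opt/spark/spark-bin/sbin/start-slave.sh spark://192.168.66.73:7077 -m 2048M
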
zaxeqaz commented 5 years ago

Thanks for the response, tomjirsa. I tried to start the SparkWorker, but it keeps failing with this error:

Spark Command: /usr/lib/jvm/java-8-oracle/jre/bin/java -cp /opt/spark/spark-bin/conf/:/opt/spark/spark-bin/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 192.168.66.73 -m 2048M

18/11/23 18:36:24 WARN Utils: Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.66.74 instead (on interface ens160)
18/11/23 18:36:24 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
18/11/23 18:36:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.spark.SparkException: Invalid master URL: spark://192.168.66.73
        at org.apache.spark.util.Utils$.extractHostPortFromSparkUrl(Utils.scala:2376)
        at org.apache.spark.rpc.RpcAddress$.fromSparkURL(RpcAddress.scala:47)
        at org.apache.spark.deploy.worker.Worker$$anonfun$13.apply(Worker.scala:716)
        at org.apache.spark.deploy.worker.Worker$$anonfun$13.apply(Worker.scala:716)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
        at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:716)
        at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:696)
        at org.apache.spark.deploy.worker.Worker.main(Worker.scala)

tomjirsa commented 5 years ago

Hi, I forgot to tell you that you have to set the following environment variables on all Spark machines before you run the Spark master/slave:

$ export SPARK_MASTER_HOST="<sparkMaster IP address>"
$ export SPARK_LOCAL_HOST="<IP address of the machine>"

This should solve the error above.
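
For example, with the addresses from your log (a sketch; note the warning in the log refers to SPARK_LOCAL_IP, and the "Invalid master URL" error suggests the master URL also needs the spark:// scheme and port, 7077 by default):

$ export SPARK_MASTER_HOST="192.168.66.73"    # on sparkMaster
$ export SPARK_LOCAL_IP="192.168.66.74"       # on this worker
$ /opt/spark/spark-bin/sbin/start-slave.sh spark://192.168.66.73:7077 -m 2048M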

zaxeqaz commented 5 years ago

Hello,

Thank you very much for all the help. I reviewed the servers and half the processes were trying to run on the IPv6 address. So I disabled it and reinstalled everything. It seems like it is a lot better than before, but my Master still can't reach the producer. The processes on the producer are running on the loopback.
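
For reference, one common way to disable IPv6 on Ubuntu is via sysctl (a sketch; the keys below are the standard ones):

$ sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
$ sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
# add the same keys to /etc/sysctl.conf to make the change persistent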

[screenshot]

Thank you

cermmik commented 5 years ago

Hello, Kafka and Zookeeper are installed in a common way and listen by default on all interfaces. It seems that your producer has some settings that affect this, and I cannot determine what it is. You can try to manually set the interface in the Zookeeper and Kafka configuration (/opt/kafka/config/). For example, you may set clientPortAddress in zookeeper.properties to choose the interface Zookeeper listens on.
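
A minimal sketch of such a configuration (the producer address 192.168.66.75 is illustrative):

# /opt/kafka/config/zookeeper.properties
clientPortAddress=192.168.66.75
clientPort=2181

# /opt/kafka/config/server.properties (Kafka listening socket)
listeners=PLAINTEXT://192.168.66.75:9092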

zaxeqaz commented 5 years ago

Hello cermmik,

I have made those changes, but my Master is still unable to connect to or access the Producer. I checked the hosts file and it has the correct IP. I checked the Producer and the port is open.

[screenshot]

Producer netstat: [screenshot]

cermmik commented 5 years ago

Are you able to establish any type of connection between these two nodes? (Maybe there are some firewall restrictions in your cloud or hosts?)
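
For example, from the Master (the ports are the Kafka and Zookeeper defaults; substitute your producer's address):

$ ping <producer IP address>
$ telnet <producer IP address> 9092
$ nc -vz <producer IP address> 2181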

zaxeqaz commented 5 years ago

The firewall was disabled. I'm still getting this warning.

[screenshot of the warning]

cermmik commented 5 years ago

It looks like you need to set:

$ export SPARK_MASTER_IP="<sparkMaster IP address>"
$ export SPARK_LOCAL_IP="<IP address of the machine>"

Also, please check that your /etc/hosts contains the address of the producer.
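
For example, an /etc/hosts entry could look like this (address and hostname are illustrative):

192.168.66.75   producer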

zaxeqaz commented 5 years ago

Hello cermmik,

Thank you for your help. This is the Spark master trying to run the example application protocols_statistics. I exported the vars, checked the hosts file, and tried to run the script using the IP address instead of the hostname. I was able to connect to the port using telnet, but I am still getting the error when running the script.

Thank you

zaxeqaz commented 5 years ago

Would it be possible to get the specs of the test setup you guys are using?

cermmik commented 5 years ago

For testing and development, we use the simple Vagrant configuration as it is in the repository. In production, we use virtual servers with a clean Ubuntu 16.04.4 LTS.