allegro / hermes

Fast and reliable message broker built on top of Kafka.
http://hermes.allegro.tech

Timeout while posting http message #411

Closed hikrrish closed 8 years ago

hikrrish commented 8 years ago

I am getting an async timeout while posting a message to a topic.

Exact error message : { "message": "Async timeout, cause: unknown", "code": "TIMEOUT" }

Headers:
Content-Length: 60
Content-Type: application/json;charset=ISO-8859-1
Date: Thu, 17 Mar 2016 21:46:38 GMT
Hermes-Message-Id: 9ca198b8-29e5-4123-8781-34cc787bf188

Frontend server log

2016-03-17 21:46:38.453 ERROR p.a.t.h.f.publishing.HttpResponder - Async timeout, cause: unknown, publishing on topic com.krishna.cash, remote host 10.81.90.115, message state SENDING_TO_KAFKA_PRODUCER_QUEUE
2016-03-17 21:46:38.571 ERROR p.a.t.h.f.publishing.HttpResponder - Async timeout, cause: unknown, publishing on topic com.krishna.cash, remote host 10.81.90.115, message state SENDING_TO_KAFKA_PRODUCER_QUEUE

I have manually tested my Kafka cluster and it is healthy.

As soon as I create a topic, I can see that a directory is created in Kafka, as shown below, and Hermes management is working fine.

drwx------ 2 kafka kafka 4096 Mar 17 21:39 com.krishna.cash-1
drwx------ 2 kafka kafka 4096 Mar 17 21:39 com.krishna.cash-4
drwx------ 2 kafka kafka 4096 Mar 17 21:39 com.krishna.cash-7
drwx------ 2 kafka kafka 4096 Mar 17 21:37 com.krishna.cash-1
drwx------ 2 kafka kafka 4096 Mar 17 20:52 com.krishna.cash-4
drwx------ 2 kafka kafka 4096 Mar 17 20:52 com.krishna.cash-7

Note: my kafka-zookeeper and hermes zookeeper are separate clusters

adamdubiel commented 8 years ago

This probably means that you failed to transmit the whole message to Frontend within 65 milliseconds. If the cause were long Kafka response times, you would receive a 202 status (http://hermes-pubsub.readthedocs.org/en/latest/user/publishing/#response-codes). Are you sending it from a local network to AWS or something similar?
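
To illustrate the distinction drawn above, here is a minimal sketch that maps a Frontend response to a likely explanation. The function name and the wording of the explanations are hypothetical and not part of Hermes; the 201 and 202 statuses and the `TIMEOUT` error code are the ones that appear in this thread and on the linked response-codes page.

```python
import json


def explain_publish_response(status, body=""):
    """Guess what a Hermes Frontend publish response means.

    Hypothetical helper for illustration only, not a Hermes API.
    """
    if status == 201:
        return "published: Kafka acknowledged the message"
    if status == 202:
        return "accepted: Frontend took the message but Kafka was slow to respond"
    # For anything else, look for the error code in the JSON body.
    try:
        code = json.loads(body).get("code")
    except (ValueError, TypeError, AttributeError):
        code = None
    if code == "TIMEOUT":
        return "timeout: the message was probably not fully received by Frontend in time"
    return "unrecognized response (status %d)" % status
```

Because the HTTP status that accompanies the `TIMEOUT` body is not shown in this thread, the sketch checks the body's `code` field rather than assuming a particular status.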

hikrrish commented 8 years ago

Hi,

I am setting up all components on AWS. I simply tried using the same Zookeeper cluster for Hermes that I use for Kafka, and the system started working. I think there is no limitation that requires using the same Zookeeper cluster, so something is probably wrong with the Hermes Zookeeper cluster, even though it worked perfectly fine with Hermes management.

adamdubiel commented 8 years ago

No, there is no such limitation. Our production cluster has separate Kafka and Zookeeper and works just fine. Some more debugging might shed light on what is happening there. It might be that Hermes Frontend could not connect to Zookeeper and did not find the topic definition.

hikrrish commented 8 years ago

OK, many thanks. Can I specify multiple Zookeeper instances in zookeeper.connect.string and kafka.zookeeper.connect.string? E.g.

zookeeper.connect.string=ec1:2181,ec2:2181,ec3:2181

and similarly for the Kafka Zookeeper. I am asking because the documentation does not say that a list is supported, and Zookeeper does not support a load balancer, so if I use a ZK cluster I need to list all of its nodes.

hikrrish commented 8 years ago

One more issue: publishing a message always returns 202 and never 201 Created, with no errors. The Kafka and Zookeeper clusters have been tested manually and are fine; no errors are showing up.

adamdubiel commented 8 years ago

Ad 1) We used the ZK name for this property (connection string) to indicate that it is in Zookeeper format, so it can include a list of hosts with ports and even a prefix (e.g. zk1:2181/my/prefix).

Ad 2) A 202 is usually nothing bad unless you get it for all your messages. If you do, check whether the connection to Kafka is really working and whether a subscriber receives the messages you posted.
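
As an illustration of the connection string format described in Ad 1, the following sketch splits such a string into its host/port pairs and optional prefix. The function name is hypothetical, the fallback port 2181 is the usual Zookeeper client default, and IPv6 literals are deliberately not handled.

```python
def parse_zk_connect_string(connect_string, default_port=2181):
    """Split a Zookeeper connection string into (hosts, prefix).

    Hypothetical helper for illustration; hosts is a list of (host, port)
    tuples and prefix is "" when no path suffix is present.
    """
    hosts_part, slash, path = connect_string.strip().partition("/")
    prefix = "/" + path if slash else ""
    hosts = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, colon, port = entry.rpartition(":")
        if colon:
            hosts.append((host, int(port)))
        else:
            # No explicit port given, fall back to the default.
            hosts.append((entry, default_port))
    if not hosts:
        raise ValueError("no hosts in connection string: %r" % connect_string)
    return hosts, prefix
```

With this sketch, `parse_zk_connect_string("ec1:2181,ec2:2181,ec3:2181")` yields three host/port pairs and an empty prefix, while `parse_zk_connect_string("zk1:2181/my/prefix")` yields one pair and the prefix `/my/prefix`.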

hikrrish commented 8 years ago

Hi Adam,

Here is the exception printed in Kafka; 10.81.87.248 is my Hermes frontend. The size of the packet received from Hermes is negative, and Kafka discards it.

Kafka version: kafka_2.11-0.9.0.1.tgz
Hermes: hermes-frontend-0.8.5-hotfix2.zip

[2016-03-21 01:13:56,447] WARN Unexpected error from /10.81.87.248; closing connection (org.apache.kafka.common.network.Selector)
org.apache.kafka.common.network.InvalidReceiveException: Invalid receive (size = -720899)
    at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:89)
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:153)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:134)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:286)
    at kafka.network.Processor.run(SocketServer.scala:413)
    at java.lang.Thread.run(Thread.java:745)
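
For context on `Invalid receive (size = -720899)`: a Kafka broker reads a 4-byte big-endian signed length before each request, so a negative value suggests the first bytes on that connection were not a valid Kafka request frame. The sketch below (the helper name is hypothetical) only recovers the raw bytes behind the logged number; it does not say where they came from.

```python
import struct


def size_prefix_bytes(size):
    """Return the 4 bytes that decode to `size` as a big-endian signed int32."""
    return struct.pack(">i", size)


raw = size_prefix_bytes(-720899)
print(raw.hex())  # fff4fffd
```

The bytes `ff f4 ff fd` happen to match the Telnet sequence IAC IP, IAC DO, which a Telnet client typically sends on Ctrl-C. Whether that is what happened here is only a guess, but it may be worth ruling out manual connectivity checks from that host before suspecting Hermes itself.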

hikrrish commented 8 years ago

I tried with Kafka 0.8.2.2 and still face the same issue: I get 202 while posting a message.

adamdubiel commented 8 years ago

Do you also see the same output in Kafka logs? What is your Kafka configuration in Hermes?

hikrrish commented 8 years ago

Here is the configuration for frontend and consumers. I created a single file and run them using

export HERMES_CONSUMERS_OPTS="-Darchaius.configurationSource.additionalUrls=file:///opt/hermes/conf/hermes-frontend.properties"
export HERMES_FRONTEND_OPTS="-Darchaius.configurationSource.additionalUrls=file:///opt/hermes/conf/hermes-frontend.properties"

zookeeper.connect.string=zk1:2181,zk2:2181,zk3:2181
zookeeper.connection.timeout=10000
zookeeper.max.retries=20
zookeeper.base.sleep.time=1000
zookeeper.root.storage.pathPrefix=/hermes
zookeeper.cache.thread.pool.size=5
zookeeper.authorization.enabled=false
zookeeper.authorization.password=password
kafka.broker.list=kf1:9092,kf2:9092,kf3:9092
kafka.zookeeper.connect.string=kfz1:2181,kfz2:2181,kfz3:2181

For management, I created a YAML file as below:

kafka:
  clusters:
    - clusterName:
      connectionString: kfz1:2181,kfz2:2181,kfz3:2181

storage:
  connectionString: zk1:2181,zk2:2181,zk3:2181
  connectionTimeout: 3000

export HERMES_MANAGEMENT_OPTS="-Dspring.config.location=/opt/hermes/conf/hermes-mgmt.yml"

The Hermes metadata ZK and the Kafka metadata ZK use separate IPs.

Can you please check if anything is wrong?

hikrrish commented 8 years ago

I don't see any WARN/ERROR logs in the Hermes applications.

adamdubiel commented 8 years ago

Can you paste Frontend logs from the startup somewhere? (e.g. http://pastebin.com/) I see nothing wrong with the configuration. I think there must be some issue with Kafka connectivity and those messages spin in the buffer.

gamefundas commented 8 years ago

Somewhat related to #410. A configuration issue for the most part on our side with ZK and Kafka. This is resolved. The issue can be closed.

hikrrish commented 8 years ago

Please close this issue; we have the complete setup running on AWS now.

adamdubiel commented 8 years ago

Thanks! Glad to hear everything is up and running :)