Skunnyk opened 4 months ago
Here is a trace from a run where validate works (it succeeds once in a while, seemingly at random); we can see that everything is fine on the rdkafka side :thinking:
2024-04-24T10:09:24.856154Z DEBUG vector::app: Internal log rate limit configured. internal_log_rate_secs=10
2024-04-24T10:09:24.856210Z INFO vector::app: Log level is enabled. level="trace"
2024-04-24T10:09:24.856264Z DEBUG vector::app: messaged="Building runtime." worker_threads=4
2024-04-24T10:09:24.856354Z TRACE mio::poll: registering event source with poller: token=Token(1), interests=READABLE
√ Loaded ["/etc/vector/vector.yaml"]
2024-04-24T10:09:24.860493Z DEBUG vector::topology::builder: Building new source. component=kafka
2024-04-24T10:09:24.861738Z TRACE source{component_kind="source" component_id=kafka component_type=kafka}: rdkafka::client: Create new librdkafka client 0x7f28d6141400
2024-04-24T10:09:24.861797Z TRACE rdkafka::consumer::stream_consumer: Starting stream consumer wake loop: 0x7f28d6141400
2024-04-24T10:09:24.861885Z TRACE source{component_kind="source" component_id=kafka component_type=kafka}: rdkafka::util: Destroying topic partition list: 0x7f28d63f0530
2024-04-24T10:09:24.861906Z TRACE source{component_kind="source" component_id=kafka component_type=kafka}: rdkafka::util: Destroyed topic partition list: 0x7f28d63f0530
2024-04-24T10:09:24.861938Z DEBUG vector::topology::builder: Building new sink. component=es
2024-04-24T10:09:24.869500Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}: vector_core::tls::settings: Fetching system root certs.
2024-04-24T10:09:24.877121Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}: vector_core::tls::settings: Fetching system root certs.
2024-04-24T10:09:24.883296Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: vector::internal_events::http_client: Sending HTTP request. uri=http://elasticsearch:9200/ method=GET version=HTTP/1.1 headers={"user-agent": "Vector/0.37.1 (x86_64-unknown-linux-gnu cb6635a 2024-04-09 13:45:06.561412437)", "accept-encoding": "identity"} body=[empty]
2024-04-24T10:09:24.883361Z TRACE sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: hyper::client::pool: checkout waiting for idle connection: ("http", elasticsearch:9200)
2024-04-24T10:09:24.883414Z TRACE sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: hyper::client::connect::http: Http::connect; scheme=Some("http"), host=Some("elasticsearch"), port=Some(Port(9200))
2024-04-24T10:09:24.884344Z DEBUG hyper::client::connect::dns: resolving host="elasticsearch"
2024-04-24T10:09:24.886658Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: hyper::client::connect::http: connecting to 10.20.0.100:9200
2024-04-24T10:09:24.886716Z TRACE sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: mio::poll: registering event source with poller: token=Token(139813365213824), interests=READABLE | WRITABLE
2024-04-24T10:09:24.887021Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: hyper::client::connect::http: connected to 10.20.0.100:9200
2024-04-24T10:09:24.887047Z TRACE sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: hyper::client::conn: client handshake Http1
2024-04-24T10:09:24.887061Z TRACE sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: hyper::client::client: handshake complete, spawning background dispatcher task
2024-04-24T10:09:24.887128Z TRACE sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: hyper::client::pool: checkout dropped for ("http", elasticsearch:9200)
2024-04-24T10:09:24.887138Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Busy }
2024-04-24T10:09:24.887201Z TRACE encode_headers: hyper::proto::h1::role: Client::encode method=GET, body=None
2024-04-24T10:09:24.887244Z DEBUG hyper::proto::h1::io: flushed 185 bytes
2024-04-24T10:09:24.887257Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: KeepAlive, keep_alive: Busy }
2024-04-24T10:09:24.889167Z TRACE hyper::proto::h1::conn: Conn::read_head
2024-04-24T10:09:24.889189Z TRACE hyper::proto::h1::io: received 653 bytes
2024-04-24T10:09:24.889221Z TRACE parse_headers: hyper::proto::h1::role: Response.parse bytes=653
2024-04-24T10:09:24.889241Z TRACE parse_headers: hyper::proto::h1::role: Response.parse Complete(121)
2024-04-24T10:09:24.889258Z DEBUG hyper::proto::h1::io: parsed 3 headers
2024-04-24T10:09:24.889269Z DEBUG hyper::proto::h1::conn: incoming body is content-length (532 bytes)
2024-04-24T10:09:24.889296Z TRACE hyper::proto::h1::decode: decode; state=Length(532)
2024-04-24T10:09:24.889346Z DEBUG hyper::proto::h1::conn: incoming body completed
2024-04-24T10:09:24.889361Z TRACE hyper::proto::h1::conn: maybe_notify; read_from_io blocked
2024-04-24T10:09:24.889326Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}:http: vector::internal_events::http_client: HTTP response. status=200 OK version=HTTP/1.1 headers={"x-elastic-product": "Elasticsearch", "content-type": "application/json; charset=UTF-8", "content-length": "532"} body=[532 bytes]
2024-04-24T10:09:24.889382Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
2024-04-24T10:09:24.890196Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
2024-04-24T10:09:24.890212Z TRACE hyper::client::pool: pool dropped, dropping pooled (("http", elasticsearch:9200))
2024-04-24T10:09:24.890232Z TRACE hyper::proto::h1::dispatch: client tx closed
2024-04-24T10:09:24.890241Z TRACE hyper::proto::h1::conn: State::close_read()
2024-04-24T10:09:24.890249Z TRACE hyper::proto::h1::conn: State::close_write()
2024-04-24T10:09:24.890261Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Closed, writing: Closed, keep_alive: Disabled }
2024-04-24T10:09:24.890289Z TRACE hyper::proto::h1::conn: shut down IO complete
2024-04-24T10:09:24.890304Z TRACE mio::poll: deregistering event source from poller
2024-04-24T10:09:24.890353Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}: vector::sinks::elasticsearch::common: Auto-detected Elasticsearch API version. version=7
2024-04-24T10:09:24.895920Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}: vector_core::tls::settings: Fetching system root certs.
2024-04-24T10:09:24.901408Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch}: vector_core::tls::settings: Fetching system root certs.
2024-04-24T10:09:24.907324Z TRACE tower::buffer::worker: worker polling for next message
√ Component configuration
2024-04-24T10:09:24.907394Z TRACE vector::validate: Healthcheck for es starting.
2024-04-24T10:09:24.907435Z DEBUG http: vector::internal_events::http_client: Sending HTTP request. uri=http://elasticsearch:9200/_cluster/health method=GET version=HTTP/1.1 headers={"user-agent": "Vector/0.37.1 (x86_64-unknown-linux-gnu cb6635a 2024-04-09 13:45:06.561412437)", "accept-encoding": "identity"} body=[empty]
2024-04-24T10:09:24.907461Z TRACE http: hyper::client::pool: checkout waiting for idle connection: ("http", elasticsearch:9200)
2024-04-24T10:09:24.907522Z TRACE http: hyper::client::connect::http: Http::connect; scheme=Some("http"), host=Some("elasticsearch"), port=Some(Port(9200))
2024-04-24T10:09:24.907563Z DEBUG hyper::client::connect::dns: resolving host="elasticsearch"
2024-04-24T10:09:24.908195Z DEBUG http: hyper::client::connect::http: connecting to 10.20.0.100:9200
2024-04-24T10:09:24.908243Z TRACE http: mio::poll: registering event source with poller: token=Token(139813332916352), interests=READABLE | WRITABLE
2024-04-24T10:09:24.908561Z DEBUG http: hyper::client::connect::http: connected to 10.20.0.100:9200
2024-04-24T10:09:24.908581Z TRACE http: hyper::client::conn: client handshake Http1
2024-04-24T10:09:24.908593Z TRACE http: hyper::client::client: handshake complete, spawning background dispatcher task
2024-04-24T10:09:24.908606Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Busy }
2024-04-24T10:09:24.908615Z TRACE http: hyper::client::pool: checkout dropped for ("http", elasticsearch:9200)
2024-04-24T10:09:24.908630Z TRACE encode_headers: hyper::proto::h1::role: Client::encode method=GET, body=None
2024-04-24T10:09:24.908652Z DEBUG hyper::proto::h1::io: flushed 200 bytes
2024-04-24T10:09:24.908658Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: KeepAlive, keep_alive: Busy }
2024-04-24T10:09:24.927680Z TRACE hyper::proto::h1::conn: Conn::read_head
2024-04-24T10:09:24.927697Z TRACE hyper::proto::h1::io: received 510 bytes
2024-04-24T10:09:24.927706Z TRACE parse_headers: hyper::proto::h1::role: Response.parse bytes=510
2024-04-24T10:09:24.927714Z TRACE parse_headers: hyper::proto::h1::role: Response.parse Complete(121)
2024-04-24T10:09:24.927727Z DEBUG hyper::proto::h1::io: parsed 3 headers
2024-04-24T10:09:24.927732Z DEBUG hyper::proto::h1::conn: incoming body is content-length (389 bytes)
2024-04-24T10:09:24.927742Z TRACE hyper::proto::h1::decode: decode; state=Length(389)
2024-04-24T10:09:24.927747Z DEBUG hyper::proto::h1::conn: incoming body completed
2024-04-24T10:09:24.927752Z TRACE hyper::proto::h1::conn: maybe_notify; read_from_io blocked
2024-04-24T10:09:24.927759Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
2024-04-24T10:09:24.927765Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
2024-04-24T10:09:24.927776Z TRACE http: hyper::client::pool: put; add idle connection for ("http", elasticsearch:9200)
2024-04-24T10:09:24.927786Z DEBUG http: hyper::client::pool: pooling idle connection for ("http", elasticsearch:9200)
2024-04-24T10:09:24.927802Z DEBUG http: vector::internal_events::http_client: HTTP response. status=200 OK version=HTTP/1.1 headers={"x-elastic-product": "Elasticsearch", "content-type": "application/json; charset=UTF-8", "content-length": "389"} body=[389 bytes]
2024-04-24T10:09:24.927834Z INFO vector::topology::builder: Healthcheck passed.
2024-04-24T10:09:24.927846Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
√ Health check "es"
2024-04-24T10:09:24.927875Z TRACE vector::validate: Healthcheck for es done.
2024-04-24T10:09:24.927917Z TRACE rdkafka::util: Destroying queue: 0x7f28d62504c0
2024-04-24T10:09:24.927920Z TRACE tower::buffer::worker: worker polling for next message
2024-04-24T10:09:24.927930Z TRACE rdkafka::util: Destroyed queue: 0x7f28d62504c0
2024-04-24T10:09:24.927939Z TRACE tower::buffer::worker: buffer already closed
2024-04-24T10:09:24.927947Z TRACE rdkafka::consumer::base_consumer: Destroying consumer: 0x7f28d6141400
2024-04-24T10:09:24.928702Z TRACE hyper::client::pool: pool closed, canceling idle interval
2024-04-24T10:09:24.928720Z TRACE hyper::proto::h1::dispatch: client tx closed
2024-04-24T10:09:24.928725Z TRACE hyper::proto::h1::conn: State::close_read()
2024-04-24T10:09:24.928729Z TRACE hyper::proto::h1::conn: State::close_write()
2024-04-24T10:09:24.928734Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Closed, writing: Closed, keep_alive: Disabled }
2024-04-24T10:09:24.928764Z TRACE hyper::proto::h1::conn: shut down IO complete
2024-04-24T10:09:24.928772Z TRACE mio::poll: deregistering event source from poller
2024-04-24T10:09:24.964436Z TRACE rdkafka::consumer::base_consumer: Consumer destroyed: 0x7f28d6141400
2024-04-24T10:09:24.964460Z TRACE rdkafka::util: Destroying client: 0x7f28d6141400
2024-04-24T10:09:24.964984Z TRACE rdkafka::util: Destroyed client: 0x7f28d6141400
2024-04-24T10:09:24.965053Z TRACE rdkafka::consumer::stream_consumer: Shut down stream consumer wake loop: 0x7f28d6141400
------------------------------------
Validated
I'm a bit speechless. I can't reproduce the problem on another setup. After some investigation, on intuition I decided to add latency on the network interface: we have 0.2 ms between Vector and our source and sink (yes, a good network).
Adding 15 ms of delay on the Vector host:
tc qdisc add dev ens192 root netem delay 15ms
And, magic: vector validate works every time. Wow. If I remove the delay, it gets stuck again.
Can somebody with the problem try this workaround? :pray:
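For anyone trying it, the full cycle looks like this (a sketch: ens192 is the interface name on my machine, adjust to yours, and the commands need root):

```shell
# Add 15 ms of artificial latency on the interface (requires root).
tc qdisc add dev ens192 root netem delay 15ms

# Inspect the current queueing discipline to confirm netem is active.
tc qdisc show dev ens192

# Remove the delay again once done testing.
tc qdisc del dev ens192 root netem
```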
Huh, now that is strange. Are Kafka and Elasticsearch already up and running when Vector starts up?
Yes, they are up during validate and during startup.
vector validate also tests the sources, even if there is no output in the console about that.
Note: if I run vector, everything is OK; it consumes from Kafka and writes to ES.
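For reference, this is roughly how I run the check (a sketch: the config path matches my Debian install, and VECTOR_LOG is Vector's environment variable for the log level):

```shell
# Run validate with trace-level logging so the source-side (kafka)
# activity, which is otherwise silent in the console, becomes visible.
VECTOR_LOG=trace vector validate /etc/vector/vector.yaml
```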
A note for the community
Problem
Hi, first, thank you for your work on Vector :)
I have a problem with Vector 0.37.1 (Debian package, but I can trigger the problem with 0.36 too): vector validate gets stuck the majority of the time. It seems to be related to a kafka source and an elasticsearch sink; if I test other configurations, no problem so far. This may be related to the discussions in issue #19333.
A minimal configuration which triggers the problem is attached.
Sometimes it works:
I can see an open Kafka connection to port 9092 when validate is stuck. The
offset=Invalid metadata=""
in the DEBUG output is strange; the Kafka cluster works correctly.
Thank you,
Configuration
Version
0.37.1
Debug Output
Example Data
No response
Additional Context
No response
References
#19333