nv-morpheus / Morpheus

Morpheus SDK
Apache License 2.0

[BUG]: Kafka topic not existing at startup causes pipeline to crash but not exit #1491

Open nvawood opened 7 months ago

nvawood commented 7 months ago

Version

23.07

Which installation method(s) does this occur on?

Kubernetes

Describe the bug.

When launching the new modular digital fingerprinting (DFP) pipeline, if the Kafka topic does not yet exist, the pipeline emits an error but neither retries nor exits. The pod must be deleted and relaunched after the topic exists in order to proceed.

E20230912 21:18:35.035174 106 kafka_source.cpp:525] Failed retrieve Kafka committed offsets. Received unexpected ErrorCode. Expected: RdKafka::ERR_NO_ERROR(0), Received: -185, Msg: Local: Timed out

Minimum reproducible example

The easiest way to reproduce is to build the DFP container with the pipeline edited to remove the sleep() calls and the basic Kafka-topic-exists check in the main() loop, then deploy the DFP Helm chart and tail the pod logs.
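For context, a topic-exists check of the kind mentioned above might look roughly like the sketch below. This is not the code from the DFP container. `wait_for_topic`, `start_or_exit`, and the topic name `dfp-input` are hypothetical names chosen for illustration, and the topic lookup is passed in as a callable so the sketch runs without a broker. It retries a bounded number of times and then returns a non-zero exit code instead of hanging.

```python
import sys
import time
from typing import Callable, Iterable


def wait_for_topic(
    list_topics: Callable[[], Iterable[str]],
    topic: str,
    max_attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `list_topics` until `topic` appears or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            if topic in set(list_topics()):
                return True
        except Exception as exc:
            # e.g. broker unreachable or a metadata timeout; treated as "not yet".
            print(f"attempt {attempt}: topic lookup failed: {exc}", file=sys.stderr)
        if attempt < max_attempts:
            sleep(delay_s)
    return False


def start_or_exit(list_topics: Callable[[], Iterable[str]], topic: str) -> int:
    """Return a process exit code: 0 if the topic is present, 1 otherwise."""
    if not wait_for_topic(list_topics, topic):
        print(f"Kafka topic {topic!r} not found; exiting", file=sys.stderr)
        return 1
    # The real pipeline would be built and run here (omitted in this sketch).
    return 0


# Assumption: with the confluent-kafka client, the lookup could be something like
#   lambda: AdminClient({"bootstrap.servers": servers}).list_topics(timeout=5).topics
# where `servers` is the broker address; iterating `.topics` yields topic names.
if __name__ == "__main__":
    fake_lookup = lambda: ["dfp-input"]  # stand-in for a real broker query
    sys.exit(start_or_exit(fake_lookup, "dfp-input"))
```

Exiting with a non-zero code, rather than logging and idling, would let Kubernetes restart the pod according to its restart policy, so the manual delete-and-relaunch step would no longer be needed.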

Full env printout

NGC DFP Helm chart

Other/Misc.

No response

jarmak-nv commented 7 months ago

Hi @nvawood!

Thanks for submitting this issue - our team has been notified and we'll get back to you as soon as we can! In the meantime, feel free to add any relevant information to this issue.