snowplow-incubator / snowplow-lake-loader

Snowplow Lake Loader
Other
0 stars 2 forks source link

Graceful shutdown stuck #65

Open manuelbernhardt opened 2 months ago

manuelbernhardt commented 2 months ago

We're deploying the snowplow-lake-loader using Google Cloud Run (for convenience). Sometimes the process receives a SIGTERM and proceeds to graceful shutdown. However, it looks like there's something off with the shutdown and in the end an exception is thrown (excerpt from the log below):

INFO com.snowplowanalytics.snowplow.lakes.Run - Received signal 15. Cancelling execution.
INFO com.snowplowanalytics.snowplow.lakes.processing.SparkUtils - Closing the global spark session...
java.lang.IllegalStateException: supervisor already shutdown

This leads to the container to be in a zombie state - the health probe is still correct, but the application hasn't exited and so nothing happens any longer. Is there perhaps a way to catch this exception should it occur?

istreeter commented 2 months ago

Hi @manuelbernhardt which output format are you writing to? i.e. Delta/Iceberg/Hudi. I'm asking because the shutdown process is slightly different depending on which spark libraries are being used. This might help me track down where the problem is coming from.

Also, which stream are you consuming from? Kinesis/pubsub/kafka?