mwvgroup / Pitt-Google-Broker

A Google Cloud-based alert broker for LSST and ZTF
https://pitt-broker.readthedocs.io/en/latest/index.html

LVK alerts stopped publishing to the `lvk-alerts` topic in `ardent-cycling-243415` #238

Open hernandezc1 opened 5 days ago

hernandezc1 commented 5 days ago

The Pub/Sub metrics for the `lvk-alerts` topic in `ardent-cycling-243415` show that alerts unexpectedly stopped publishing to the topic after August 12th, at around 2:30pm. The `lvk-alerts` topic in the `avid-heading-329016` project, however, is still publishing alerts (I was using this project to test the changes for PR #232). The Logs Explorer displays the following logs for the period from 8/10/2024 through 8/13/2024:

Screenshot 2024-09-12 at 9 51 04 AM

The log entry from August 10th indicates that the VM instance’s underlying hardware underwent maintenance and that the instance was moved to another host as a result (see Live migration process during maintenance events for more information). However, Google’s documentation states that “live migration lets Google Cloud perform maintenance without interrupting a workload, rebooting a VM, or modifying any of the VM's properties, such as IP addresses, metadata, block storage data, application state, and network settings,” so the migration by itself should not have interrupted the consumer. At the moment, it is not clear to me what caused alerts to stop being published.
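For anyone who wants to look for the same maintenance events, here is a minimal sketch that builds a Logs Explorer filter for live-migration system events in the window above. The helper name `migration_log_filter` is hypothetical, and the UTC bounds are an assumption (the times in this issue are local); the filter also assumes the events are recorded under the `compute.instances.migrateOnHostMaintenance` method name, so adjust it if your entries look different.

```python
from datetime import datetime, timezone


def migration_log_filter(start, end):
    """Build a Cloud Logging filter string for live-migration events.

    Hypothetical helper: it only assembles a filter to paste into the
    Logs Explorer; it does not call any Google Cloud API.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    clauses = [
        'resource.type="gce_instance"',
        # Assumed method name for host-maintenance live migrations.
        'protoPayload.methodName="compute.instances.migrateOnHostMaintenance"',
        f'timestamp>="{start.astimezone(timezone.utc).strftime(fmt)}"',
        f'timestamp<="{end.astimezone(timezone.utc).strftime(fmt)}"',
    ]
    return " AND ".join(clauses)


# Window taken from the issue text; the UTC bounds are an assumption.
log_filter = migration_log_filter(
    datetime(2024, 8, 10, tzinfo=timezone.utc),
    datetime(2024, 8, 14, tzinfo=timezone.utc),
)
print(log_filter)
```

The printed string can be pasted into the Logs Explorer query box with the project set to `ardent-cycling-243415`.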

Here is a visualization of the Pub/Sub metrics for lvk-alerts in ardent-cycling-243415:

Screenshot 2024-09-10 at 7 43 21 AM
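To make the "when did publishing stop" question easier to answer without reading the chart, here is a rough sketch that finds the last nonzero publish count in a time series and flags a stall. The function names, the sample values, and the six-hour threshold are all hypothetical and are not taken from the real metrics; the sketch assumes the samples have already been exported from the Pub/Sub metrics as `(timestamp, count)` pairs.

```python
from datetime import datetime, timedelta


def last_publish_time(samples):
    """Return the timestamp of the latest sample with a nonzero count, or None."""
    active = [ts for ts, count in samples if count > 0]
    return max(active) if active else None


def is_stalled(samples, now, max_silence):
    """Return True if nothing has been published within max_silence before now."""
    last = last_publish_time(samples)
    return last is None or (now - last) > max_silence


# Made-up samples for illustration only: (timestamp, published message count).
samples = [
    (datetime(2024, 8, 12, 14, 0), 3),
    (datetime(2024, 8, 12, 14, 30), 1),
    (datetime(2024, 8, 12, 15, 0), 0),
    (datetime(2024, 8, 13, 15, 0), 0),
]

print(last_publish_time(samples))
print(is_stalled(samples, datetime(2024, 8, 13, 15, 0), timedelta(hours=6)))
```

A check like this could run on a schedule so that a silent topic is noticed within hours rather than weeks, though LVK alerts are bursty enough that the threshold would need tuning.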

I will continue to investigate this and update this issue as I discover more information.

hernandezc1 commented 5 days ago

After starting the VM and manually stepping through its startup script, I saw that the consumer was able to subscribe to the topic successfully. After a few minutes, however, the consumer's nodes disconnected.

Screenshot 2024-09-12 at 10 44 58 AM

It seems that this issue is associated with the version of Confluent that the VM is using (v7.4), as other developers have seen similar log messages with the same version (see this Stack Overflow discussion).