Closed chetankashetti closed 1 month ago
Thank you for the report, and sorry for the delayed response. Shuttle has seen multiple improvements related to issues like this since this ticket was opened. If you're still seeing this on the latest version of Shuttle, feel free to open a new ticket with the latest evidence and details you have, as it is likely a different issue at this point.
Thank you!
What is the bug? Shuttle service was stuck for a day without any error logs or exceptions.
How can it be reproduced? We have 3 shards running for the live subscription. Two of them, shard-0 and shard-2, were stuck.
We noticed that we were no longer receiving data from Shuttle, and when we checked the logs there were no errors. The metrics we looked at (CPU and memory for the hubs, CPU and memory for the service, and RDS) all looked fine, in fact underutilised. Some of the screenshots show no interaction at all: the service stayed in a hanging state for a while, and we are not sure whether the connection was even still open.
After it had been stuck for a day, the first action we took was to restart the pod. After the restart it started syncing from the eventId where it had been stuck, which took a few hours. But once it was live again, I observed that a cast I had made an hour earlier had not been indexed. Shouldn't it have been indexed, given that the live stream holds data for 3 days? Since it missed my cast, it might have missed others as well.
So, just to summarise, we wanted to know a couple of things.
We are not able to reproduce the issue; we have observed it only once.
Additional context