ericvolp12 / jetstream

A simplified JSON event stream for AT Proto
MIT License

Create new cursor file if existing one is corrupt #2

Open juni-b-queer opened 1 month ago

juni-b-queer commented 1 month ago

I've had a number of times where, for some reason, my jetstream container stops abruptly without saving the cursor file properly. So when it restarts, the cursor file exists, but it's empty/corrupt, so jetstream throws an error when trying to start its subscription. This prevents the container from coming back online without manual intervention (removing the cursor file).

I'll likely implement this in my fork, but it would be helpful if:

  1. Every so often, the current valid cursor file is backed up to a separate file
  2. If jetstream starts and the cursor file is corrupt, attempt to use the latest backup or create a new one. Don't block jetstream from starting

I'm going to try to implement this, but I have very, very little Go experience.

juni-b-queer commented 1 month ago

I've implemented this in my fork: https://github.com/juni-b-queer/jetstream

With these updates,

ericvolp12 commented 1 month ago

This is neat, I'm glad to see people getting some use out of Jetstream and am sorry you're running into these issues.

TBH I think I'd rather figure out why/how the cursor file is being corrupted when saving and fix that rather than add a cursor backup feature at the moment. If possible I'd want to keep the complexity of the service down and I think backups feel like they address the symptoms of a reliability problem and it'd be neat to fix the reliability problem at the root.

I'll dig into it a bit and see if I can find anything, but if you've got any logs from an improper shutdown or anything like that, I'd be interested to see whether the cursor manager logs a failure, or how the program might be getting killed in such a way that it can't take the few milliseconds it needs to shut down safely.

If it's being interrupted mid-write, it makes me think that somehow we're not waiting properly for the program to exit, or something is hard-killing the process while it's shutting down.