shotgunsoftware / shotgunEvents

Flow Production Tracking event processing framework.

Atomic writes of id file #46

Open pfranz opened 7 years ago

pfranz commented 7 years ago

We had an unexpected power failure and the id file was corrupted on disk. It looks like this has happened to us before. We also have a monitoring service that often loads an incomplete id file.

While it's not 100% guaranteed to be atomic everywhere, many tools write to a temp file and then rename it into place. Some OSes guarantee that the rename is atomic (POSIX does when both paths are on the same filesystem), and for the rest it should at least be a significant improvement.
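
As a rough illustration of the temp-file-and-rename approach, here is a minimal sketch. It is not the daemon's actual code: the function name `write_id_file_atomically` and the temp file naming are made up for this example, and it assumes Python 3.3+ for `os.replace`.

```python
import os
import pickle
import tempfile


def write_id_file_atomically(path, data):
    """Pickle `data` to `path` via a temp file and a rename.

    Hypothetical helper for illustration only; not part of shotgunEvents.
    """
    # Create the temp file in the same directory as the target so the
    # rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".id")
    try:
        with os.fdopen(fd, "wb") as fh:
            pickle.dump(data, fh)
            # Push the bytes to disk before the rename, so a power loss
            # is less likely to leave a renamed but empty file.
            fh.flush()
            os.fsync(fh.fileno())
        # Replace the old file in one step. Readers see either the old
        # contents or the new contents, not a partially written file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

With something like this, a monitoring service reading the id file while the daemon is writing would keep seeing the previous complete file until the rename happens.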

herronelou commented 2 weeks ago

I'm pretty sure this has affected us too. The daemon seems to spend a non-trivial amount of time between opening the file handle and finishing the pickling (https://github.com/shotgunsoftware/shotgunEvents/blob/master/src/shotgunEventDaemon.py#L605).

Every time we stop the daemon, there seems to be roughly a 5% chance of hitting this window and corrupting the file. I think it might be one of the root causes of #63.
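
To make the window concrete, here is a small sketch of why stopping between opening the handle and finishing the pickle can lose the file. It assumes the file is opened in `"wb"` mode, which truncates it immediately. The function name, file name and data are hypothetical and are not taken from the daemon.

```python
import os
import pickle
import tempfile


def demo_truncation_window():
    """Show that opening with "wb" empties the file before any data is written."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "example.id")  # hypothetical id file

    # Start with a valid, fully written id file.
    with open(path, "wb") as fh:
        pickle.dump({"examplePlugin": 100}, fh)

    # Begin a new write: the old contents are already gone at this point.
    fh = open(path, "wb")
    size_during_write = os.path.getsize(path)

    # Simulate the process being stopped before pickle.dump() runs.
    fh.close()

    # A reader now finds an empty file and cannot unpickle it.
    try:
        with open(path, "rb") as reader:
            pickle.load(reader)
        loaded = True
    except EOFError:
        loaded = False

    return size_during_write, loaded
```

Here `size_during_write` comes back as 0 and `loaded` as `False`: the previous state is lost as soon as the handle is opened, not only if the pickling itself is cut short.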