buckket / twtxt

Decentralised, minimalist microblogging service for hackers.
http://twtxt.readthedocs.org/en/stable/
MIT License
1.92k stars 79 forks source link

add_local_tweet is not atomic #34

Closed kitchen closed 8 years ago

kitchen commented 8 years ago

https://github.com/buckket/twtxt/blob/master/twtxt/file.py#L29-L36

This function is not atomic. Should writing to the file fail for any reason, the file may be corrupted and require manual cleanup.

It would be better to copy the existing file to a temporary file, write to the temporary file, and rename the temp file to the existing file. rename(2) is an atomic operation, so at worst, if writing the temporary file fails, you can easily abort, and the existing file is left unscathed.

ghost commented 8 years ago

i really don't see the issue with this, a copy is a somewhat expensive operation in comparison to the gain of such feature.

The worst outcome of an incomplete write would be something like that:

2016-02-04T13:30+01 You can really go crazy here! ┐(゚∀゚)┌
2016-02-01T11:00+01 This is just another example.
2015-12-12

I'm using the example from the readme here, as you see there was only half a date written, and the entry would probably be ignored by the client. Also IIRC the issue of half-writes would be pretty much non-existent since the default write buffer under Linux is somewhere in the vicinity of 8192 bytes, either all of that gets written or nothing, as far as the OS is concerned. 140+16 chars would be well below the write buffer.

The easier way to handle the issue of incomplete writes would be to check if the file ends on a newline before any operation on it, and if so: issue a warning to the user and remove the last line from the file.

buckket commented 8 years ago

Looked through the code and thought about benefits vs. complexity. Must say I’m with @TeddyDesTodes on this one.

timofurrer commented 8 years ago

Yeah, I'm fine with that, too.