beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License
12.57k stars 1.8k forks source link

Atomic file copying operations #1788

Open nagisa opened 8 years ago

nagisa commented 8 years ago

It will not delete the destination file (or even try to do that) leaving a partially copied file behind. The fact that I/O error occured does not mean it will occur again deleting it.

sampsyo commented 8 years ago

Can you please give us a few more details? What exactly did you try to do, and what happened?

nagisa commented 8 years ago

What details are you interested in, exactly?

I was doing a beet mv path:./ into a NAS. As it happens, networks may become unreliable sometimes and occasional I/O errors during long-running operations occur. beets then just abandoned everything it was doing and simply quit (without trying to remove the file it was copying into). It should try doing copies in a, uh, more transaction-like way and at least attempt to remove the destination file if the copy failed.

nagisa commented 8 years ago

Ah, this is also important, because most audio formats do not break horribly when truncated. The common case is to point your music players to scan the whole directory of music recursively, and these players will eventually pick up those truncated audio files.

sampsyo commented 8 years ago

Thanks! That helped me understand.

Sounds like a good goal: file copies should be atomic, so when beets is done, they should either succeed completely or not at all. This of course isn't always possible (if we get entirely disconnected from a filesystem after starting a copy, for example), but we could do better.

untitaker commented 8 years ago

You ought to move/copy/whichever-operation to a temporary file on the target FS, and then mv. python-atomicwrites may or may not fit your usecase, I originally wrote it with just writing (not copying existing files) in mind.

EDIT: To explain this better, you probably only would want to use python-atomicwrites if you care about (atomically) asserting that the target file doesn't already exist. Otherwise os.rename(tmpfile, target_path) will work fine.

sampsyo commented 8 years ago

Cool! Your library looks great—it could be a good idea to use this for the conflict avoidance check we already do.

untitaker commented 8 years ago

As said, it pretty much depends on whether you also need to atomically check whether the target file exists, and that in a crossplatform way. Otherwise atomic file writes don't warrant their own library.