This replaces the FileOutputStream we were using with a RandomAccessFile and replaces the Apache Commons checksum function with our own. Doing so fixes two outstanding issues.
The first issue was that the FileOutputStream was not set to append mode, which caused it to truncate its underlying file on instantiation. As a result, there may have been times when the file on disk was assumed to be valid when it was actually empty or only partially written.
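A minimal sketch of the difference, for illustration only. The class name, method names, and temp file here are hypothetical and are not taken from the actual change; the only behavior it relies on is that `new FileOutputStream(File)` opens in non-append mode and that `RandomAccessFile` in `"rw"` mode leaves existing content alone.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the change itself.
class TruncationDemo {

    /** Opens the file with a non-append FileOutputStream, writes nothing, and reports its size. */
    static long sizeAfterOpeningWithFileOutputStream(Path file) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile())) {
            // By this point the constructor has already truncated the file.
            return out.getChannel().size();
        }
    }

    /** Opens the file with a RandomAccessFile in "rw" mode, writes nothing, and reports its size. */
    static long sizeAfterOpeningWithRandomAccessFile(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            // Existing content is still there; nothing is truncated on open.
            return raf.length();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("truncation-demo", ".bin");
        try {
            Files.write(file, "previously valid content".getBytes(StandardCharsets.UTF_8));
            System.out.println("RandomAccessFile: " + sizeAfterOpeningWithRandomAccessFile(file));
            System.out.println("FileOutputStream: " + sizeAfterOpeningWithFileOutputStream(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The RandomAccessFile call reports the original length, while the FileOutputStream call reports zero even though nothing was written through it.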
The second issue was that the checksumCRC32 function would read the file using a FileInputStream that the function instantiated itself. After calculating the checksum, that function would close the FileInputStream, which would release the lock we had acquired from the channel underlying our FileOutputStream. Since we would not yet have written our file by the time that lock was released, other processes had the opportunity to obtain the lock and write the same file at the same time. This would lead to files being left in bad states.
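One way to compute the checksum without opening a second stream is to read through the channel that already holds the lock. This is a rough sketch under that assumption, not the code from this change: `LockedChecksum`, `crc32`, and `checksumUnderLock` are hypothetical names, and the buffer size is arbitrary.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.util.zip.CRC32;

// Hypothetical helper, shown only to illustrate the idea.
class LockedChecksum {

    /** Computes a CRC32 over the whole file by reading through the given, already-open channel. */
    static long crc32(FileChannel channel) throws IOException {
        CRC32 crc = new CRC32();
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        long position = 0;
        int read;
        // Positional reads leave the channel's own position untouched.
        while ((read = channel.read(buffer, position)) > 0) {
            position += read;
            buffer.flip();
            crc.update(buffer);
            buffer.clear();
        }
        return crc.getValue();
    }

    /** Locks the file, checksums it through the same channel, and confirms the lock is still held. */
    static long checksumUnderLock(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel();
             FileLock lock = channel.lock()) {
            long checksum = crc32(channel);
            // No other stream on this file was opened or closed, so the lock should still be valid.
            if (!lock.isValid()) {
                throw new IllegalStateException("lock was released while checksumming");
            }
            return checksum;
        }
    }
}
```

Because no extra FileInputStream is created, there is no `close()` on a second handle that could drop the lock before the write happens.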
For good measure, I am also calling the force(...) method on the underlying channel to ensure that the content we write makes it to disk before we release our lock. Without that call, it seems there was the potential for our content to be buffered but not yet written when we released the lock. It is difficult for me to say whether I saw this issue in practice, since there were other outstanding issues that may have looked similar, but I figure this is a better-safe-than-sorry situation.
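The ordering described above (lock, write, force, then release) could look roughly like the following. This is a sketch under assumptions rather than the actual implementation: `ForcedWrite` and `writeAndForce` are hypothetical names, and truncating under the lock before writing is my own choice for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

// Hypothetical helper, shown only to illustrate the ordering.
class ForcedWrite {

    /** Replaces the file's content while holding an exclusive lock, forcing it to disk before release. */
    static void writeAndForce(File file, byte[] content) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel();
             FileLock lock = channel.lock()) {
            // Assumption for this sketch: discard old content only once the lock is held.
            channel.truncate(0);
            ByteBuffer buffer = ByteBuffer.wrap(content);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            // Flush content and metadata to the storage device before the lock goes away.
            channel.force(true);
        } // Resources close in reverse order: the lock is released first, then the channel and file.
    }
}
```

Since try-with-resources closes in reverse declaration order, the `force(true)` call is guaranteed to run before the lock is released.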
I should also note that many parts of the Java IO APIs are system-dependent. For example, on some systems RandomAccessFile.close() may effectively do what that force(...) call does, but that is not guaranteed on all systems.
See also:
2033
2041