abashev / vfs-s3

Amazon S3 driver for Apache commons-vfs (Virtual File System) project
Apache License 2.0
93 stars 50 forks source link

big refactoring of concurrency #47

Closed svella closed 6 years ago

svella commented 6 years ago

This started out as a fix to a deadlock between getChildren() and getParent() but evolved into much more as I found serious issues with the locking being done by the modified commons-vfs for files systems other than S3. The main problem is that while AbstractFileObject was now using a Lock object for concurrency control, all of the derived classes were still using synchronized blocks on the filesystem, as a result of which it was possible in some cases for two threads to think they had exclusive access because they were using different mechanisms. Similarly, there were potential deadlocks because some execution paths would try to obtain both locks, but in a different order. The only way to resolve the issue was for everywhere to use either the synchronized blocks or the Lock object, and since there are third party file system providers that are out of our control even with a modified commons-vfs, the choice really had to be to use the synchronized blocks.

It also bothered me that we had to have the the modified vfs-commons so I also decided to approach it in a way that would allow vfs-s3 to work with either the modified version (via introspection) or the original, for cases where multi-threaded performance on the same file system was not particularly important. I also provided a setting that would allow per file locking to be turned off as an escape plan for any future deadlocks that might be found when doing per file locking.

And while I was at it I addressed a number of concurrency issues around the use of input and output streams.

We've been using these changes against the 2.4.x branch in production for over a year and they have been working well for us. I've been meaning to push these back to the main fork for a long time, but things have been crazy busy and am just now getting around to porting the the 3.0.0 branch and vfs-commons 2.2.

abashev commented 5 years ago

@svella It was so terrible idea ☹️

svella commented 5 years ago

@abashev ???

abashev commented 5 years ago

@svella The idea then we need to maintain compatibility with vanilla commons-vfs is so useless and hard to implement. So I decided to implement timeouts for locks because inside our environment with high-load whole server could be stuck only because of a broken network. And I have to rewrite locking from the scratch and rollback my changes with a little mix up of yours. And that is why it was a bad idea.