AgNO3 / jcifs-ng

A cleaned-up and improved version of the jCIFS library
GNU Lesser General Public License v2.1
318 stars 103 forks source link

ConcurrentModificationException while SMBFile.listFiles() #118

Open rumburake opened 5 years ago

rumburake commented 5 years ago

I am running SMBFile.listFiles() on multiple hosts, in parallel threads (each Runnable has a separate SMBFile object). But I get this race condition. This only happens after migrating from JCIFS to JCIFS-NG. I was unable to get any clue if listFiles isn't thread safe anymire. Thanks!

java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
    at java.util.LinkedList$ListItr.next(LinkedList.java:888)
    at jcifs.smb.SmbTransportPoolImpl.findConnection(SmbTransportPoolImpl.java:118)
    at jcifs.smb.SmbTransportPoolImpl.getSmbTransport(SmbTransportPoolImpl.java:194)
    at jcifs.smb.SmbTransportPoolImpl.getSmbTransport(SmbTransportPoolImpl.java:47)
    at jcifs.smb.SmbTreeConnection.connectHost(SmbTreeConnection.java:560)
    at jcifs.smb.SmbTreeConnection.connectHost(SmbTreeConnection.java:484)
    at jcifs.smb.SmbEnumerationUtil.doShareEnum(SmbEnumerationUtil.java:149)
    at jcifs.smb.SmbEnumerationUtil.doEnum(SmbEnumerationUtil.java:222)
    at jcifs.smb.SmbEnumerationUtil.listFiles(SmbEnumerationUtil.java:283)
    at jcifs.smb.SmbFile.listFiles(SmbFile.java:1191)
    at com.x.x.x.ScanHostRunnable.run(ScanHostRunnable.java:31)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
    at java.lang.Thread.run(Thread.java:764)
mbechler commented 5 years ago

Should be fixed in master. Missed one spot where synchronzation is required.

rumburake commented 5 years ago

Thank you! Will check again soon!

matheis commented 3 years ago

I'd like to get back to this one. We have a scenario where we store multiple files to a certain share in parallel with a rename at the end, where the written temp file gets renamed to a target file name. The same user account is used for all of the files. If they start to get a transport in parallel, there is still a gap in the latest code. The above fix syncs the lookup of the transport, but the retrieval of it should maybe added to the sync block as well. We end up in situations where we have 2 transports in the list (although the pooling should avoid that). Rarely the wrong one is retrieved and we see a rename problem, because the temp and target files are created within different transports/contexts.

Caused by: jcifs.smb.SmbException: Cannot rename between different trees at jcifs.smb.SmbFile.renameTo(SmbFile.java:1331) at jcifs.smb.SmbFile.renameTo(SmbFile.java:1303) at com.seeburger.SmbFile.renameTo(SmbFile.java:188) ... 14 more

mbechler commented 3 years ago

Sorry for the delay.

Just extending the synchronized block would essentially hold a global lock during all connection establishment and multi address failover - which likely will cause issues. To avoid the opportunity for multiple connections to one server would likely require a specialized mechanism to synchronize based on the target address, adding quite a bit of complexity. I will look into this.

On the other hand, your specific issue could also be avoided by making the check in renameTo less strict and only check the paths as a fallback.

matheis commented 3 years ago

Thanks for the reply, Moritz. Yeah, we also saw in tests, that the proposed change does not help. Reading your reply it seems that it is not strictly expected to have a single connection to a server although we use a single context? Because this is what we observe from time to time and assumed that this was not intended. E.g. because the renameTo strictly checks with "==". Ok, if it was possible to get this check less strict, this would be fine (for us :-)) and for the described problem during rename.

mbechler commented 3 years ago

Made the check less strict (only for the server and share name now), so this should no longer fail if different connections are used.

Yes, only a single connection should be used when the same context ist used and I will still look into fixing the concurrency issue we have there, but this looks a bit more difficult.