riklopfer closed 3 months ago
Async ioloops and threads are not fork-safe. You may be able to get things running by using the spawn or forkserver start methods for your processes. fsspec tries to detect when it finds itself in a forked process, but this cannot be guaranteed.
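A minimal sketch of the spawn approach, using only the standard library. The names `spawn_context`, `count_entries` and `run_demo` are hypothetical, and `os.listdir` is only a stand-in for the real s3fs call, so treat this as an illustration rather than a tested fix for this issue:

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor


def spawn_context():
    # "spawn" starts each worker as a fresh interpreter, so no event loop
    # or thread state is inherited from the parent process.
    return multiprocessing.get_context("spawn")


def count_entries(path):
    # Hypothetical worker. os.listdir is a stand-in: with s3fs you would
    # construct the filesystem object here, inside the worker, rather than
    # relying on one created in the parent.
    return len(os.listdir(path))


def run_demo(paths, max_workers=2):
    # ProcessPoolExecutor accepts mp_context since Python 3.7.
    ctx = spawn_context()
    with ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx) as pool:
        return list(pool.map(count_entries, paths))
```

With spawn, the child re-imports the main module, so `run_demo(["."])` should be called from under an `if __name__ == "__main__":` guard in a script file.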
Other possible workarounds:
Thanks @martindurant. As long as we use ThreadPoolExecutor, we are able to query the file system object in the main thread and don't need to worry about deadlocks?
I don't expect any problems with threads, but note that s3fs is async and runs its ioloop in a dedicated thread, so you probably don't get any parallelism on the IO by doing this. You might still get parallelism on whatever you then do with what you fetched.
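As a rough sketch of that pattern: one filesystem object is shared across threads, and each thread fetches a file and then processes it. `fetch_and_process` and `FakeFS` are hypothetical names for illustration. `FakeFS` is an in-memory stand-in that mimics only the `cat_file(path)` method of an fsspec filesystem, so the sketch runs without s3fs installed:

```python
from concurrent.futures import ThreadPoolExecutor


class FakeFS:
    # In-memory stand-in for a filesystem object (demonstration only).
    def __init__(self, files):
        self._files = files

    def cat_file(self, path):
        # Return the full contents of one file as bytes.
        return self._files[path]


def fetch_and_process(fs, paths, process, max_workers=4):
    # Each task fetches one file and processes it in the same thread.
    # With s3fs the fetches are funnelled through its single IO thread,
    # so any speed-up would come from the process() step.
    def task(path):
        return process(fs.cat_file(path))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input paths.
        return list(pool.map(task, paths))


sizes = fetch_and_process(FakeFS({"a": b"xy", "b": b"z"}), ["a", "b"], len)
```

No fork happens here, so the fork-safety concern does not apply. Whether the processing step actually runs in parallel depends on whether it releases the GIL.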
Multiprocessing documentation that I evidently missed.
The example below hangs when using s3fs with ProcessPoolExecutor but not when using ThreadPoolExecutor. Both work with local file system.
Works
Hangs
Version info