icculus / physfs

A portable, flexible file i/o abstraction.
https://icculus.org/physfs/
zlib License
559 stars 98 forks source link

Improve multithreading support #13

Open dnwerner opened 2 years ago

dnwerner commented 2 years ago

Last year Paradox developer Mathieu Ropert published a blog entry pointing to some potential enhancements in PhysFS regarding multithreading ( https://mropert.github.io/2020/07/26/threading_with_physfs/ ).

We are unaffiliated with Paradox or any Paradox products, but we encounter comparable performance problems in one of our projects when loading resources through PhysicsFS. Since modern APIs like Vulcan support multithreaded model-loading and since many machines now feature four, eight or even more cores and with ever-increasing SSD speeds, it is unfortunate that when using PhysicsFS, loading can effectively only be done on one thread at a time.

Is there any chance to see this bottleneck in PhysicsFS addressed in the future?

icculus commented 2 years ago

I've had that blog post open in a tab for months, but haven't had a chance to work on this yet. But yes, I'd like to resolve this issue and will try to find time to do so soon.

Underewarrr commented 2 years ago

waiting for it

icculus commented 2 years ago

Okay, so I looked at this, and reading from a PHYSFS_File does not hold the state lock as far as I can tell, which is where most of your time would be spent (either waiting on file i/o or decompressing). We've always said that it was safe to read from handles from any thread, as long as you don't read from the same handle on two threads at the same time, and this still seems to be the case: PHYSFS_read() and friends don't grab the big global lock when you call into them.

However, PHYSFS_enumerate() does, which is to say: if you've set up your threaded loader to look like...

void myloader_on_a_thread(const char *dir_to_read_from)
{
    PHYSFS_enumerate(dir_to_read_from, load_an_enumerated_file, NULL);
}

...each thread is going to block all the others until the entire directory is enumerated, because that does hold the state lock, and if the app's enumeration callback actually loads each file during enumeration, that will effectively serialize the whole effort. While this can be worked around by the app, what I'm describing is definitely a PhysicsFS deficiency.

Going to look further into this; we can very likely make the locking more fine-grained and solve this problem without changing the API or changing thread safety guarantees in unreasonable ways.

icculus commented 2 years ago

Confirmed with Mathieu that this is happening during enumeration, so I'll figure out how to fix that.