icculus / physfs

A portable, flexible file i/o abstraction.
https://icculus.org/physfs/
zlib License
559 stars 98 forks source link

Multithreaded efficiency #63

Closed S41L0R closed 1 year ago

S41L0R commented 1 year ago

Currently I'm having some issues with efficiently calling physfs from multiple threads. For some context here's what I've got going on:

You can assume that all of this is running on different threads. The problem I ran into is that my massive enumeration will block the other two. Do you have any ideas for this issue? Here are the three solutions I've come up with, but they all have caveats:

  1. Implement shared mutexes, where a unique lock is only used in functions that modify state. Cons: This only alleviates the issue for concurrent function calls that do not modify state, and would require changes to the native implementations.
  2. Make a copy of needed state at the beginning of expensive functions, so the mutex only needs to be locked during cloning. Cons: cloning the opaque pointers, and cloning state would have to be carefully considered per each function it's implemented on so it doesn't cause unexpected outcomes.
  3. Instantiating physfs contexts, as it appears you have planned for V4. Cons: the user has to manage the state of contexts across threads.

I didn't rule out option 1 because although it doesn't fully solve the problem on its own it could be a minor optimization to employ alongside another solution. Honestly it might make sense to do all three, as option two would help with performance for those who don't choose to maintain multiple contexts. All in all, 3 seems the most promising, so I'm glad that you have it planned.

S41L0R commented 1 year ago

Oh I just realized this is covered in #13. My bad, I must have misspelled my search. Anyway hopefully this can connect as some more brainstorming or something.

icculus commented 1 year ago

There are plans to fix this, but for now, the fastest workaround is often to not use enumeration callbacks; use PHYSFS_enumerateFiles, then iterate the returned list yourself, since the lock will only be held while building the list.

S41L0R commented 1 year ago

I'll try that, thanks. Although most of the problem comes from the sheer number of files that need to be iterated, so I'm a little doubtful. In the coming days I may separate things out into contexts and open a PR, since that's desirable both for me and for your V4 plans.

S41L0R commented 1 year ago

I assume you must have some way to test compilation for all options and platforms... I'm slowly wrapping up my contexts implementation, but as simple as the changes are, it's likely that there are issues. There's just a ton of code I had to modify for this, and I can only test the Linux platform for compilation.

I still have work to do, so the PR won't be immediately. But it might be soon!

S41L0R commented 1 year ago

Oh and for some info on what my impl looks like:

icculus commented 1 year ago

Wait, just to be clear: there's zero chance I will merge a PR that makes dramatic changes like this.

My experience has been letting external contributors make sweeping changes causes problems down the line, and I'm in crunch on a different project, and not ready to seriously start on PhysicsFS 4.0 yet.

S41L0R commented 1 year ago

Forget those messages I've just deleted. I've got a much better solution that will change much less:

It will instead only create four more PHYSFS functions:

The difference will be that the PHYSFS_Context will not have to be passed around all of PHYSFS. Instead, the PHYSFS functions implemented in physfs.c will call some function like getContextForCurrentThread() when they need that info. Changes should be isolated because of this.

What do ya think?

S41L0R commented 1 year ago

Well I went ahead and did the PR for my second implementation (#65). Please look through at your leisure! I'll close this issue and we can discuss there when you get the chance.