NVIDIAGameWorks / PhysX

NVIDIA PhysX SDK

How to use Multithreading? #453

Open morphogencc opened 3 years ago

morphogencc commented 3 years ago

What is the correct way to get PhysX to take advantage of multiple threads?

I've tried using the default CPU Dispatcher:

    // Create a CPU dispatcher with mNumberOfThreads worker threads.
    mDispatcher = physx::PxDefaultCpuDispatcherCreate(mNumberOfThreads);

    physx::PxSceneDesc sceneDesc(mPhysics->getTolerancesScale());
    // The scene runs its simulation tasks on this dispatcher.
    sceneDesc.cpuDispatcher = mDispatcher;
    sceneDesc.filterShader = physx::PxDefaultSimulationFilterShader;

    mScene = mPhysics->createScene(sceneDesc);

I compared performance with mNumberOfThreads set to 1, 8, and 16 and saw no difference at all, so it seems that setting the number of threads isn't enough. Does PhysX natively optimize for multiple threads, or does pretty much all multithreading have to come from me manually creating PxTasks?

Is there a simple example of how this works? I found that the Submarine sample uses PxTask, but it's a bit dense; something simpler would be great. I have a number of overlap queries running in my simulation that I'd love to put on separate threads if I can!

PierreTerdiman commented 3 years ago

> Does PhysX natively optimize for multiple threads,

Yes, but only for the simulation itself. For scene queries you have to do it yourself. So if you only have overlap queries, and you only call e.g. PxScene::overlap(), then it is normal that the above code won't help.

> I have a number of overlap queries running in my simulation that I'd love to put on separate threads if I can!

You can. Just do so. Call PxScene::overlap() from separate threads. But you have to manage these threads yourself. You can call these scene queries while PxScene::simulate() is running, but not during PxScene::fetchResults().
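One way to respect the "not during PxScene::fetchResults()" rule is a reader/writer gate around the two kinds of calls. The sketch below is only an illustration using the C++17 standard library: `SceneQueryGate`, `runQuery`, and `runFetch` are hypothetical names, not part of PhysX, and the callables stand in for the real `PxScene::overlap()` and `PxScene::fetchResults()` calls.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <shared_mutex>

// Hypothetical helper (not a PhysX class): lets many queries run at once,
// but never at the same time as the fetch step.
class SceneQueryGate {
public:
    // Wrap each scene query (e.g. a PxScene::overlap() call) in this.
    // Shared lock: any number of query threads may hold it together.
    template <typename Query>
    auto runQuery(Query&& query) {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        InFlight marker(inFlight_);  // counts queries currently running
        return query();
    }

    // Wrap the PxScene::fetchResults() call in this.
    // Exclusive lock: waits until all in-flight queries have finished and
    // blocks new ones until the fetch is done.
    template <typename Fetch>
    void runFetch(Fetch&& fetch) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        assert(inFlight_.load() == 0);  // debug check of the invariant
        fetch();
    }

private:
    // RAII counter so the count stays correct even if a query throws.
    struct InFlight {
        explicit InFlight(std::atomic<int>& c) : count(c) { ++count; }
        ~InFlight() { --count; }
        std::atomic<int>& count;
    };

    std::shared_mutex mutex_;
    std::atomic<int> inFlight_{0};
};
```

Query threads would call `gate.runQuery([&] { /* overlap call */ })`, and the thread that steps the scene would call `gate.runFetch([&] { /* fetchResults call */ })`. Nothing here gates `simulate()`, since queries are allowed while it runs.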

You could potentially use PxTask for this I think, but you don't have to - just create N threads, run your overlaps there, done.
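The "create N threads, run your overlaps there" approach could look roughly like the sketch below. It uses only the standard library so it stays self-contained: `QueryFn` and `runQueriesInParallel` are hypothetical names, and the callback is a stand-in for whatever code issues a real `PxScene::overlap()` call for query `i`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for one scene query. In a PhysX application the body
// would call PxScene::overlap() for query `i` and report whether it hit.
using QueryFn = std::function<bool(std::size_t)>;

// Runs `queryCount` queries on up to `threadCount` self-managed threads.
// Each thread owns a contiguous slice and writes only to its own slots, so
// the result vector needs no locking. (std::vector<char> is used instead of
// std::vector<bool>, whose packed bits are not safe to write concurrently.)
std::vector<char> runQueriesInParallel(std::size_t queryCount,
                                       std::size_t threadCount,
                                       const QueryFn& query) {
    assert(threadCount > 0);
    std::vector<char> hits(queryCount, 0);
    std::vector<std::thread> workers;
    workers.reserve(threadCount);

    // Ceiling division so every query lands in exactly one slice.
    const std::size_t chunk = (queryCount + threadCount - 1) / threadCount;
    for (std::size_t t = 0; t < threadCount; ++t) {
        const std::size_t begin = t * chunk;
        const std::size_t end = std::min(begin + chunk, queryCount);
        if (begin >= end) break;  // fewer queries than threads
        workers.emplace_back([&hits, &query, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                hits[i] = query(i) ? 1 : 0;
        });
    }

    // Join before returning so the results are complete and visible.
    for (std::thread& w : workers) w.join();
    return hits;
}
```

Spawning threads on every call keeps the sketch short; an application that issues queries every frame would more likely keep a persistent pool of workers alive.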

morphogencc commented 3 years ago

@PierreTerdiman Thanks!

In my current simulation I have ~50-60 objects being simulated (gravity + interactive forces), but I don't see any performance increase when I change my CPU dispatcher from 1 thread to 16 threads. Does this still seem normal? I don't have a sense if this is "a lot" of objects, or if this is so few that the number of threads likely doesn't matter.

PierreTerdiman commented 3 years ago

Sounds like it's just too few objects, especially if they're simple one-actor-with-one-shape objects.

Multithreading has some overhead (thread management, etc.), so with small scenes it's quite possible (and normal) for a single-threaded setup to run faster.

Performance will also ultimately depend on what's happening in your scene. If you have 60 objects not colliding with each other and resting on a plane for example, they're probably all "sleeping" and the SDK has no work to do anyway. But if they're all moving and colliding in a pile, there's more contact generation to deal with and multiple threads might start to help here. But even in this case 60 objects is probably not enough.

Try with 600 or even 6000 objects instead of 60, and you should see benefits from the multi-threaded setup...
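To get a feel for the overhead argument outside of PhysX, a generic timing sketch like the one below can help. It is an assumption-laden illustration, not a model of the SDK: `TimedSum` and `timedParallelSum` are hypothetical names, and summing integers stands in for per-object simulation work. It splits the items across a given number of threads and reports how long the whole step took, including thread creation and joining.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical result type: the computed sum plus the wall-clock time spent.
struct TimedSum {
    std::uint64_t sum;
    double microseconds;
};

// Sums `items` using `threadCount` threads and times the whole operation,
// thread start-up and join included.
TimedSum timedParallelSum(const std::vector<std::uint32_t>& items,
                          std::size_t threadCount) {
    assert(threadCount > 0);
    const auto start = std::chrono::steady_clock::now();

    std::vector<std::uint64_t> partial(threadCount, 0);
    std::vector<std::thread> workers;
    workers.reserve(threadCount);

    const std::size_t chunk = (items.size() + threadCount - 1) / threadCount;
    for (std::size_t t = 0; t < threadCount; ++t) {
        // Threads past the end of the data get an empty slice.
        const std::size_t begin = std::min(t * chunk, items.size());
        const std::size_t end = std::min(begin + chunk, items.size());
        workers.emplace_back([&items, &partial, t, begin, end] {
            std::uint64_t local = 0;
            for (std::size_t i = begin; i < end; ++i) local += items[i];
            partial[t] = local;  // each thread writes only its own slot
        });
    }
    for (std::thread& w : workers) w.join();

    const std::uint64_t sum =
        std::accumulate(partial.begin(), partial.end(), std::uint64_t{0});
    const std::chrono::duration<double, std::micro> elapsed =
        std::chrono::steady_clock::now() - start;
    return {sum, elapsed.count()};
}
```

Calling it with 60 items and then with several million, at 1 and 16 threads each, and comparing `microseconds` shows on a given machine roughly where the extra threads start to pay for themselves. Timings vary by hardware, so measure rather than assume.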

PierreTerdiman commented 3 years ago

> I don't have a sense if this is "a lot" of objects

It's not. Again, ultimately how many objects you can deal with depends on how complex each object is, but if we're talking about simple rigid bodies here's an old video of mine that could give you a better idea of how many objects PhysX can deal with: https://www.youtube.com/watch?v=6dATi4-wb3o

And if we're talking GPU rigid bodies the limit is a lot higher. See this video from 2018 (we could probably do more today): https://www.youtube.com/watch?v=TAlqJlIDef8

morphogencc commented 3 years ago

@PierreTerdiman thanks so much! It looks like the physics simulation isn't actually the bottleneck in my app, but this was all very helpful for me to better understand how the engine handles multithreading.

Is the primary advantage of PxTask that PhysX manages the threads for you? Is there any other reason I should prefer PxTask for specific tasks over spawning / joining my own threads?