Multithreaded uses of bullet-xna in simultanious but separated worlds results in BulletXNA.LinearMath.PooledType corruption.

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1.  Start two threads that each create their own DiscreteDynamicsWorld
2.  Step the worlds Simultaniously on the separate threads repeatedly
3.  *unnecessary

What is the expected output? What do you see instead?

Expected: Worlds step happily.    

Result: BulletGlobals.PooledType<X> types get corrupted by one thread calling 
new T() and the other thread calling Free(T) making the T Get() call return a 
reference to null.

What version of the product are you using? On what operating system?
Latest dev (not stable version)

Please provide any additional information below.

An easy fix to this issue is to lock in PooledType<T> but the easiest fix isn't 
always the best option.   A lock in PooledType<T> would be very slow.    A 
better option is to more carefully select when PooledType<T> can and should be 
used and avoiding globals so the class data can be successfully encapsulated.

Original issue reported on code.google.com by tera...@gmail.com on 16 Aug 2013 at 7:32

GoogleCodeExporter commented 9 years ago

It might be helpful to think of the library in terms of the question;  What is 
the smallest unit of usefulness?  CollisionWorld?   DynamicsWorld?   
DiscreteDynamicsWorld?

Original comment by tera...@gmail.com on 16 Aug 2013 at 7:36

GoogleCodeExporter commented 9 years ago

Thanks for the comments. Unfortunately PooledType isn't the only issue 
regarding threading the bullet libraries . The best fix for PooledType would be 
a ThreadLocal / ThreadStatic pool system. I can add that (main reason it isn't 
there is that it's not available in the 360 .net libraries), but there are more 
fundamental issues in terms of the solver , particularly with adding / removing 
bodies to the world and the use of manifold points in persistent manifolds . 
If you have some example code I might be able to help a bit more.

Original comment by xexu...@gmail.com on 17 Aug 2013 at 5:28

GoogleCodeExporter commented 9 years ago

The other 'simple' thing you can try /do is just to change the pool allocators 
and frees so that they always create new objects. The GC in windows is a lot 
friendlier then the xbox one so you may get acceptable performance that way.

Original comment by xexu...@gmail.com on 19 Aug 2013 at 8:50

GoogleCodeExporter commented 9 years ago

Here's a patch that allows enabling/disabling of the pooling and/or locking.

Original comment by tera...@gmail.com on 20 Aug 2013 at 10:46

Attachments:

PooledType.cs.patch

GoogleCodeExporter commented 9 years ago

Just a note, even with that, it doesn't quite fix the issue.   
Dbvt.CollideTV(x,x,x) still suffers from the fact that it's called in multiple 
places using a static CollideTVStack property.

(with the patch, using the following statements from library consuming code)
BulletGlobals.VoronoiSimplexSolverPool.SetPoolingEnabled(false);
BulletGlobals.SubSimplexConvexCastPool.SetPoolingEnabled(false);
BulletGlobals.ManifoldPointPool.SetPoolingEnabled(false);
BulletGlobals.CastResultPool.SetPoolingEnabled(false);
BulletGlobals.SphereShapePool.SetPoolingEnabled(false);
BulletGlobals.DbvtNodePool.SetPoolingEnabled(false);
BulletGlobals.SingleRayCallbackPool.SetPoolingEnabled(false);
BulletGlobals.SubSimplexClosestResultPool.SetPoolingEnabled(false);
BulletGlobals.GjkPairDetectorPool.SetPoolingEnabled(false);
BulletGlobals.DbvtTreeColliderPool.SetPoolingEnabled(false);
BulletGlobals.SingleSweepCallbackPool.SetPoolingEnabled(false);
BulletGlobals.BroadphaseRayTesterPool.SetPoolingEnabled(false);
BulletGlobals.ClosestNotMeConvexResultCallbackPool.SetPoolingEnabled(false);
BulletGlobals.GjkEpaPenetrationDepthSolverPool.SetPoolingEnabled(false);
BulletGlobals.ContinuousConvexCollisionPool.SetPoolingEnabled(false);
BulletGlobals.DbvtStackDataBlockPool.SetPoolingEnabled(false);
BulletGlobals.BoxBoxCollisionAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.CompoundCollisionAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.ConvexConcaveCollisionAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.ConvexConvexAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.ConvexPlaneAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.SphereBoxCollisionAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.SphereSphereCollisionAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.SphereTriangleCollisionAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.GImpactCollisionAlgorithmPool.SetPoolingEnabled(false);
BulletGlobals.GjkEpaSolver2MinkowskiDiffPool.SetPoolingEnabled(false);
BulletGlobals.PersistentManifoldPool.SetPoolingEnabled(false);
BulletGlobals.ManifoldResultPool.SetPoolingEnabled(false);
BulletGlobals.GJKPool.SetPoolingEnabled(false);
BulletGlobals.GIM_ShapeRetrieverPool.SetPoolingEnabled(false);
BulletGlobals.TriangleShapePool.SetPoolingEnabled(false);
BulletGlobals.SphereTriangleDetectorPool.SetPoolingEnabled(false);
BulletGlobals.CompoundLeafCallbackPool.SetPoolingEnabled(false);
BulletGlobals.GjkConvexCastPool.SetPoolingEnabled(false);
BulletGlobals.LocalTriangleSphereCastCallbackPool.SetPoolingEnabled(false);
BulletGlobals.BridgeTriangleRaycastCallbackPool.SetPoolingEnabled(false);
BulletGlobals.BridgeTriangleConcaveRaycastCallbackPool.SetPoolingEnabled(false);
BulletGlobals.BridgeTriangleConvexcastCallbackPool.SetPoolingEnabled(false);
BulletGlobals.MyNodeOverlapCallbackPool.SetPoolingEnabled(false);
BulletGlobals.ClosestRayResultCallbackPool.SetPoolingEnabled(false);
BulletGlobals.DebugDrawcallbackPool.SetPoolingEnabled(false);

Original comment by tera...@gmail.com on 21 Aug 2013 at 12:19

GoogleCodeExporter commented 9 years ago

Looks like ConvexShape in ConvexConvexCollisionAlgorithm.ProcessCollision in 
min1 or min2 is null sometimes...  is the next thread related failure...

After that it's in BoxBoxDetector.GetClosestPoints, either box1 or box2 is null

After that it's in BulletCollision.UnionFind.Unite   m_elements.Count < I or 
m_elements.Count < j  so it generates an IndexOutOfRange exception.

After that it's in UnionFind.Find  Again, x > rawElements.Length so it throws 
an IndexOutOfRangeException

Seems like one pattern that fails is a static method that relies on a static 
object property.

Original comment by tera...@gmail.com on 21 Aug 2013 at 12:28

GoogleCodeExporter commented 9 years ago

Yes. there are a lot of places where thread safety fails, a lot of this comes 
from the original design goal where the structure of the code closesly follows 
the C++ version of bullet. Optionally pooling more of the static types would 
help but there are further stabililty issues, for example querying a bvhTree in 
one thread whilst modifying it (adding, removing objects to the world) in 
another.
I'll look at splitting off the changes you've mentioned to a threaded branch of 
the code.

Thanks.

Original comment by xexu...@gmail.com on 22 Aug 2013 at 11:43

GoogleCodeExporter commented 9 years ago

"for example querying a bvhTree in one thread whilst modifying it ".    

Yes.  The way that I generally assume physics libraries work is..    sectioned 
off by world.   This isn't always the case..  and, the normal use case...   the 
world is the smallest unit of usefulness in a physics library (although that 
isn't always the case, some people use the collision detection algorithms only, 
for example).    Is there considerable caching/alloc/GC benefit to static 
globals in the 'multiple running world' scenario or would it be the same as 
having a set of 'per-world' instances for that?      BTW, I know it would make 
the code easier to /write/ with BulletGlobals.Static PooledType<T> because it's 
able to be referenced anywhere, even where it wasn't planned.    I guess I just 
see the single world use-case has one world where the set of globals applies 
only to that world.    The multiple world use case requires multiple instances 
of those per world.

Disclaimer:  A friend of mine wrote a plug-in to a physics simulation that he 
named 'bulletsim' that uses the c++ version of Bullet via pInvoke and his own 
c++ stuff to static-ize the library so pInvoke works.   He and I conspired to 
build it in an API format so that the same simulation could be run in bullet 
and bullet-xna... without changing any of the consuming code...     and, it 
works, so congrats on that.    Two things, bullet-xna is much slower(to be 
expected) and has some challenges with threading that the c++ bullet is OK with 
like multiple worlds with a dedicated thread per world running at the same time.

Original comment by tera...@gmail.com on 24 Aug 2013 at 4:23

GoogleCodeExporter commented 9 years ago

FYI and proof to above statement :)

bulletsim's bullet-xna API implementation

http://opensimulator.org/viewgit/?a=viewblob&p=opensim&h=2a820bee164706d341bda32
2a381976dee27e6d6&hb=c34e6f25b1c8df14d62c102283efbf8ea263805c&f=OpenSim/Region/P
hysics/BulletSPlugin/BSAPIXNA.cs

bulletsim's bullet API implementation

http://opensimulator.org/viewgit/?a=viewblob&p=opensim&h=12a0c1717c75d33abb12a5a
ed03f8eb750c39aca&hb=c34e6f25b1c8df14d62c102283efbf8ea263805c&f=OpenSim/Region/P
hysics/BulletSPlugin/BSAPIUnman.cs

Original comment by tera...@gmail.com on 24 Aug 2013 at 4:29

GoogleCodeExporter commented 9 years ago

Thanks :)
It's a bit depressing the performance difference between the c++ and c# 
versions. I've done a lot of profiling work on bullet-xna and removed the worse 
cases but it's still probably an order of magnitude out. Bullet itself is 
pretty well optimised which makes some of the comparisons even worse :)  Still 
this is a fully managed version and it fit my original goal of getting bullet 
running on the 360. It also allowed me to do things like custom callbacks and 
made debugging games, demos a lot easier.
Not sure if you've come across it, but Andre's BulletSharp 
(https://code.google.com/p/bulletsharp/) takes a similar approach to your 
friends version in provideing a c++ binding.

WRT Multiple Physics World, as you;ve probably guessed it's not something I've 
tried but it would be an interesting test case to help minimise dependencies. 
I'll create a demo for it. What i've tended to do is just have one large 
physics world and add/remove objects from it as the player enters sections, or 
force objects into sleeping states to minimise the amount of work done.

Original comment by xexu...@gmail.com on 24 Aug 2013 at 3:18

GoogleCodeExporter commented 9 years ago

Better debugging, That's one of the reasons I was gunning for a managed bullet 
:)   If you've ever tried to debug uses of pInvoke into c++...    you know what 
I mean :)

Original comment by tera...@gmail.com on 24 Aug 2013 at 10:01

GoogleCodeExporter commented 9 years ago

Not sure if you;re interested but i've made a few changes and created a 
MultiThreaded branch . The main changes are that you can have multiple 
collision worlds, and each can have it's own independent set of pooled objects, 
the reference to this has been added off the IDispatcher interface as that was 
the most visible. I've also gone through and removed various static data 
structures (mainly in dbvt , axissweep and boxboxcollider) so they use pooled 
types of other appropriate value types.
Theres a new MultiThreadedWorld demo which has an example of running two worlds 
at once.
If you do give it a try can you let me know of any other threading issues you 
come across.

thanks.

Original comment by xexu...@gmail.com on 29 Aug 2013 at 7:03

GoogleCodeExporter commented 9 years ago

Sounds good thanks, 
I'll give it a try.  (back from a long trip)

Original comment by tera...@gmail.com on 18 Sep 2013 at 3:22

iamcsharper / bullet-xna

Multithreaded uses of bullet-xna in simultanious but separated worlds results in BulletXNA.LinearMath.PooledType corruption. #20