Closed GoogleCodeExporter closed 9 years ago
Original comment by erwin.coumans
on 21 Dec 2009 at 11:00
The source code in Bullet/Extras/sph contains 3 files (radixsort*) with NVIDIA
copyright and no additional information that would make these files free (no
explicit rights to use/modify/distribute). IANAL but it creates a legal problem.
Original comment by d...@danny.cz
on 26 Jul 2010 at 11:18
The license should be fine, we'll commit the license files soon.
"License: Subject to the terms of this Agreement, NVIDIA hereby grants to
Developer a royalty-free, non-exclusive license to possess and to use the
Materials. Developer may install and use multiple copies of the Materials on a
shared computer or concurrently on different computers, and make multiple
back-up copies of the Materials, solely for Licensee’s use within
Licensee’s Enterprise. “Enterprise” shall mean individual use by Licensee
or any legal entity (such as a corporation or university) and the subsidiaries
it owns by more than 50 percent. The following terms apply to the specified
type of Material:
Source Code: Developer shall have the right to modify and create derivative
works with the Source Code. Developer shall own any derivative works
("Derivatives") it creates to the Source Code, provided that Developer uses the
Materials in accordance with the terms and conditions of this Agreement.
Developer may distribute the Derivatives, provided that all NVIDIA copyright
notices and trademarks are used properly and the Derivatives include the
following statement: "This software contains source code provided by NVIDIA
Corporation."
Also, if there is enough interest, we received another SPH implementation from
AMD, under the ZLib license.
Thanks for the feedback
Original comment by erwin.coumans
on 26 Jul 2010 at 3:29
Please add in a branch the SPH implementation from AMD!
Original comment by rtf...@gmail.com
on 2 Aug 2010 at 4:08
I'm continuing to make progress on this at
https://github.com/rtrius/Bullet-FLUIDS.
At this point, the fluid system has been completely refactored and
is more or less ready for direct integration into Bullet.
Using Bullet 2.x's architecture, the general design would seem to be:
btCollisionObject -> btFluidSph
btCollisionShape -> btFluidSphCollisionShape
btCollisionAlgorithm -> btFluidSphRigidCollisionAlgorithm
btCollisionWorld -> btFluidRigidDynamicsWorld
btCollisionConfiguration -> btFluidRigidCollisionConfiguration
However, there would be modularity issues with Fluid-Soft body
interaction(btSoftFluidRigid dynamics world/collision configuration?).
Would this design remain advisable under Bullet 3.x,
or should a different hierarchy be implemented?
Additionally, I am interested in importing the OpenCL infrastructure
from the repository at https://github.com/erwincoumans/experiments/.
In particular, I would like to import the radix sort(btRadixSort32CL).
Could you clarify which portions of the repository are under Zlib?
Some other questions:
-Why does Win32ThreadSupport use btAlignedFree() for
its delete functions, while PosixThreadSupport uses delete?
Are there any adverse effects to using btAlignedAlloc()/btAlignedFree() on Posix?
-Where should ATTRIBUTE_ALIGNED16() be used?
Thanks,
rtrius
Original comment by rtr...@gmail.com
on 12 Oct 2012 at 12:45
Thanks for the update. So this work would be a Bullet/Demos/FluidDemo, right?
It doesn't modify the Bullet/src sdk?
Do you have any videos of your work?
The radix sort, and all of the OpenCL work under the OpenCL folder in that repo
is under the zlib license.
Posix should also use btAlignedAlloc/btAlignedFree. Use ATTRIBUTE_ALIGNED16 whenever the
class/struct needs to be 16 byte aligned, for example because it contains a
SIMD vector (btVector3 etc).
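
The alignment rule above can be illustrated with a minimal sketch. It uses standard C++11 alignas(16) as a stand-in for ATTRIBUTE_ALIGNED16, and the types Vec4 and Particle and the helper isAligned are hypothetical names made up for this example, not part of Bullet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 16-byte-aligned SIMD vector such as btVector3.
// alignas(16) plays the role that ATTRIBUTE_ALIGNED16 plays in Bullet (assumption).
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// A struct that holds an aligned member takes on at least that alignment
// requirement; the compiler enforces it for static and automatic objects.
struct Particle {
    Vec4 position;
    float mass;
};

static_assert(alignof(Vec4) == 16, "Vec4 must be 16-byte aligned");
static_assert(alignof(Particle) == 16, "alignment carries over to the containing struct");
static_assert(sizeof(Particle) % 16 == 0, "size is padded to a multiple of the alignment");

// Runtime check that an address satisfies a given alignment.
inline bool isAligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Note that the language only guarantees this for objects the compiler places itself. Heap allocations made through a general-purpose allocator may not honor a 16-byte requirement on every platform, which is presumably why aligned allocation routines such as btAlignedAlloc/btAlignedFree matter for these types.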
Original comment by erwin.coumans
on 12 Oct 2012 at 1:02
Here are some more recent videos:
Screen space fluid rendering
http://youtu.be/UDbzwVNWkcI
Partial 2 way interaction with penalty forces
(still experimenting with impulses)
http://youtu.be/euQNnBcSlqQ
It currently does not modify Bullet/src as the integration is not
complete. For instance, there is a 'FluidSph' class, analogous to
BulletSoftBody's btSoftBody, that has yet to be connected to
btCollisionObject.
With your approval, I'll implement the classes above(btFluidSph, etc.)
and move the files in FluidDemo/Fluids to a 'BulletFluids' folder
that may be dropped into Bullet/src.
For the near future, though, I would prefer to keep the files in
FluidDemo/ as fewer premake files need to be modified.
Original comment by rtr...@gmail.com
on 12 Oct 2012 at 5:16
Hi all,
Nice progress on this. Over at the OpenWorm project we've been working on some SPH code that integrates with OpenCL and handles soft body interactions with liquids:
http://youtu.be/jv4OMukQNF0
This is the work of Andrey Palyanov who I pointed at this issue and who will be following it. He started this work prior to Bullet having much SPH in it, so currently there is no Bullet integration. I'm writing to you here to see if it makes sense to consider an integration.
Does this look like something worth pursuing?
Thanks,
Stephen
Original comment by stephen....@gmail.com
on 12 Oct 2012 at 7:19
Also, the implementation is up here:
https://github.com/openworm/Smoothed-Particle-Hydrodynamics
Original comment by stephen....@gmail.com
on 12 Oct 2012 at 7:20
Stephen:
Sorry for the late reply, but I'm not really familiar with Bullet's soft body implementation.
I'm unsure whether it would be sufficiently accurate for your simulation.
According to:
http://www.bulletphysics.org/Bullet/phpBB3/viewtopic.php?f=22&t=4084&p=15193&hilit=position+based+dynamics#p15193
Bullet's soft bodies are partially based on "Advanced Character Physics" by Thomas Jakobsen,
so an integration might not be straightforward.
Progress update:
I've decided to implement the design above anyway
(seems to be the best option without changing the architecture).
To give a summary on the current state of the project,
the main improvements/features(aside from a redesigned interface and
refactoring) are:
-Replacement of the explicit linked list grid with an implicit, statically sized grid that supports larger worlds.
(1024^3 or 2^21^3 cells; see comments in btFluidSortingGrid.h for details)
-Single threaded performance ~3.5x faster(~5x including stiffness increase)
Timings for grid update and SPH density/force calculation
with 16384 particles, single threaded, on a Phenom II x4 2.8GHz, after:
~145ms -> ~95ms - using structure of arrays instead of array of structures
~95ms -> ~60ms - symmetric SPH density calculation
~60ms -> ~53ms - symmetric SPH force calculation
~53ms -> ~40ms - switching from 2r grid cells(2^3 queried) to r grid cells(3^3 queried)
~40ms -> ~30ms - increasing SPH stiffness(a single btScalar), which reduces the number of neighbor particles
-Multithreaded SPH force/density calculation, using parallel for
With 3 threads performance is doubled, leading to an overall gain of ~9x or ~17ms per SPH step
with 16384 particles, and ~22ms per internalSingleStepSimulation(), including collisions
(~6ms for a single large box rigid body and AABB boundary, single-threaded) and integration.
-OpenCL accelerated solver(SPH force and density is calculated on GPU, rest on CPU)
-Cannot perform all steps on GPU as the data transfer takes too long, and
it is necessary to keep data on the CPU for collisions.
-OpenCL grid update does not begin to outperform the CPU version until ~32768 particles or more are used,
and the radix sort would have to be extended to 64 bits to allow 2^21^3 cells.
16384 particles, Radeon 5850(~8.5 ms total):
~3.7 ms CPU grid update
~1.6 ms send position, velocity, and grid state to GPU
~2.3 ms SPH force calculation
~0.9 ms read SPH force from GPU
65536 particles, Radeon 5850(~29 ms total):
~16 ms CPU grid update (GPU update takes ~9ms, most time is spent reordering the data on both CPU and GPU)
~3 ms send position, velocity, and grid state to GPU
~8 ms SPH force calculation
~2 ms read SPH force from GPU
-2 way interaction with btCollisionObject/btRigidBody
-Fluid particles are treated as rigid bodies, using btSphereShape.
-Raycasting is used for CCD(only prevents fluid particles from tunneling through rigids;
fast moving rigids can tunnel through fluid).
-Collision response may use either:
Penalty forces, or
Impulses (impulse to remove normal velocity + penalty force for penetration)
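
As an illustration of the "symmetric SPH density calculation" and "structure of arrays" items in the list above, here is a minimal sketch. It assumes equal particle masses and the poly6 smoothing kernel (a common SPH choice, not confirmed to be the one used in this project), and it loops over all pairs instead of using a grid. The names Particles, poly6, computeDensitySymmetric, and computeDensityNaive are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Structure-of-arrays storage: one contiguous array per attribute (hypothetical layout).
struct Particles {
    std::vector<float> x, y, z;
    std::vector<float> density;
};

// Poly6 kernel evaluated from the squared distance r2 and smoothing radius h (assumed kernel).
inline float poly6(float r2, float h) {
    const float h2 = h * h;
    if (r2 >= h2) return 0.0f;
    const float pi = 3.14159265358979f;
    const float d = h2 - r2;
    return 315.0f / (64.0f * pi * std::pow(h, 9.0f)) * d * d * d;
}

inline float distanceSquared(const Particles& p, std::size_t i, std::size_t j) {
    const float dx = p.x[i] - p.x[j];
    const float dy = p.y[i] - p.y[j];
    const float dz = p.z[i] - p.z[j];
    return dx * dx + dy * dy + dz * dz;
}

// Symmetric version: each pair (i, j) is visited once, and the kernel value
// is added to both particles, roughly halving the number of kernel evaluations.
void computeDensitySymmetric(Particles& p, float mass, float h) {
    const std::size_t n = p.x.size();
    p.density.assign(n, mass * poly6(0.0f, h));  // self contribution
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            const float w = mass * poly6(distanceSquared(p, i, j), h);
            p.density[i] += w;
            p.density[j] += w;
        }
    }
}

// Reference version: every particle sums over all particles, including itself.
void computeDensityNaive(Particles& p, float mass, float h) {
    const std::size_t n = p.x.size();
    p.density.assign(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            p.density[i] += mass * poly6(distanceSquared(p, i, j), h);
}
```

In a grid-based solver the inner loop would only visit particles in neighboring cells, but the idea of writing each pair's contribution to both particles is the same. With multiple threads, the write to density[j] needs care, since two threads may update the same particle.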
Known issues/limitations:
-Although using raycasting for CCD is sufficient to prevent particles from tunneling
through static triangle meshes, moving trimeshes(using Gimpact) easily tunnel through particles.
(Not a major issue, could convert trimeshes using convex decomposition.)
-Due to some optimizations, undefined behavior occurs if particles
or rigid bodies reach the edges of the grid. (Should not be a problem with 2^21^3 cells.)
-Sleeping is not easily implemented; jitter from collisions with rigid bodies prevents
particles from falling below a reasonable velocity threshold. Stacking the fluid too high
also causes the SPH forces to become unstable.
-Although impulses are better than penalty forces overall, impulses suffer from jitter
even if the fluid is not stacked very high. This occurs regardless of whether projection
or penalty forces are used to remove penetration. Penalty forces also have some jitter,
but it is much less noticeable.
The jitter from impulses is reduced if:
-The amount of penetration removed per frame is reduced, which causes penetration issues, or
-The quality of contacts is improved, by using e.g. the btSphereBoxCollisionAlgorithm
instead of the btConvexConvexAlgorithm, but this does not resolve the issue completely.
-When 4 threads are used with the multithreaded solver, performance begins to decrease/fluctuate.
Some tests with Intel TBB showed similar max speedup(as fast as 3 threads), so it is unlikely
to be related to the parallel for implementation. This issue may be processor/machine specific,
as using 4 threads for position/velocity integration also has the same problem.
Scaling with 16384 particles(Phenom II x4 2.8GHz):
~30ms with 1 thread
~20ms with 2 threads
~17ms with 3 threads
~15-26ms with 4 threads
Scaling with 65536 particles(Phenom II x4 2.8GHz):
~160ms with 1 thread
~100ms with 2 threads
~77ms with 3 threads
~70-100ms with 4 threads
There are also some improvements to the HeightfieldFluidDemo, but it is still
very experimental:
-Patched for bullet-2.81-rev2613.
-Extend btFluidHfBuoyantConvexShape to collide with other btCollisionObject/btRigidBody.
-Solver improvements(fixed 'fill pool' demo).
-Use textures for rendering, add option to render fluid as columns.
Note also that the heightfield fluid system is not tightly coupled to the SPH system, so it is not necessary
to wait for its completion before importing the SPH system(simply ignore files containing 'FluidHf').
The major issues that would have to be resolved(work in progress)
before the SPH system could be considered as a 'release candidate' are:
-Faster sphere collision algorithms
(especially trimesh/heightfield, which are very slow; ~20ms for 4096 spheres), and
-Improved impulses(current implementation does not reduce both penetration and jitter to acceptable levels).
Thanks,
rtrius
Original comment by rtr...@gmail.com
on 22 Jan 2013 at 3:12
Hello Erwin,
Some recent improvements are:
-Fixed most of the penetration and jitter issues by adding a margin of allowed
penetration (appears similar to the 2 videos posted above), but there is still
minor jitter when colliding with floating rigid bodies, as well as when there
are multiple contacts on a single SPH particle.
-Fixed multithreaded scaling(was actually caused by the parallel for).
With 4 threads, the result on the same processor as above is a stable ~14ms for 16384 particles
and ~65ms for 65536 particles per SPH density/force step.
-Added a simplified sphere-heightfield algorithm, which is about 5x faster(~4ms for 4096 particles)
in the SPH-heightfield demo(not enabled by default, currently only supports Y+ == up).
-Added some optimizations for the OpenCL solver when the grid is updated on the GPU,
with a reduction of ~29ms -> ~12ms per SPH density/force step for 65536 particles on
the same hardware.
Questions:
1) I have some fixes for various minor issues I've encountered; the patches would be unrelated
to each other, but would only be a few lines long. Should I post them here,
or as a single new issue, or as several new issues?
2) Likewise, how should I handle other improvements that are not directly related to the SPH system?
(such as a unique() function for btAlignedObjectArray, btParallelFor, or sphere collision algorithms)
3) The uniform grid uses 'long long int' in order to support 2^21^3 cells.
Is this acceptable from a cross platform perspective?
4) Should btCollisionObject.m_collisionFlags be set in btFluidSph::btFluidSph()?
(btSoftBody::btSoftBody() does not set it either) Since it is currently not set,
'warning: static-static collision!' appears when running the demos in debug mode.
5) On CPU, it is more efficient to treat fluid particles as rigid bodies than converting rigids into particles.
When implementing on the GPU, is the collision detection fast enough to use the same method, or should rigids be
voxelized? Would the performance difference be worth the cost of maintaining a separate algorithm?
For rigid-rigid interaction, how does the OpenCL rigid body pipeline compare to using particles?
6) Are there any specialized or faster CPU collision algorithms for sphere-convex hull
and sphere-polyhedron that are worth looking into, or is GJK the best option?
7) I'm still confused by ATTRIBUTE_ALIGNED16. For instance, should it be:
-Used on any class containing a btVector3?
struct A //Use ATTRIBUTE_ALIGNED16 here?
{
btVector3 m_v;
};
-Used on any class that contains a class containing a btVector3?
struct B //Use ATTRIBUTE_ALIGNED16 here?
{
A m_a;
};
-btCollisionObject has ATTRIBUTE_ALIGNED16, while btRigidBody does not.
Does this mean that ATTRIBUTE_ALIGNED16 is inherited, or is it not used
on btRigidBody for another reason?
-Where would ATTRIBUTE_ALIGNED64 be used?
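
Regarding question 3, the arithmetic behind 2^21^3 cells can be sketched as follows: three 21-bit cell coordinates need 3 * 21 = 63 bits, which fits in a 64-bit integer. This is only an illustration; the bit layout, the names packCell/unpackCell/CellCoord, and the use of std::uint64_t from <cstdint> instead of 'long long int' are assumptions, and the actual scheme in btFluidSortingGrid.h may differ.

```cpp
#include <cassert>
#include <cstdint>

// 21 bits per axis gives 2^21 cells along each axis (hypothetical layout).
const int kBitsPerAxis = 21;
const std::uint64_t kAxisMask = (std::uint64_t(1) << kBitsPerAxis) - 1;

struct CellCoord {
    std::uint32_t x, y, z;
};

// Pack three cell coordinates into one 63-bit key: x in bits 0-20,
// y in bits 21-41, z in bits 42-62.
inline std::uint64_t packCell(std::uint64_t x, std::uint64_t y, std::uint64_t z) {
    assert(x <= kAxisMask && y <= kAxisMask && z <= kAxisMask);
    return x | (y << kBitsPerAxis) | (z << (2 * kBitsPerAxis));
}

// Recover the three coordinates from a packed key.
inline CellCoord unpackCell(std::uint64_t key) {
    CellCoord c;
    c.x = static_cast<std::uint32_t>(key & kAxisMask);
    c.y = static_cast<std::uint32_t>((key >> kBitsPerAxis) & kAxisMask);
    c.z = static_cast<std::uint32_t>((key >> (2 * kBitsPerAxis)) & kAxisMask);
    return c;
}
```

Because the largest key is 2^63 - 1, it also fits in a signed 64-bit type. C++11 guarantees that long long is at least 64 bits wide, while the fixed-width types in <cstdint> state the intended size explicitly.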
Thanks,
rtrius
Original comment by rtr...@gmail.com
on 26 Feb 2013 at 4:42
Moved to https://github.com/bulletphysics/bullet3/issues/100
Original comment by erwin.coumans
on 30 Mar 2014 at 6:16
Original issue reported on code.google.com by
erwin.coumans
on 26 Oct 2009 at 6:30