Goshido / android-vulkan

This repository is a project for learning Vulkan API, constraint based 3D physics, Lua scripting, spatial sound rendering, HTML+CSS UI rendering.

Stable box stacking #19

Closed Goshido closed 2 years ago

Goshido commented 3 years ago

A stable velocity solver for box stacking needs to be developed. The current implementation is unstable in the following configuration:

video link

Goshido commented 3 years ago

Some improvements have been implemented which increased simulation stability:

The video with results.

Goshido commented 3 years ago

It was found that solving the 4 contacts of a manifold simultaneously requires a lot of 4x4 matrix inversions. There is an article describing SSE optimizations for this operation: Fast 4x4 Matrix Inverse with SSE SIMD, Explained.

After analyzing the article and porting the code to Neon A64 I got the following results:

D/android_vulkan::C++: Results:
    >>>
        Matrices:            40000000 items
        Matrices data:        2.38419 Gb
        Classical:         2234873462 nanoseconds
        Neon:               867129923 nanoseconds
    <<<

D/android_vulkan::C++: Results:
    >>>
        Matrices:            40000000 items
        Matrices data:        2.38419 Gb
        Classical:         2243513539 nanoseconds
        Neon:               866007154 nanoseconds
    <<<

D/android_vulkan::C++: Results:
    >>>
        Matrices:            40000000 items
        Matrices data:        2.38419 Gb
        Classical:         2245197154 nanoseconds
        Neon:               874417000 nanoseconds
    <<<

D/android_vulkan::C++: Results:
    >>>
        Matrices:            40000000 items
        Matrices data:        2.38419 Gb
        Classical:         2214903462 nanoseconds
        Neon:               882053230 nanoseconds
    <<<

Benchmark infrastructure is in the commit 76ffcb7f7d9b76bf18ee7510b3239a47a33ce816.

Conclusions

Long story short: about a 2.5x speedup compared to the classical implementation in a release build.
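The ratio can be checked directly from the four release runs above. The following is only a small sketch: the constant and function names are made up for illustration and are not part of the benchmark infrastructure in the commit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <numeric>

// Timings in nanoseconds, copied from the four release runs in the logs above.
constexpr std::array<std::uint64_t, 4U> CLASSICAL_NS = { 2234873462U, 2243513539U, 2245197154U, 2214903462U };
constexpr std::array<std::uint64_t, 4U> NEON_NS = { 867129923U, 866007154U, 874417000U, 882053230U };

// Hypothetical helper: total classical time divided by total Neon time.
// Both sets have the same number of runs, so this equals the ratio of the mean timings.
double ComputeSpeedup () noexcept
{
    auto const sum = [] ( std::array<std::uint64_t, 4U> const &runs ) noexcept -> double {
        return static_cast<double> ( std::accumulate ( runs.cbegin (), runs.cend (), std::uint64_t { 0U } ) );
    };

    return sum ( CLASSICAL_NS ) / sum ( NEON_NS );
}
```

Over these four runs the ratio comes out slightly above 2.5.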

About debug

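In a debug build the picture is reversed: the Neon path is about 7.9 times slower than the classical one. I got these results: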

D/android_vulkan::C++: Results:
    >>>
        Matrices:            40000000 items
        Matrices data:        2.38419 Gb
        Classical:         5702306461 nanoseconds
        Neon:             45079743080 nanoseconds
    <<<

In the disassembly I saw a lot of stores from vector registers to the stack. Basically the compiler inserts them after every intrinsic operation, which slows execution down a lot. On the bright side, it makes the register values visible in the debugger.

Goshido commented 3 years ago

This was an attempt to implement a new velocity solver for a 4-contact manifold. The main idea is to solve the normal impulses for all 4 contacts of the manifold in one iteration. The Sequential Impulse approach can be rewritten in matrix form, which allows solving several constraints in one shot.
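One common way to write such a block solve is sketched below. The notation is an assumption, not taken from the project: $J$ stacks the four normal-constraint Jacobian rows, $M$ is the combined mass matrix of the two bodies, $\mathbf{v}$ is their stacked velocity, $\mathbf{b}$ is the bias vector and $\boldsymbol{\lambda}$ holds the four normal impulses. The non-negativity condition on the normal impulses is left out for brevity.

```latex
\begin{aligned}
A &= J M^{-1} J^{\mathsf T} \in \mathbb{R}^{4 \times 4}, \\
A \, \boldsymbol{\lambda} &= -\left( J \mathbf{v} + \mathbf{b} \right), \\
\boldsymbol{\lambda} &= -A^{-1} \left( J \mathbf{v} + \mathbf{b} \right), \\
\mathbf{v}' &= \mathbf{v} + M^{-1} J^{\mathsf T} \boldsymbol{\lambda}.
\end{aligned}
```

Substituting the last line into $J \mathbf{v}' + \mathbf{b}$ gives zero, so all four constraints are satisfied at once, provided that $A$ can actually be inverted.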

Implementation

This approach requires inverting a 4x4 matrix. A scene with three rigid bodies was created:

The video with the simulation result is here.

For demonstration purposes the physics time was slowed down: it was scaled by a factor of 0.03.

Investigation

As you can see in the video, the solver is unstable. The reason is the structure of the effective mass matrix.

Direct effective mass matrix:

x
1.52754235 0.292016238 -1.01274192 0.222784221
0.292016238 1.52754235 0.222784221 -1.01274192
-1.01274192 0.222784221 1.52754235 0.292016238
0.222784221 -1.01274192 0.292016238 1.52754235

Inverse effective mass matrix:

x
1740406.75 -1740406.75 1740406.63 -1740406.75
-1740406.75 1740406.75 -1740406.75 1740406.63
1740406.63 -1740406.75 1740406.75 -1740406.75
-1740406.75 1740406.63 -1740406.75 1740406.75

So the inverse matrix looks very suspicious. Indeed, according to this resource the rank of the direct matrix is very close to 3 instead of 4. Just look at the last row of the row echelon form below: its only non-zero entry is about -1.16e-7.

x
1.52754235 0.292016238 -1.01274192 0.222784221
0 1.4717183767689647727 0.41638741998384287675 -1.0553309913289753295
0 0 0.73829999241024158782 0.73830005041024226845
0 0 0 -0.00000011600000591764

This basically means that the direct effective mass matrix is numerically singular and can't be inverted reliably. A similar topic was also found on the Bullet forum, where Dirk Gregorius mentioned that the 4-point manifold system is already linearly dependent.
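The near-singularity can be confirmed without any external tool. The alternating signs in the inverse matrix hint at the direction (1, -1, 1, -1). Multiplying the direct matrix by that vector gives components of roughly 3e-8 in magnitude, while an ordinary vector such as (1, 1, 1, 1) gives components of order 1. A minimal sketch, with hypothetical names, using the matrix values printed above:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec4 = std::array<double, 4U>;
using Mat4 = std::array<Vec4, 4U>;

// Direct effective mass matrix, copied from the values printed above.
constexpr Mat4 EFFECTIVE_MASS = {
    Vec4 { 1.52754235, 0.292016238, -1.01274192, 0.222784221 },
    Vec4 { 0.292016238, 1.52754235, 0.222784221, -1.01274192 },
    Vec4 { -1.01274192, 0.222784221, 1.52754235, 0.292016238 },
    Vec4 { 0.222784221, -1.01274192, 0.292016238, 1.52754235 }
};

// Hypothetical helper: largest absolute component of the product m * v.
// A value near zero for a non-zero v means v is (almost) in the null space of m.
double MaxAbsProduct ( Mat4 const &m, Vec4 const &v ) noexcept
{
    double result = 0.0;

    for ( std::size_t row = 0U; row < 4U; ++row )
    {
        double dot = 0.0;

        for ( std::size_t col = 0U; col < 4U; ++col )
            dot += m[ row ][ col ] * v[ col ];

        result = std::max ( result, std::abs ( dot ) );
    }

    return result;
}
```

Calling `MaxAbsProduct ( EFFECTIVE_MASS, Vec4 { 1.0, -1.0, 1.0, -1.0 } )` returns a value far below single-precision resolution for entries of this size, which is consistent with the rank being effectively 3.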

Conclusion

The direct method can't be used to improve box stacking stability, because the system has special properties (linearly dependent contacts) which are incompatible with the algorithm.

Goshido commented 2 years ago

It was found that the Baumgarte term depends on the angular and linear velocities of the bodies. Those velocities are updated in each iteration of the Sequential Impulse solver, but the previous implementation evaluated the Baumgarte term only once.

It was decided to update the Baumgarte term in every iteration and check the solver quality.

The following scene was created for that purpose:

The video of the old approach is here. The video of the new approach is here.

Conclusion

As you can see, the new approach is slightly more stable.

Goshido commented 2 years ago

It was decided to compare the current implementation with the Bullet v3.20 engine. The test scene with a stack of 6 boxes was recreated in Bullet. All masses and shapes are the same. The simulation runs at 60 Hz. The SI solver uses 7 iterations. The other solver constants were tuned to match the android_vulkan project. The sleeping policy was disabled.

The video with comparison is here.

Conclusion

As you can see, the Bullet implementation is not stable with the same configuration. Bullet achieves stable results after the sleeping policy is enabled.

Discovery

It was noticed that Bullet uses a different order in its velocity solver. Bullet solves the normal impulses for all contacts first and only then solves the friction impulses. The same approach has been implemented in the android_vulkan project, which made the simulation more stable.
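The ordering can be sketched as two separate passes per solver iteration. This is only an illustration of the order: `Contact`, `Manifold`, `ContactSolver` and `SolveVelocityIteration` are hypothetical names, not the android_vulkan or Bullet API, and the actual impulse math is hidden behind the two callbacks.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical minimal types, just enough to show the solve order.
struct Contact
{
    int _id;
};

struct Manifold
{
    std::vector<Contact> _contacts;
};

using ContactSolver = std::function<void ( Contact & )>;

// One velocity solver iteration: every normal impulse is solved before any friction impulse.
void SolveVelocityIteration ( std::vector<Manifold> &manifolds,
    ContactSolver const &solveNormal,
    ContactSolver const &solveFriction
)
{
    // Pass 1: normal impulses for all contacts of all manifolds.
    for ( Manifold &manifold : manifolds )
    {
        for ( Contact &contact : manifold._contacts )
        {
            solveNormal ( contact );
        }
    }

    // Pass 2: friction impulses, which now see the velocities produced by pass 1.
    for ( Manifold &manifold : manifolds )
    {
        for ( Contact &contact : manifold._contacts )
        {
            solveFriction ( contact );
        }
    }
}
```

The alternative order interleaves the two steps, solving normal and friction impulses contact by contact inside a single loop.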

Goshido commented 2 years ago

It was decided to try a new idea for the velocity solver to increase stability. The accumulated velocity change from the normal impulses is calculated for the whole manifold. After that the accumulated velocity change is applied to the bodies of the manifold. This calculation is repeated for every manifold.

The next step is the friction velocity change. The same technique was tried there. Unfortunately, the trick with the total impulse calculation does not work for friction: the simulation becomes very unstable, and there is no explanation for this so far. So friction sticks with the traditional velocity updating.

As a result, this approach produces a more stable simulation than the Bullet engine with the same solver parameters and the same scene. Note that the sleeping policy is disabled.

The video with comparison is here.

Sleeping policy

The sleeping policy has been added to the android-vulkan project. The RigidBody API has been slightly changed. The quality of the simulation is acceptable. A video showing the result is here.

Conclusion

The research has been successfully finished. The results have been merged into the master development branch.