jrouwe / JoltPhysics

A multi core friendly rigid body physics and collision detection library. Written in C++. Suitable for games and VR applications. Used by Horizon Forbidden West.
MIT License
6.42k stars 414 forks source link

Slow HeightfieldShape Save/RestoreBinaryState #1080

Closed attilaz closed 4 months ago

attilaz commented 4 months ago

We have a 8192x8192 heightfield. RestoreBinaryState takes 1231ms on a Galaxy A51 phone.

Most of the time spent with reading vector elements (mRangeBlocks, mHeightSamples, mActiveEdges)

https://github.com/jrouwe/JoltPhysics/blob/40dc67068b853761baf262cc47e3cf567e2118e9/Jolt/Core/StreamIn.h#L44

I have modified the StreamIn vector reader to read the whole vector data with one call. This takes 114ms on the same phone.

I am not sure about is_trivial_v maybe it should be is_trivially_copyable?


template <class T, class A>
void                Read(std::vector<T, A>& outT)
{
    typename Array<T>::size_type len = outT.size(); // Initialize to previous array size, this is used for validation in the StateRecorder class
    Read(len);
    if (!IsEOF() && !IsFailed())
    {
        outT.resize(len);

        if (std::is_trivial_v<T>)
        {
            if (len > 0)
                ReadBytes(&outT[0], len * sizeof(outT[0]));
        }
        else
        {
            for (typename Array<T>::size_type i = 0; i < len; ++i)
                Read(outT[i]);
        }
        }
    else
        outT.clear();
    }

Our deserialization code looks like this. The ReadBytes is a virtual function which gets called with small sizes.

struct MyStreamInWrapper : public StreamIn
{
    MyStreamInWrapper(const uint8_t* start, const uint8_t* end) : _rover(start), _end(end)
    {
    }

    virtual void        ReadBytes(void* outData, size_t inNumBytes)
    {
        _eof = (_rover == _end);
        int copy = std::min(inNumBytes, (size_t)(_end - _rover));
        memcpy(outData, _rover, copy);
        _rover += copy;
    }

    /// Returns true when an attempt has been made to read past the end of the file
    virtual bool        IsEOF() const {
        return _eof;
            }

    /// Returns true if there was an IO failure
    virtual bool        IsFailed() const { return false; }

    const uint8_t* _rover;
    const uint8_t* _end;
    bool _eof = false;
        };

MyStreamInWrapper stream_in((uint8_t*)f->getData(), (uint8_t*)f->getData() + f->getDataSize());

// Load the shape
// If you have assigned custom ID's on save, you need to ensure that the shapes exist in this map on restore too.
JPH::Shape::ShapeResult result;

JPH::Shape::IDToShapeMap id_to_shape;
JPH::Shape::IDToMaterialMap id_to_material;
result = JPH::Shape::sRestoreWithChildren(stream_in, id_to_shape, id_to_material);

if (result.IsValid())
{
    restored_shape = result.Get();
}
jrouwe commented 4 months ago

That's a good find!

Unfortunately, the suggested solution won't work as e.g. Vec3 is is_trivial and is_trivially_copyable but it requires calling the custom Read(Vec3 &) function to avoid writing garbage for the unused W component.

Let me cook something up.

jrouwe commented 4 months ago

Should be fixed now, thanks for reporting this!