qmuntal / gltf

Go library for encoding glTF 2.0 files
https://www.khronos.org/gltf/
BSD 2-Clause "Simplified" License
242 stars 32 forks source link

Buffer Data memory usage #50

Closed zvdm closed 2 years ago

zvdm commented 2 years ago

Hi, We've found one more interesting behavior of the glTF document unmarshaling. In my project, I have to hold very big unmarshalled glTF in the gltf.Document struct in the memory. After uploading glTF to the gltf.Document structs the application consumed quite a lot of the memory.

Screenshot 2022-02-02 at 13 57 37

I've profiled this case and found the reason. During handling glTF the decoder (decode.go) calls the following function for each buffer

for i := externalBufferIndex; i < len(doc.Buffers); i++ {
    if err := d.decodeBuffer(doc.Buffers[i]); err != nil {
        return err
    }
}

In the function decodeBuffer we can see the following lines of code

} else if err = validateBufferURI(buffer.URI); err == nil {
    buffer.Data = make([]byte, buffer.ByteLength)
    err = d.ReadHandler.ReadFullResource(buffer.URI, buffer.Data)
}

which allocate a block of memory (a bytes array) for each asset in the glTF's buffers according to buffer.ByteLength and pass this bytes array to the ReadFullResource function. I have pretty huge glTF documents with many assets (in summary about a few gigabytes), which are stored in the cloud - that's why I skip the execution of the ReadFullResource function overriding it via gltf.NewDecoder(...).WithReadHandler(CustomReadHandler{}), the custom ReadFullResource function returns nil. I've wanted to free the memory allocated for buffer.Data in the custom ReadFullResource function but the bytes array (from buffer.Data) is copied to the ReadFullResource function (created a new variable which refers to the same block of memory). That's why in my case after executing the line of code

err = d.ReadHandler.ReadFullResource(buffer.URI, buffer.Data)

the buffer.Data has the non-empty value (actually all nulls) with full length and capacity like the appropriate buffer.ByteLength. Also, I've written a test case to show the current described behavior.

type mockSkipReadHandler struct{}

func (m mockSkipReadHandler) ReadFullResource(_ string, _ []byte) error {
    return nil
}

func TestDecoder_decodeBuffer_SkipReader(t *testing.T) {
    type args struct {
        buffer *Buffer
    }
    tests := []struct {
        name    string
        d       *Decoder
        args    args
        want    []byte
        wantErr bool
        length  int
    }{
        {"skipReader", NewDecoder(nil).WithReadHandler(&mockSkipReadHandler{}), args{&Buffer{ByteLength: 10000, URI: "a.bin"}}, make([]byte, 10000), false, 10000},
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            if err := tt.d.decodeBuffer(tt.args.buffer); err != nil {
                t.Errorf("[SKIP-test] Decoder.decodeBuffer() in error = %v, wantErr %v", err, tt.wantErr)
                return
            }
            if !reflect.DeepEqual(tt.args.buffer.Data, tt.want) {
                t.Errorf("[SKIP-test] Decoder.decodeBuffer() buffer = %v, want %v", string(tt.args.buffer.Data), tt.want)
            }
            if len(tt.args.buffer.Data) != tt.length {
                t.Errorf("[SKIP-test] Decoder.decodeBuffer() length = %v, want %v", len(tt.args.buffer.Data), tt.length)
            }
            if cap(tt.args.buffer.Data) != tt.length {
                t.Errorf("[SKIP-test] Decoder.decodeBuffer() capacity = %v, want %v", cap(tt.args.buffer.Data), tt.length)
            }
        })
    }
}

If I comment these two lines of code (lines 181 and 182 in decode.go) to skip memory allocating to the buffer.Data variable and calling of the ReadFullResource function I get impressive memory results

Screenshot 2022-02-02 at 13 57 59

Could I suggest a little change so anyone who has a problem like me can solve it via the custom ReadFullResource function? Maybe pass the buffer.Data variable by the reference (like err = d.ReadHandler.ReadFullResource(buffer.URI, &buffer.Data) in decode.go) to have an availability to free the memory (like *data = nil in the custom ReadFullResource function)? If you agree, could I create the appropriate PR?

Thank you, in advance!

qmuntal commented 2 years ago

Thanks for the detailed explanation @zvdm. Your use case makes all sense and gltf should support it. In fact I've been playing with the fs.FS interface lately and I think that the changes I've done will cover what you want to do.

Can you try using https://github.com/qmuntal/gltf/tree/fs and report back? Notice that WithReadHandler has been replaced by NewDecoderFS and that the ReadHandler has been replaced by a plain fs.FS.

The external buffer docking now looks like: https://github.com/qmuntal/gltf/blob/61a7acc5e0f812fa5be47fbb6a2b1b835f807764/decoder.go#L165-L168

With this change one could craft an fs.FS which implements the fs.ReadFileFS and returns an empty byte slice for the desired external assets.

zvdm commented 2 years ago

Hi @qmuntal, thanks a lot for your advice! It's working. If we create a custom overridden fs.Fs like

type CustomFS struct{}

func (c CustomFS) Open(_ string) (fs.File, error) {
    return nil, nil
}

func (c CustomFS) ReadFile(_ string) ([]byte, error) {
    return nil, nil
}

we can use it while creating a new decoder

decoder := gltf.NewDecoderFS(bytes.NewReader(buff.Bytes()), CustomFS{})

and all is great. That's very cool! I guess this issue can be closed, thank you one more time.