cosmos / iavl

Merkleized IAVL+ Tree implementation in Go
Apache License 2.0
409 stars 256 forks source link

Change IAVL node.writeBytes to take in bytes.Buffer #891

Closed ValarDragon closed 3 months ago

ValarDragon commented 4 months ago

Every usage of node.WriteBytes we use takes in an io.Writer, not a bytes.Buffer.

We currently are spending 2.1 seconds out of a 2000 block sync on getting and returning buffer pools in node.WriteBytes. (Due to PutVarint calls) I think we should be able to remove this time entirely, by making PutVarint methods that just call bytes.Buffer.WriteByte directly, rather than using a slice from the sync.Pool and then using the generic io.Writer(WriteBytes) which is a second memmove.

odeke-em commented 3 months ago

@ValarDragon your suggestion as is will cause huge API breakages and also are quite limiting to just bytes.Buffer; The much simpler and direct/less-invasive/elegant alternative here is to interface test if the writer is an io.ByteWriter aka implements WriteByte which will then accomodate a whole lot of writers and won't even have to change the API at all so that'll become this

// EncodeVarint writes a varint-encoded integer to an io.Writer.
func EncodeVarint(w io.Writer, i int64) error {
        if bw, ok := w.(io.ByteWriter); ok { // We can take advantage of the ByteWriter interface
                return fVarintEncode(bw, i)
        }

        // In this case we have to write to a byte buffer then to the writer finally.
        // Use a pool here to reduce allocations.
        //
        // Though this allocates just 10 bytes on the stack, doing allocation for every calls
        // cost us a huge memory. The profiling show that using pool save us ~30% memory.
        //
        // Since when we don't have concurrent access to the pool, the speed will nearly identical.
        // If we need to support concurrent access, we can accept a *[binary.MaxVarintLen64]byte as
        // input, so the caller can allocate just one and pass the same array pointer to each call.
        buf := varintPool.Get().(*[binary.MaxVarintLen64]byte)

        n := binary.PutVarint(buf[:], i)
        _, err := w.Write(buf[0:n])

        varintPool.Put(buf)

        return err
}

func fVarintEncode(bw io.ByteWriter, x int64) error {
        // Firstly convert it into a uvarint
        ux := uint64(x) << 1
        if x < 0 {
                ux = ^ux
        }
        for ux >= 0x80 { // While there are 7 or more bits in the value, keep going
                // Convert it into a byte then toggle the
                // 7th bit to indicate that more bytes coming.
                // byte(x & 0x7f) is redundant but useful for illustrative
                // purposes when translating to other languages
                if err := bw.WriteByte(byte(ux&0x7f) | 0x80); err != nil {
                        return err
                }
                ux >>= 7
        }

        return bw.WriteByte(byte(ux & 0x7f))
}

but just beware that hand-rolling varint encoding and decoding can have unlikely consequences per my advisory https://cyber.orijtech.com/advisory/varint-decode-limitless

odeke-em commented 3 months ago

@ValarDragon please take a look at https://github.com/cosmos/iavl/pull/917

ValarDragon commented 3 months ago

Thanks for fixing/speeding this up!