capnproto / capnproto-java

Cap'n Proto in pure Java
Other
397 stars 85 forks source link

Effective max List of Struct size? #92

Closed mooreniemi closed 3 years ago

mooreniemi commented 3 years ago

Hi, I'm trying to confirm my understanding of a current limitation in the codebase.

I have a Message which has a List of Structs in it. (Each Struct is 2 ints.) If a Message is constructed with a list of more than 33,554,432 (you may recognize this number) elements in it we'll throw IndexOutOfBoundsException with this stacktrace:

 at java.nio.Buffer.checkIndex(Buffer.java:546)
        at java.nio.HeapByteBuffer.putInt(HeapByteBuffer.java:381)
        at org.capnproto.StructBuilder._setIntField(StructBuilder.java:110)

(It's simple to set up a test case which attempts to set the 33,554,433rd element and fails. In my case I hit an outlier in my data I didn't expect!)

Looking at checkIndex, I can see it has indeed gone massively negative to -268,435,440 (close to the List size * 8, which is 268,435,456). Following through the debugger, I think this is because data gets set to that value, because the indexBit has overflowed:

    protected final <T> T _getStructElement(StructBuilder.Factory<T> factory, int index) {
// 33,554,432 * 64 = 2,147,483,648 > 2,147,483,647 -> indexBit = -2,147,483,584
        int indexBit = index * this.step;
        int structData = this.ptr + indexBit / 8 ;
        int structPointers = (structData + (this.structDataSize / 8)) / 8;

        return factory.constructBuilder(this.segment,
                                        structData,
                                        structPointers,
                                        this.structDataSize,
                                        this.structPointerCount);
    }

// max int is 2147483647
// max float is 3.4028235E38
// list size as int * 64 is -2147483584
// list size as float * 64 is 2.14748365E9

I'm moving away from the List pattern given my data is not well structured for it, but I wanted to confirm my understanding here. From my understanding of the purpose of CapnProto this is not necessarily a bug, as most records should have less than 33 million list elements, but it might be worth documenting the effective limitations to make clear for other users or put bounds checks somewhere earlier in the List initialization. If my issue does nothing else, maybe it helps the next person who trips over this error.

Thanks again for your work. :)

mooreniemi commented 3 years ago

Oh!! I see I am just on a super old version before the int overflow fix! Sorry for my mistake, closing.