Open Dennis4b opened 2 years ago
As a quick example:
scala> val enc = new EncoderSpeed()
val enc: boopickle.EncoderSpeed = boopickle.EncoderSpeed@d59b6cb
scala> enc.writeInt(2100000000)
val res37: boopickle.Encoder = boopickle.EncoderSpeed@d59b6cb
scala> val bb = enc.asByteBuffer
val bb: java.nio.ByteBuffer = java.nio.HeapByteBuffer[pos=0 lim=4 cap=512]
scala> val dec = new DecoderSpeed(bb)
val dec: boopickle.DecoderSpeed = boopickle.DecoderSpeed@5f7509c8
scala> dec.buf.remaining
val res38: Int = 4
scala> dec.readIntArray()
java.lang.OutOfMemoryError: Java heap space
Would it make sense, when allocating (larger) arrays, to check if this allocation is reasonable ?
So something like:
def readIntArray(len: Int): Array[Int] = {
+ assert(buf.remaining() >= len * 4) // bail out if not
val array = new Array[Int](len)
buf.asIntBuffer().get(array)
buf.position(buf.position() + len * 4)
array
}
I think this is a very good idea, surprised this is not already done.
In fact, this should be considered a security vulnerability, worthy of a CVE.
I like
Boopickle
for its size, efficiency, and the obfuscating side effect of not having to encode any structure or key names, as both sides know what the data represents, and I use it together withsloth
forscala.js <-> scala
communications without any issues.However, the client, and all that it sends, is untrusted, and I think it would therefore be helpful (apologies if I missed a previous discussion on this?) if a knowledgable insider could share some thoughts on possible concerns with untrusted input.
For me three issues come to mind:
1) Robustness against carefully crafted malformed data, basically the equivalent of trying to provoke a buffer overflow in
C/C++
. I would imagine that on the JVM all such attempts would result in a decoder exception, limited to tempory higher CPU/memory consumption?2) Avoiding denial of service attacks by for example abusing deduplication logic, for example creating a cycle of references, where a decoder could get stuck. (Of course the encoder is not going to create such a structure in normal use, but wether it is possible to manually construct a cycle or similar pathological case).
3) Avoiding
zip-bomb
kinds of attacks, passing very large size fields and/or using deduplication causing the server to allocate huge amounts of memory in preparation for large strings or buffers. It's common protection to limit the HTTP request size in the webserver/proxy, but this is ineffective if the a boopickle structure, itself of a normal size, has the potential to explode to a huge size during decoding.Any thoughts are very much appreciated!