com-lihaoyi / upickle

uPickle: a simple, fast, dependency-free JSON & Binary (MessagePack) serialization library for Scala
https://com-lihaoyi.github.io/upickle
MIT License
707 stars 158 forks source link

Fix DoS vunerability with ujson.Obj and upack.Obj #449

Closed lolgab closed 1 year ago

lolgab commented 1 year ago

Fixes: #446

Replace scala.collection.mutable.LinkedHashMap with upickle.core.LinkedHashMap in ujson.Obj and upack.Obj

scala.collection.mutable.LinkedHashMap is not secure because of https://github.com/com-lihaoyi/upickle/issues/446 An attacker can create a sequence of keys that return the same hashCode. This PR replaces it with upickle.core.LinkedHashMap, a bespoke wrapper of java.util.LinkedHashMap which extends scala.collection.mutable.Map since java.util.LinkedHashMap is safe against this kind of hash-collision attacks.

lolgab commented 1 year ago

I made the Obj constructor private and removed case so there is no generated apply method. I want to keep a private constructor to wrap a Map so the internal implementation can take advantage of it, while make sure users can't wrap a map that doesn't keep ordering without copying it to a LinkedHashMap. I privatized the other constructors for consistency since apply is the documented way to instantiate the object and having some that can do new Thing and some that can't seemed inconsistent. It's not my goal to improve the binary compatibility.

lihaoyi commented 1 year ago

Rather than privatizing all the constructors, could we just publicize upickle.internal.LinkedHashMap as upickle.StringHashMap and make it the official type signature case class Obj(value: StringHashMap)? That would greatly reduce the churn in the API, keep everything as "simple" case classes, while still leaving us the option to update the StringHashMap implementation in future (e.g. to specialize it for strings for perf)

lolgab commented 1 year ago

@lihaoyi We also have upack.. So maybe it can be an upickle.core.internal.LinkedHashMap and two public implementations, one with [String, ujson.Value] and one with [upack.Msg, upack.Msg]? But StringHashMap as a name doesn't communicate the fact that the value is of type ujson.Value. Maybe something like ujson.ObjMap can be better? Or making ujson.Obj itself extend internal.LinkedHashMap[String, Value]? We could provide .value as just a method like def value = this, but this would add a lot of bloat from scala.collection.mutable.Map to Obj API, which maybe you don't want to be exposed to if you don't call .value.

lihaoyi commented 1 year ago

I don't think we need to communicate it's only used in ujson.Value. As you said, it's used in upack as well. It can just be a generic "mutable string map preserving input ordering" class.

StringMap or ObjMap both sound good to me

I don't think we should overhaup the ujson.Obj API in this PR. Not because it doesn't need work, just that that work is pretty orthogonal to fixing this particular performance issue. Regardless, if we wanted to add more helpers on ujson.Obj/ujson.Value it should be possible later without bincompat breakage

lolgab commented 1 year ago

The problem is that upack doesn't have String as the key type but upack.Msg

lefou commented 1 year ago

If this map is a pragmatic mostly internal structure, lets find a pragmatic name paired with some ScalaDoc. What about UjsonLinkedHashMap, it transports its properties (hash map, linked) but also that it's tied to u{json,pack,pickle}. The curious user can read the scaladoc to read, why it is needed and whether it is safe to be used in other places.

lihaoyi commented 1 year ago

sounds good. I think just upickle.core.LinkedHashMap would be fine as well, so not specific to either ujson or upack

lolgab commented 1 year ago

@lihaoyi @lefou Ready for another review round

lolgab commented 1 year ago

Thank you very much @lefou! I updated the PR description.