FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
316 stars 136 forks source link

avro nesting checks #364

Closed pjfanning closed 1 year ago

pjfanning commented 1 year ago
cowtowncoder commented 1 year ago

Hmmh. Whether Avro Schemas allow unlimited nesting... I am not 100% sure.

cowtowncoder commented 1 year ago

I think it's a good idea to see if a test can be created; and if not maybe postpone until someone actually reports the problem.

pjfanning commented 1 year ago

https://www.cvedetails.com/vulnerability-list/vendor_id-45/product_id-76652/year-2022/Apache-Avro.html -- seems that Avro's own impls have had CVEs reported. May not be related to problematic inputs.

cowtowncoder commented 1 year ago

@pjfanning Too bad descriptions do not really show much about actual issue, wrt implementation. Seems more likely some validation of things like encoded lengths is not done.

cowtowncoder commented 1 year ago

Ok, I think Avro does allow recursive types; f.ex this

https://kb.databricks.com/en_US/data-sources/recursive-references-in-avro-schema-are-not-allowed

suggest that simple self-references are allowed. I think same goes for Protobuf as well. But being schema-based, I think that potential for DoS is lower because attacker must target specific schema type. Still, I assume tree-style structures are fairly common so maybe it makes sense to add checks, given that it is not a ton of code (more so for Avro just to expose IOException but otherwise).

pjfanning commented 1 year ago

@cowtowncoder I'm happy to merge this as is but I'm not sure I have the Avro expertise to produce a test case. I think you are right that having sensible AvroSchemas should protect users anyway.

The gain from adding this PR is that the exceptions are exposed. The implementation can be changed if anyone can provide a real use case that the code doesn't protect against. The values are configurable, so if someone gets a StreamConstaintsException and they still think the input is valid - they can increase the limits by tuning the StreamReadConstraints.

The potential loss is that this PR could turn out to have bugs in it that mean StreamReadConstraints tuning won't help.