hashberg-io / dag-cbor

Python implementation of the DAG-CBOR codec for IPLD.
https://dag-cbor.readthedocs.io
MIT License

Infinite recursion loop #14

Open snarfed opened 3 months ago

snarfed commented 3 months ago

Hi all! First off, thank you for maintaining dag-cbor. It's great!

I came across an encoded DAG-CBOR object that makes dag-cbor recurse infinitely, or at least deeper than Python's default max recursion depth. Here it is, base64-encoded: bad.b64.txt

Here's the stack trace snippet:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ryan/src/bridgy-fed/local/lib/python3.11/site-packages/dag_cbor/decoding/__init__.py", line 127, in decode
    data, _ = _decode_item(stream, options)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ryan/src/bridgy-fed/local/lib/python3.11/site-packages/dag_cbor/decoding/__init__.py", line 149, in _decode_item
    value, num_bytes_further_read = _decoders[major_type](stream, arg, options)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ryan/src/bridgy-fed/local/lib/python3.11/site-packages/dag_cbor/decoding/__init__.py", line 272, in _decode_dict
    v, _ = _decode_item(stream, options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ryan/src/bridgy-fed/local/lib/python3.11/site-packages/dag_cbor/decoding/__init__.py", line 149, in _decode_item
    value, num_bytes_further_read = _decoders[major_type](stream, arg, options)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ryan/src/bridgy-fed/local/lib/python3.11/site-packages/dag_cbor/decoding/__init__.py", line 231, in _decode_list
    item, _ = _decode_item(stream, options)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ryan/src/bridgy-fed/local/lib/python3.11/site-packages/dag_cbor/decoding/__init__.py", line 149, in _decode_item
    value, num_bytes_further_read = _decoders[major_type](stream, arg, options)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ryan/src/bridgy-fed/local/lib/python3.11/site-packages/dag_cbor/decoding/__init__.py", line 231, in _decode_list
    item, _ = _decode_item(stream, options)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...

You're not the only ones: whatever JS lib @ericvolp12 is using on atproto.tools has the same problem 😆: https://atproto.tools/records?did=did%3Aplc%3A4agp2hcrps6ou4vjv7uux7dm

Thanks in advance!

sg495 commented 3 months ago

It's a recursion depth issue. The current implementation of encoding and decoding in this library is recursive, so objects with hundreds of levels of nesting, like the one you linked, exhaust the available recursion depth (which is 1000 on my machine).
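To make the failure mode concrete, here is a minimal sketch of a recursive decoder hitting the limit. It is not dag-cbor's code: `decode_item` and `nested` are hypothetical helpers written for illustration, and the toy decoder only understands small unsigned integers and short arrays. In the traceback above each nesting level appears to cost two frames (`_decode_item` plus `_decode_list` or `_decode_dict`), which would explain why a few hundred levels are enough to run out.

```python
import io
import sys


def decode_item(stream):
    """Toy recursive decoder: handles only uints 0..23 and arrays of length 0..23."""
    head = stream.read(1)
    if not head:
        raise ValueError("unexpected end of input")
    major, arg = head[0] >> 5, head[0] & 0x1F
    if arg > 23:
        raise ValueError("toy decoder: multi-byte arguments not supported")
    if major == 0:  # unsigned integer: the value is the argument itself
        return arg
    if major == 4:  # array: decode `arg` items, one recursive call each
        items = []
        for _ in range(arg):
            items.append(decode_item(stream))
        return items
    raise ValueError(f"toy decoder: major type {major} not supported")


def nested(depth):
    """CBOR bytes for the uint 0 wrapped in `depth` single-element arrays."""
    return b"\x81" * depth + b"\x00"


print(decode_item(io.BytesIO(nested(3))))  # [[[0]]]

try:
    # One level of nesting per recursive call, so this must exceed the limit.
    decode_item(io.BytesIO(nested(sys.getrecursionlimit() + 100)))
except RecursionError as exc:
    print("RecursionError:", exc)
```

Raising the limit with `sys.setrecursionlimit` can work as a stopgap for moderately deep objects, though very large values risk overflowing the interpreter's C stack.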

Fixing this will require a non-recursive redesign of the library. It was not an issue for us until now, but I guess we'll get it done at some point in the near future :)
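One common way to do such a redesign is to replace the call stack with an explicit stack of "containers still being filled". The sketch below shows that idea on the same toy subset of CBOR (small unsigned integers and short arrays). `decode_iterative` is a hypothetical name, not part of dag-cbor, and a real implementation would also need maps, strings, bytes, tags and multi-byte arguments.

```python
import io


def decode_iterative(data: bytes):
    """Toy non-recursive decoder: uints 0..23 and arrays of length 0..23 only."""
    stream = io.BytesIO(data)
    root = []  # sentinel container that receives the single top-level item
    # Each entry is (container, number of items still to be read into it).
    stack = [(root, 1)]
    while stack:
        container, remaining = stack.pop()
        if remaining == 0:
            continue  # this container is complete
        stack.append((container, remaining - 1))
        head = stream.read(1)
        if not head:
            raise ValueError("unexpected end of input")
        major, arg = head[0] >> 5, head[0] & 0x1F
        if arg > 23:
            raise ValueError("toy decoder: multi-byte arguments not supported")
        if major == 0:  # unsigned integer
            container.append(arg)
        elif major == 4:  # array: fill the child before resuming the parent
            child = []
            container.append(child)
            stack.append((child, arg))
        else:
            raise ValueError(f"toy decoder: major type {major} not supported")
    return root[0]


# 5000 levels of nesting, well past the default recursion limit of 1000.
value = decode_iterative(b"\x81" * 5000 + b"\x00")

# Walk down with a loop to measure the depth without recursing.
depth = 0
while isinstance(value, list):
    value = value[0]
    depth += 1
print(depth)  # 5000
```

Nesting depth is then bounded by available memory rather than by the interpreter's recursion limit. Note that calling `repr` or `==` on such a deeply nested result can itself raise `RecursionError` in CPython, which is why the usage above walks the structure with a loop.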

snarfed commented 3 months ago

Thanks! Totally makes sense. It's not a high priority on our end: this object looks like it was specifically crafted to push the boundaries and isn't representative of the kinds of objects we usually see.