cbor / cbor.github.io

unclear when indefinite length could be bad #66

Closed aep closed 3 years ago

aep commented 3 years ago

There are some subtle hints about indefinite-length items not being desirable in some circumstances, but it's very unclear why. Actually, I'm unsure why I would ever use definite lengths at all.

Using indefinite-length encoding allows
an encoder to not need to marshal all the data for counting, but it
requires a decoder to allocate increasing amounts of memory while
waiting for the end of the item.

Someone also complained on Reddit (can't find the link anymore) that indefinite-length items are a design mistake in CBOR, but I can't figure out why. They seem fine.

cabo commented 3 years ago

Hi Arvid,

Let's first distinguish indefinite-length strings (which are a construct built on top of definite-length strings) from indefinite-length containers (arrays, maps). I think you are talking about the latter.
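
To make the distinction concrete, here is a minimal Python sketch of how the two kinds of items look on the wire. It is an illustration only, not a full encoder: the function names are made up for this example, and it handles only small unsigned integers and lengths in the range 0..23. Note that the indefinite-length byte string is just a sequence of ordinary definite-length chunks followed by a break byte, while the indefinite-length array can be written from a generator without counting the elements first.

```python
BREAK = b"\xff"  # "break" stop code that ends any indefinite-length item


def head(major: int, arg: int) -> bytes:
    """Initial byte for a major type with a small argument (0..23 only)."""
    if not 0 <= arg <= 23:
        raise ValueError("this sketch only handles arguments 0..23")
    return bytes([(major << 5) | arg])


def small_uint(n: int) -> bytes:
    """Unsigned integer 0..23 (major type 0)."""
    return head(0, n)


def definite_array(items: list) -> bytes:
    """Array with the element count in the head; needs len() up front."""
    return head(4, len(items)) + b"".join(small_uint(i) for i in items)


def indefinite_array(items) -> bytes:
    """Array opened with 0x9f and closed with a break; items may be a generator."""
    out = bytearray(b"\x9f")
    for i in items:
        out += small_uint(i)
    out += BREAK
    return bytes(out)


def definite_bytes(data: bytes) -> bytes:
    """Byte string with its length in the head (major type 2)."""
    return head(2, len(data)) + data


def indefinite_bytes(chunks) -> bytes:
    """Byte string opened with 0x5f: definite-length chunks, then a break."""
    return b"\x5f" + b"".join(definite_bytes(c) for c in chunks) + BREAK


print(definite_array([1, 2, 3]).hex())                 # 83010203
print(indefinite_array(n for n in (1, 2, 3)).hex())    # 9f010203ff
print(indefinite_bytes([b"\x01\x02", b"\x03\x04\x05"]).hex())  # 5f42010243030405ff
```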

There are two reasons why definite length containers may be the better choice:

If these are not important to you, be our collective guest and go indefinite as much as you want -- that's why that is in there. (Unless you need deterministic encoding, which is biased towards definite, but of course you can define your own deterministic encoding rules.)

aep commented 3 years ago

thanks,

Maybe found the answer in https://github.com/msgpack/msgpack/issues/128#issuecomment-21143606. That is, some languages require a realloc on array append, which might copy, so that's a lot slower than a known-size alloc.
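
A rough sketch of what that looks like on the decoder side, assuming arrays of small unsigned integers (0..23) only; the function name is hypothetical. With a definite length the decoder can allocate the whole result once, while with an indefinite length it has to keep appending until it sees the break byte. Python hides the realloc behind `list.append`, so this shows the shape of the two code paths rather than the actual cost.

```python
def decode_small_uint_array(buf: bytes) -> list:
    """Decode a CBOR array of small unsigned integers (0..23), sketch only."""
    first = buf[0]
    if first >> 5 != 4:
        raise ValueError("not an array (major type 4 expected)")
    info = first & 0x1F
    pos = 1

    def read_item(byte: int) -> int:
        if byte > 0x17:
            raise ValueError("this sketch only handles unsigned integers 0..23")
        return byte

    if info <= 23:
        # Definite length: the count is known up front, so allocate once.
        out = [None] * info
        for i in range(info):
            out[i] = read_item(buf[pos])
            pos += 1
        return out

    if info == 31:
        # Indefinite length: grow the list until the break byte (0xff).
        out = []
        while True:
            if pos >= len(buf):
                raise ValueError("missing break byte")
            if buf[pos] == 0xFF:
                return out
            out.append(read_item(buf[pos]))
            pos += 1

    raise ValueError("longer length encodings are not handled in this sketch")


print(decode_small_uint_array(bytes.fromhex("83010203")))    # [1, 2, 3]
print(decode_small_uint_array(bytes.fromhex("9f010203ff")))  # [1, 2, 3]
```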

Curious why protobuf opted for not sizing arrays at all, though.