creationix / msgpack-js

The msgpack protocol implemented in pure javascript.
http://msgpack.org/
MIT License
253 stars 48 forks source link

Undefined encodes strangely and is unreadable by other libraries #18

Open 1egoman opened 7 years ago

1egoman commented 7 years ago

Steps to reproduce:

$ node
> let m = require('msgpack-js')
{ encode: [Function], decode: [Function: decode] }
> m.encode({foo: undefined, hello: 'world'})
<Buffer 82 a3 66 6f 6f c4 a5 68 65 6c 6c 6f a5 77 6f 72 6c 64>
> m.encode({foo: undefined, hello: 'world'}).toString('base64')
'gqNmb2/EpWhlbGxvpXdvcmxk'
$ python
Python 2.7.9 (default, Dec  1 2016, 15:15:08) 
[GCC 5.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import msgpack
>>> import base64
>>> a = base64.b64decode('gqNmb2/EpWhlbGxvpXdvcmxk')                                                                                       
>>> msgpack.unpackb(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "msgpack/_unpacker.pyx", line 145, in msgpack._unpacker.unpackb (msgpack/_unpacker.cpp:145)
msgpack.exceptions.UnpackValueError: Unpack failed: error = 0
>>> 

Even if undefined isn't part of the msgpack spec, this is weird behavior.

pe-sean commented 7 years ago

Should it encode "undefined" as nil (0xc0) instead of bin8 (0xc4)?

https://github.com/msgpack/msgpack/blob/master/spec.md

1egoman commented 7 years ago

@pe-sean Possibly. But then information might be lost when something is encoded-then-decoded, correct? Also, is backwards-compatibility a concern?

$ # With your proposed change...
$ node
> let m = require('msgpack-js')
{ encode: [Function], decode: [Function: decode] }
> m.decode(m.encode({foo: undefined, hello: 'world'}))
{foo: null, hello: 'world'}

Another option could be a special flag / option that changes how undefined is encoded.

pe-sean commented 7 years ago

True. The ideal thing would be an 'undefined' value in the spec.

I am trying to decode messages encoded with this library from Python and this particular implementation fails there. My preference would be to still allow the messages to be parsed in other languages even though we lose something in the encode/decode cycle.

1egoman commented 7 years ago

@pe-sean I think I agree. Though I think it probably makes sense to add a warning to the readme or something because that may be unintuitive at first glance.

creationix commented 7 years ago

Agreed. Keep in mind this library was designed before the msgpack spec even had distinct binary and string types. I had to improvise by adding binary types so I decided to also add an encoding for undefined while I was at it.

Now that the new msgpack spec has been out for a while and has defined ways to add extended types, my old way is wrong and confusing. For the record, I had told the msgpack folks about my extension to the spec. They didn't consult me when designing the new spec or notify me when it was released. This is probably because there were lots of people complaining about the lack of distinct binary and string types and they didn't bother to carefully include everyone in the conversation.

More recently, I wrote this library for a project https://github.com/creationix/msgpack-es. It's fairly slow, but that's mostly due to browsers not yet optimizing let and for .. of. If you change those to plain old ES5 syntax it performs a lot better.

I also wrote this one https://github.com/creationix/revision/blob/master/src/libs/msgpack.ts

creationix commented 7 years ago

Actually, that second typescript library looks to be the first library repackaged.

pe-sean commented 7 years ago

No worries. And thanks for doing this in the first place. It worked great for us until we got into a multiple-language situation.

It sounds like there might be some challenges to changing this. We will move to another library so that our usage pattern works.

creationix commented 7 years ago

FWIW, JSON.stringify ignores keys with undefined values. We could do the same here and just bump the major version.

creationix commented 7 years ago

I guess the worry is what about people who depend on the undefined behavior (pun intended).

1egoman commented 7 years ago

@creationix FWIW, I originally came across this when I tried to msgpack an object with undefined keys, so at least for my use case that would work.