msgpack / msgpack-ruby

MessagePack implementation for Ruby / msgpack.org[Ruby]
http://msgpack.org/
Apache License 2.0

Hash string values serialized as buffer #359

Closed: DougEdey closed this issue 3 months ago

DougEdey commented 3 months ago

Running on msgpack-ruby 1.7.2

I'm working on serializing some objects so we can pass them to a Lua script in Redis. The script uses cmsgpack, which doesn't support extension types.

So I'm creating a GenericMessagePacker helper class:

# frozen_string_literal: true

require "msgpack"

# There are scenarios where we can't use the default message packer because Rails
# registers types that we can't register in lua-cmsgpack
class GenericMessagePacker
  def self.pack(obj)
    factory.dump(obj)
  end

  def self.unpack(str)
    factory.load(str)
  end

  def self.factory
    @factory ||= MessagePack::Factory.new
  end
end

The object I'm trying to serialize is the attributes hash from an ActiveRecord model (obtained with attributes(enums_as: :symbols)):

[{"id"=>1, "depth"=>"infinity", "type"=>"write", "path"=>"file.txt", "scope"=>"shared", "timeout_at"=>1717741411, "token"=>"763e15d6-5ad9-4961-82db-93fa7d561caa", "created_at"=>1717698211, "updated_at"=>1717698211, "site_id"=>1, "user_id"=>1}]

Nothing particularly fancy, but for some reason, when I serialize this object, I get a MessagePack payload that contains buffers (according to this converter):

kYuiaWQBpWRlcHRoqGluZmluaXR5pHR5cGXEBXdyaXRlpHBhdGioZmlsZS50
eHSlc2NvcGXEBnNoYXJlZKp0aW1lb3V0X2F0zmZiq/CldG9rZW7ZJGJlMWQ0
OTFmLWY0MTItNDNlNi1iYTA2LThlOGM4NGYzNDQ2Y6pjcmVhdGVkX2F0zmZi
AzCqdXBkYXRlZF9hdM5mYgMwp3NpdGVfaWQBp3VzZXJfaWQB
[
    {
        "id": 1,
        "depth": "infinity",
        "type": {
            "type": "Buffer",
            "data": [
                119,
                114,
                105,
                116,
                101
            ]
        },
        "path": "file.txt",
        "scope": {
            "type": "Buffer",
            "data": [
                115,
                104,
                97,
                114,
                101,
                100
            ]
        },
        "timeout_at": 1717742576,
        "token": "be1d491f-f412-43e6-ba06-8e8c84f3446c",
        "created_at": 1717699376,
        "updated_at": 1717699376,
        "site_id": 1,
        "user_id": 1
    }
]

This is very odd to me, since I don't have any types registered, and the buffers cannot be decoded by lua-cmsgpack.

Even more peculiarly, if I create the hash from scratch:

> obj2 = [{"id"=>1, "depth"=>"infinity", "type"=>"write", "path"=>"file.txt", "scope"=>"shared", "timeout_at"=>1717742576, "token"=>"be1d491f-f412-43e6-ba06-8e8c84f3446c", "created_at"=>1717699376, "updated_at"=>1717699376, "site_id"=>1, "user_id"=>1}]
> puts Base64.encode64(pool.dump(obj2))
kYuiaWQBpWRlcHRoqGluZmluaXR5pHR5cGWld3JpdGWkcGF0aKhmaWxlLnR4
dKVzY29wZaZzaGFyZWSqdGltZW91dF9hdM5mYqvwpXRva2Vu2SRiZTFkNDkx
Zi1mNDEyLTQzZTYtYmEwNi04ZThjODRmMzQ0NmOqY3JlYXRlZF9hdM5mYgMw
qnVwZGF0ZWRfYXTOZmIDMKdzaXRlX2lkAad1c2VyX2lkAQ==

This does return the standard strings:

[
    {
        "id": 1,
        "depth": "infinity",
        "type": "write",
        "path": "file.txt",
        "scope": "shared",
        "timeout_at": 1717742576,
        "token": "be1d491f-f412-43e6-ba06-8e8c84f3446c",
        "created_at": 1717699376,
        "updated_at": 1717699376,
        "site_id": 1,
        "user_id": 1
    }
]

What am I missing here? Why am I seeing buffer as a type?
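One way to double-check what the converter reports is to look at the type byte that follows the "type" key in each payload. The sketch below uses only Ruby's standard library and the first 44 Base64 characters of each payload shown above; the helper names (tag_after_key, describe_tag) are made up for illustration and are not part of msgpack-ruby. In the MessagePack format, 0xA0..0xBF, 0xD9, 0xDA and 0xDB introduce str values, while 0xC4, 0xC5 and 0xC6 introduce bin values.

```ruby
require "base64"

# First 44 Base64 characters of the two payloads from this issue.
BIN_PAYLOAD = "kYuiaWQBpWRlcHRoqGluZmluaXR5pHR5cGXEBXdyaXRl"   # from the model attributes
STR_PAYLOAD = "kYuiaWQBpWRlcHRoqGluZmluaXR5pHR5cGWld3JpdGWk"   # from the hand-built hash

# Hypothetical helper: returns the MessagePack type byte of the value that
# follows a short string key (fixstr, so the key must be under 32 bytes).
def tag_after_key(base64_payload, key)
  bytes = Base64.decode64(base64_payload).b
  marker = (0xA0 | key.bytesize).chr.b + key.b # fixstr header + key bytes
  index = bytes.index(marker)
  return nil if index.nil?

  bytes.getbyte(index + marker.bytesize)
end

# Hypothetical helper: maps a type byte to the str or bin format family.
def describe_tag(tag)
  case tag
  when 0xA0..0xBF, 0xD9, 0xDA, 0xDB then :str
  when 0xC4, 0xC5, 0xC6 then :bin
  else :other
  end
end

[BIN_PAYLOAD, STR_PAYLOAD].each do |payload|
  tag = tag_after_key(payload, "type")
  puts format("0x%02X %s", tag, describe_tag(tag))
end
# prints "0xC4 bin" for the first payload and "0xA5 str" for the second
```

So the two payloads really do differ only in how the string values were tagged, which points at the strings themselves rather than at any registered extension type.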

DougEdey commented 3 months ago

Update: it seems like it's using write_bin under the hood because those strings are encoded as ASCII-8BIT. I'll force an encoding here to avoid this.
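For anyone landing here later, a minimal sketch of that workaround, assuming the affected values are really text that just carries the ASCII-8BIT (binary) encoding. The helper name utf8_strings is hypothetical and not part of msgpack-ruby; it walks arrays and hashes, re-tags binary strings as UTF-8 on a copy, and leaves anything that is not valid UTF-8 untouched.

```ruby
# Hypothetical helper: returns a copy of obj in which every ASCII-8BIT string
# that holds valid UTF-8 bytes is re-tagged as UTF-8. Other values pass through.
def utf8_strings(obj)
  case obj
  when String
    return obj unless obj.encoding == Encoding::BINARY

    copy = obj.dup.force_encoding(Encoding::UTF_8)
    copy.valid_encoding? ? copy : obj # keep genuine binary data as-is
  when Array
    obj.map { |item| utf8_strings(item) }
  when Hash
    obj.to_h { |key, value| [utf8_strings(key), utf8_strings(value)] }
  else
    obj
  end
end

# Stand-in for the model attributes: two values are tagged as binary.
attrs = [{ "id" => 1, "type" => "write".b, "path" => "file.txt", "scope" => "shared".b }]
normalized = utf8_strings(attrs)

puts attrs.first["type"].encoding      # prints "ASCII-8BIT" (alias "BINARY" on newer Rubies)
puts normalized.first["type"].encoding # prints "UTF-8"
```

With the class from the first comment, the call would then look like GenericMessagePacker.pack(utf8_strings(attributes)), which should give lua-cmsgpack plain str values to decode.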

byroot commented 3 months ago

👋 😄