kriszyp / cbor-x

Ultra-fast CBOR encoder/decoder with extensions for records and structural cloning
MIT License

safeKey in cbor-x encode / decode #73

Closed atomictag closed 5 months ago

atomictag commented 1 year ago

I stumbled upon this report https://github.com/advisories/GHSA-9c47-m6qq-7p4h and I was curious to see how cbor-x dealt with similar cases (with records support enabled).

const record = { foo: "bar", __proto__: { isAdmin: true } };
const enc = encode(record)
const dec = decode(enc)
expect(dec).toEqual({ foo: "bar", undefined: true })

This does not feel quite right - although the dangerous __proto__ key is correctly stripped. The encoded CBOR is d9dfff8419e0008163666f6f63626172f5, which corresponds to 57343_1([57344_1, ["foo"], "bar", true]).
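For readers following along, here is a byte-by-byte reading of that encoding, in the spirit of RFC 8949 diagnostic notation (the "record" labels reflect cbor-x's record extension as described in this thread):

```
d9 dfff            # tag(57343)       cbor-x record tag
   84              # array(4)
      19 e000      # unsigned(57344)  record/structure id
      81           # array(1)         key list
         63 666f6f # text(3) "foo"
      63 626172    # text(3) "bar"    value for "foo"
      f5           # true             value with no corresponding key
```

Note the array holds one key but two values, which is the asymmetry discussed below.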

I see cbor-x uses safeKey while decoding

function safeKey(key) {
    return key === '__proto__' ? '__proto_' : key // clever trick :)
}

but it seems to me there's something wrong in the encoding part (key is stripped, value is kept)
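For context, here is a minimal sketch of what the rename guards against: assigning to a plain object's "__proto__" property via bracket notation replaces its prototype, which is the prototype-pollution vector from the advisory.

```javascript
// Minimal sketch of the prototype-pollution vector safeKey guards against.
function safeKey(key) {
    return key === '__proto__' ? '__proto_' : key;
}

// Without the rename, this assignment triggers the __proto__ setter
// and swaps out the object's prototype.
const unsafe = {};
unsafe['__proto__'] = { isAdmin: true };
const polluted = unsafe.isAdmin === true; // prototype was replaced

// With the rename, the payload lands on a harmless own property instead.
const safe = {};
safe[safeKey('__proto__')] = { isAdmin: true };
const prototypeIntact = Object.getPrototypeOf(safe) === Object.prototype;
```

After the rename, `safe['__proto_']` holds the payload as plain data and `safe.isAdmin` stays undefined.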

kriszyp commented 1 year ago

The decoded object I get is { foo: 'bar', isAdmin: true }. I agree this is kind of a weird encoding/decoding, but it is also kind of a weird object.

atomictag commented 1 year ago

The decoded object I get is { foo: 'bar', isAdmin: true }. I agree this is kind of a weird encoding/decoding, but it is also kind of a weird object.

I was using an encoder with records enabled which returns a different result - but yes, with the default encode / decode that's the result.

I read again the report https://github.com/advisories/GHSA-9c47-m6qq-7p4h and checked out what standard JSON does and TBH I am not sure whether returning { foo: 'bar', isAdmin: true } (or { foo: 'bar', undefined: true }) is the safest behavior. As always, thank you!

atomictag commented 1 year ago

In the Chrome console:

const record = { foo: "bar", __proto__: { isAdmin: true } };
console.log(record) // { foo: "bar" }
console.log(record.isAdmin) // true
console.log(record.__proto__) // { isAdmin : true }

However:

console.log(JSON.stringify(record))  // { foo: "bar" }

so I guess I would expect encode(record) to mimic what JSON.stringify serializes

atomictag commented 7 months ago

FYI the behavior seems to have changed with commit e00ba2c: the encoded CBOR d9dfff8419e0008163666f6f63626172f5 = 57343_1([57344_1, ["foo"], "bar", true]) now fails to decode, as the new safeKey behavior traps trying to call toString() on an undefined key.

I still think the encoder should omit the __proto__ part entirely, as JSON.stringify does, instead of encoding a value with no key - but I guess this is better behavior on the decoder's part, since the key is not explicitly encoded? (Note that undefined appears to be a valid object key - at least in Chrome.)
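A quick sketch of that last point: any property key that is not a string or symbol is coerced to a string, so undefined "works" as a key in every engine, not just Chrome - it simply becomes the string "undefined":

```javascript
// Non-string, non-symbol property keys are coerced to strings,
// so `undefined` becomes the string key "undefined".
const obj = {};
obj[undefined] = true;

const hasStringKey = Object.keys(obj).includes('undefined'); // true
const value = obj['undefined'];                              // true
```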

kriszyp commented 7 months ago

For context, we have observed DoS exploits that cause very slow decoding by crafting large arrays that get coerced to strings. These were leveraging referencing, but preventing any string coercion seemed like the best defense against this.

Are you asking for arrays to be converted to strings? (If we can detect that they can be safely converted, I suppose.) It certainly would not be too hard to allow null and undefined (note that JS just converts them to string keys). However, I have been hesitant to add property skipping, since it adds an extra step of overhead (a performance penalty for the 99.99999% of executions, for the sake of unusual types).
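A hypothetical sketch (illustration only, not cbor-x's actual implementation) of the kind of defense described here: accept only primitive keys, still rename __proto__, and refuse anything - such as a large array - whose use as a key would trigger a potentially expensive implicit toString():

```javascript
// Hypothetical strict key check: allow primitive keys, rename __proto__,
// and reject objects/arrays so no implicit (possibly very slow)
// toString() coercion can happen during decoding.
function strictSafeKey(key) {
    if (typeof key === 'number' || typeof key === 'bigint') return key;
    if (typeof key === 'string') return key === '__proto__' ? '__proto_' : key;
    throw new Error('Refusing non-primitive map key of type ' + typeof key);
}

const ok = strictSafeKey('foo');            // 'foo'
const renamed = strictSafeKey('__proto__'); // '__proto_'

let rejectedArray = false;
try {
    strictSafeKey(new Array(1e6)); // would be a huge string if coerced
} catch (e) {
    rejectedArray = true;
}
```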

atomictag commented 7 months ago

Are you asking for arrays to be converted to strings?

The way I see it, the decoding part is actually not the problem - it's an encoder (with records enabled) issue:

const record = { foo: "bar", __proto__: { isAdmin: true } };
const encoded = encode(record); // d9dfff8419e0008163666f6f63626172f5 => 57343_1([57344_1, ["foo"], "bar", true])
const decoded = decode(encoded); // traps because there is no key for value `true` 

The encoder should probably encode the record as

d9dfff8319e0008163666f6f63626172 => 57343_1([57344_1, ["foo"], "bar"])

given that

Object.keys(record) === ["foo"] // true

which is what JSON.stringify essentially does.

However, the encoder calls writeObject(value, true /* safePrototype */ ) which then loops through the keys

for (let key in object) if (safePrototype || object.hasOwnProperty(key)) { ... }

which causes "isAdmin" to be considered a valid key. However, later on `keys` is set to Object.keys(object), which returns ["foo"] - hence the problem.
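The mismatch is easy to reproduce in isolation: for...in walks inherited enumerable properties (which is what the safePrototype branch effectively permits), while Object.keys returns only own properties:

```javascript
// for...in sees inherited enumerable properties; Object.keys does not.
const record = { foo: "bar", __proto__: { isAdmin: true } };

const forInKeys = [];
for (const key in record) forInKeys.push(key);

const ownKeys = Object.keys(record);
// forInKeys -> ["foo", "isAdmin"], ownKeys -> ["foo"]:
// one more value than keys, which is exactly the broken record encoding.
```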

I guess the library assumes that plain records where constructor === Object always have a safe prototype, and therefore only "own" properties - which is not the case here.

Many thanks

atomictag commented 5 months ago

I just gave v1.5.9 a shot, and the behavior now appears to be correct:

const record = { foo: "bar", __proto__: { isAdmin: true } };
// encoding without records enabled: b9000163666f6f63626172 => {"foo": "bar"}
// encoding with records enabled: d9dfff8319e0008163666f6f63626172 => 57343_1([57344_1, ["foo"], "bar"])

closing the issue with many thanks 🙏