Promote best practices / clearer boundaries / do less

KATT commented 3 years ago

I know it's called superjson, but I think it does a bit... much, and could easily cause some unintentional vulnerabilities & bugs.

I feel like things been thrown at this lib because it can be done, not because it ever should ever be done by any sane person. Have there actually been real-world use cases for everything supported?

It would a "safer" experience (considering it's a core part of blitz) if it would throw errors instead of trying to do everything.

Serializing/deserializing class is very cool, but it opens up vulnerabilities and exposing too much of the internals of your app
Allowing all numbers, even Not-A-Number, is cool, but isn't it better to throw errors on when isNaN / Infinity / -Infinity. And to be OK with things like -0 being represented a JSON 0 #83
Serializing RegExp is cool - but why would you wanna do this?
Serializing errors are cool, and I see how being able to do instanceof could be neat when doing full-stack stuff in blitz, but automatically exposing stack is probably not desirable in most cases and exposes app internals to potential attackers.
It allows custom errors, but serializing and returning a custom error's props in plain text isn't a good idea as it might expose underlying things
Serializing undefined - is that really needed?
Serializing symbols - is that really needed?

Just thoughts... reducing the API-surface could decrease the risk of attacks in blitz apps.

Skn0tt commented 3 years ago

Thanks for opening the discussion! It's highly welcome. Some of my own thoughts following:

The core aim of SuperJSON is to enable RPC calls without surprising developers (see here). For Blitz, it's particularly important to maintain type-soundness - what's sent off as a Map should arrive as a Map, since VS Code tells you it's a Map!
Sure, transferring a RegExp isn't particularly common - but if we threw an error, we might as well go back to using JSON.stringify.
Out of the points you mention, this justifies support for RegExp and undefined. NaN, -0, Infinity and -Infinity also go by "don't surprise developers".

I feel like things been thrown at this lib because it can be done

You're absolutely right. Class serialisation and Symbol serialisation in particular were built because I got a bit excited chasing JavaScript value parity 😅 I don't know wether they're heavily used, either.

However, I'm struggling to imagine an attack vector that exploits this. Could you outline one?

automatically exposing stack exposes app internals to potential attackers.

Definitely, we should do something about that. What'd be your idea to combat this?

And to be OK with things like -0 being represented a JSON 0 #83

Tbh, I just wasn't aware of the existence of -0. If I knew of it before, I'd have accounted for that ;D

KATT commented 3 years ago

Thanks for responding! Glad to have a discussion about it.

what's sent off as a Map should arrive as a Map, since VS Code tells you it's a Map!

I agree, this is great! Map/Date/Set are great to be able to send. Especially Date 🤩 - if this lib only took care Date it would still be worth using.

but if we threw an error, we might as well go back to using JSON.stringify.

I strongly disagree on this point.

Errors are great and easy to fix as you actually see them. The faster the app breaks, the better. False-positive results create tiny invisible bugs that are hard to catch and are often found much later.

Allowing things like NaN rather than throwing errors will make people write API-code that they think are working, passing it over the wire and it'll look like it working until someone clicks something and it doesn't or when Stripe responds with an error as you're trying to send them a NaN (which serializes as null (it's not great that JSON.stringify() allows NaN without throwing).

For example, can you justify 1 single case when you'd want to send NaN over the wire when it wasn't in fact a hidden bug on the backend/frontend? Same question on Infinity.

Currently, SuperJSON is aiming for feature-parity with JS values.

I think that shouldn't be a goal in itself. JS has a lot of bad parts and if you write a framework like blitz you should try to do your best to abstract those away.

You're absolutely right. Class serialisation and Symbol serialisation in particular were built because I got a bit excited chasing JavaScript value parity 😅

If that's the case I think you should remove them until someone asks for them! :) all features you add you'll have to maintain in eternity.

However, I'm struggling to imagine an attack vector that exploits this. Could you outline one?

Monday: Developer Adam adds a class that would be neat to serialize
It's serialized and passed over the wire. Great!
Weds: Developer Barney is using the same class somewhere else. He doesn't know this is serialized and passed over the wire (as this is something very rare)
Developer Barney adds some sensitive user data / API-key / something as a property to the class
Thurs: Developer Courtney reviews Barney, doesn't know about Adam's addition, accepts the code & deploys

I see the case for Errors, but then have a registerError rather than registerClass - and it should be used with great caution as errors tend to contain a lot of detail about internals, so like exposing "stack" is probably not something you'd want to do.

KATT commented 3 years ago

As a reference, Next.js gives great error messages when you try to serialize dates and stuff right now. Reference: https://github.com/vercel/next.js/blob/canary/packages/next/lib/is-serializable-props.ts

I'm really mostly hung up on the numbers - it'd be good to disallow some bad practices like NaN/Infinity.

Skn0tt commented 3 years ago

Developer Barney adds some sensitive user data / API-key / something as a property to the class

This is not SuperJSON-specific. Properties would have been picked up by JSON.stringify, too.

Skn0tt commented 3 years ago

@flybayer curious what your opinion is on this :D

MrLeebo commented 3 years ago

I agree that classes and regexes aren't ideal for serialization. Seems like it would be better to limit classes to the standard library by default. Serializing regexes is also problematic because it opens up the possibility for regex-based denial of service attacks.

I disagree with treating NaN as an error value, though. I think NaN should have been in the JSON spec and the only reasons it wasn't included are technical issues, not procedural. That makes it ideal for a library like superjson to "fix"

Skn0tt commented 3 years ago

Serializing regexes is also problematic because it opens up the possibility for regex-based denial of service attacks.

Do you feel like we should forbid this entirely? Or should we present a hide-able warning instead? (kind of like React Native does it with the YellowBox)

MrLeebo commented 3 years ago

Yeah, I think they should be forbidden. Can't think of any reason why you'd want your server to execute some arbitrary code given to it by the client.

KATT commented 3 years ago

If it was up to me me, I would strip the lib down to the bare minimum and throw on any types I didn't want to deal with by throwing errors similar to this function in next.js. This would make superjson safer than normal JSON and that'd be pretty super as well.

When approaching

a class instance (if that's something you'd decide not to support / an unregistered class) you can throw an error rather than serializing it (JSON.stringify() stringifies this, but I think it should just throw an error)
a NaN -> throw error instead of serializing (JSON.stringify(NaN) -> null which is bad)
[..]

Would probably make the lib faster too (#68)

flybayer commented 3 years ago

Good discussion y'all.

One solution would be to add a strict mode, and probably enable strict mode by default. Strict mode could turn off classes and regex.

We definitely need to keep error serialization, but for sure we can sanitize them. In Blitz we are already deleting the stack property on error objects.

Another thing to consider here is that superjson could be used for secure, server to server or intranetwork communication. So just because something might not be ideal for an API endpoint doesn't mean we should remove it entirely.

And we definitely should keep undefined, NaN, Infinity, etc. There can be good uses for those and they are harmless and fast, so we shouldn't get in the way and enforce our ideal coding style on everyone.

tmcw commented 1 year ago

One data point here; I'm starting to use superjson as a way for arbitrary JavaScript functions to communicate with each other and produce APIs over HTTP. So the "super" part is pretty useful, even the support for RegExp. As far as I've been able to tell, there isn't a fully general serialization that can serialize any JavaScript value, so there are a bunch of serializers (made a chart here) and superjson is one of the best of 'em.

Mine might be a fringe usecase! But anyway, if superjson gets slimmer, I'd probably reimplement the removed datatypes with custom, if that ends up being possible. I'm already thinking about implementing a sort of extension, batteries-included module that encodes many more types.

flightcontrolhq / superjson

Promote best practices / clearer boundaries / do less #85