near / borsh-js

TypeScript/JavaScript implementation of Binary Object Representation Serializer for Hashing
Apache License 2.0
112 stars 38 forks source link

The writer now accepts `bigint` and the reader now always returns `bigint` for numbers 64 bits and larger #60

Closed steveluscher closed 10 months ago

steveluscher commented 2 years ago

Problem statement

Certain applications target runtimes that support JavaScript's BigInt. If such an application takes a dependency on borsh-js, it's nevertheless forced to bundle the bn.js polyfill.

Consider a library like @solana/web3.js. bn.js makes up ~13% of its bundle size (a whopping 35K):

image

If such a library could install a version of borsh-js that dealt in native BigInt values, then it could eliminate its dependency on bn.js and reduce its bundle size.

Summary of changes

Test plan

yarn test

Notes

This represents a breaking change, since deserialized numbers 64 bits or larger will now be of type bigint when they were previously of type BN.

BigInt is supported by every major browser released since September 15 2020: https://caniuse.com/bigint

macalinao commented 2 years ago

Is there a way we can use JSBI instead? Many browsers don't support BigInt yet, and there is no polyfill available for it.

paulmillr commented 2 years ago

@macalinao

Many browsers don't support BigInt yet

That's bullshit.

steveluscher commented 2 years ago

BigInt support:

steveluscher commented 2 years ago
macalinao commented 2 years ago

BigInt support:

  • Browsers, by release date:

    • Chrome: May 2018

    • Firefox: July 2019

    • Safari: September 2020

    • Mobile Safari: September 2020

    • Edge: January 2020

    • Opera: June 2018

    • Samsung Internet: April 2019

  • Runtimes, by version:

    • Deno: 1.0

    • Node: 10.4.0

We used to have a decent amount of UC browser users:

image

But I believe the app has gotten banned in India so it might not be an issue anymore.

ailisp commented 2 years ago

@steveluscher Thanks again for your PR. I talked with the team lead of near-api-js (roughly equivalent to solana web3), unfortunately we cannot drop bn in borsh. Their points are:

That's said, we can't release a bigint-version of borsh-js now. But my team (https://github.com/near/near-sdk-js) is also interested in a no-bn, no-buffer, esm-packaged version of borsh-js. If you're going to make such a release, we'll happy to use yours. Otherwise, our plan to make such a release is probably in two months.

paulmillr commented 2 years ago

Hey folks, it's clear NEAR doesn't care about security, or bundle size. I suggest switching to a fork or an alternative solution such as https://github.com/paulmillr/micro-packed, and making the issue publicly known

support pre-2020 browser is important

Caniuse says bigints are supported in 93.71% of all browsers, which means you're considering "important" 6% of users who most likely don't use NEAR. Most of these users with old browsers are in environments such as old Android or Edge.

Also it's not true "pre-2020" browsers don't support bigints. As told in previous post, Chrome supported them since May 2018 — which was well over 4 years ago.

There is also no need to force all users to upgrade to new Borsh. Old apps can keep using old borsh as-is, there would be no disruption for them.

usually they adopt new features and polyfill for the old, but BigInt is not polyfill-able

That doesn't matter for Ethereum, Polkadot, Bitcoin, or Solana communities who've adopted new feature and care about bundle size / security.

the value of saving 30kb of code in dapp bundles (frontend) isn't big

Saving EVERY byte is important. Every line of code can contain malware. Every package could get broken by hackers. Less packages = less hacks. There are users in developing countries with very slow connections. NEAR essentially doesn't care about them.

Auditing is also non-trivial. Auditing every new line of code increases unneeded expenses for every security-concious .js app.


Another alternative to forking is Packed (https://github.com/paulmillr/micro-packed). It is better:

Packed is able to parse any sort of binary stuff. For example, in package ed25519-keygen, it is used to implement PGP/SSH formats; even though PGP is complex! Not even talking about BTC arrays. Borsh is very generic and basic.

macalinao commented 2 years ago

Hey folks, it's clear NEAR doesn't care about security, or bundle size. I suggest switching to a fork or an alternative solution such as https://github.com/paulmillr/micro-packed, and making the issue publicly known

support pre-2020 browser is important

Caniuse says bigints are supported in 93.71% of all browsers, which means you're considering "important" 6% of users who most likely don't use NEAR. Most of these users with old browsers are in environments such as old Android or Edge.

Also it's not true "pre-2020" browsers don't support bigints. As told in previous post, Chrome supported them since May 2018 — which was well over 4 years ago.

There is also no need to force all users to upgrade to new Borsh. Old apps can keep using old borsh as-is, there would be no disruption for them.

usually they adopt new features and polyfill for the old, but BigInt is not polyfill-able

That doesn't matter for Ethereum, Polkadot, Bitcoin, or Solana communities who've adopted new feature and care about bundle size / security.

the value of saving 30kb of code in dapp bundles (frontend) isn't big

Saving EVERY byte is important. Every line of code can contain malware. Every package could get broken by hackers. Less packages = less hacks. There are users in developing countries with very slow connections. NEAR essentially doesn't care about them.

Auditing is also non-trivial. Auditing every new line of code increases unneeded expenses for every security-concious .js app.


Another alternative to forking is Packed (https://github.com/paulmillr/micro-packed). It is better:

  • Bigints instead of bnjs

  • Many more correctness / safety checks

  • Encoder/decoder are combined into one, which makes it composable and an order of magnitude simpler to audit

  • Nice syntax: P.struct({x: P.U8, y: P.U64LE, z: P.string(…), q: P.array(…) }); versus const schema = new Map([[Test, { kind: 'struct', fields: [['x', 'u8'], ['y', 'u64'], ['z', 'string'], ['q', [3]]] }]]);

  • Has i32/i64le/be, strings support endianness instead of just P.string(P.U32LE)

Packed is able to parse any sort of binary stuff. For example, in package ed25519-keygen, it is used to implement PGP/SSH formats; even though PGP is complex! Not even talking about BTC arrays. Borsh is very generic and basic.

What's wrong with the JSBI Babel transform? Does it not fix bundle size?

More info: https://github.com/GoogleChromeLabs/babel-plugin-transform-jsbi-to-bigint

you're considering "important" 6% of users who most likely don't use NEAR

That 6% is in the developing world (usually Asia), which is what I believe NEAR targets (and what used to be a non trivial part of Solana's userbase). These people often don't have $1k to buy an iphone or new Android phone, but they can afford much cheaper devices.

There is also no need to force all users to upgrade to new Borsh. Old apps can keep using old borsh as-is, there would be no disruption for them.

Unfortunately it's not this simple-- updated apps will be forced to use later versions of Borsh.js as Solana and Near continue to push updates that require newer SDK versions.

paulmillr commented 2 years ago

That 6% is in the developing world (usually Asia), which is what I believe NEAR targets (and what used to be a non trivial part of Solana's userbase). These people often don't have $1k to buy an iphone or new Android phone, but they can afford much cheaper devices.

New samsung that supports all the good stuff costs $125. It's not $1000. And there are cheaper brands, so it would be less than that. Chrome is evergreen, and is getting updated even on old devices.

What's wrong with the JSBI Babel transform? Does it not fix bundle size?

  1. Complexity: you need to tell users of the library to "transpile" it
  2. The need to support both cases, handle obscure bugs that happen in native bigints and don't happen in jsbi / wise-versa
  3. Security — "JSBI babel transform" can inject malware in your code unless you audit it thoroughly and re-audit with every update which is not realistic.

Unfortunately it's not this simple-- updated apps will be forced to use later versions of Borsh.js as Solana and Near continue to push updates that require newer SDK versions.

It is simple with ETH, SOL, Polkadot, BTC. With NEAR it's not simple? Those are just excuses to prevent security upgrades.

marcus-pousette commented 2 years ago

@steveluscher Thanks again for your PR. I talked with the team lead of near-api-js (roughly equivalent to solana web3), unfortunately we cannot drop bn in borsh. Their points are:

  • support pre-2020 browser is important
  • usually they adopt new features and polyfill for the old, but BigInt is not polyfill-able
  • the value of saving 30kb of code in dapp bundles (frontend) isn't big

That's said, we can't release a bigint-version of borsh-js now. But my team (https://github.com/near/near-sdk-js) is also interested in a no-bn, no-buffer, esm-packaged version of borsh-js. If you're going to make such a release, we'll happy to use yours. Otherwise, our plan to make such a release is probably in two months.

@ailisp Check this implementation: borsh-ts. I released an update a moment ago removed the dependency on Buffer, so now it is no-bn, no-buffer and esm-packaged (and cjs).

ailisp commented 2 years ago

@marcus-pousette That's a great job! We can probably use your library in near-sdk-js. cc @volovyks

steveluscher commented 2 years ago

That's said, we can't release a bigint-version of borsh-js now.

No problem! We'll fork it. We have to, to continue to get bundle size under control.

That's a great job! We can probably use your library in near-sdk-js.

I really want to use borsh-ts but I don't think we can. Our public API is already based on borsh-js, meaning that third-parties – in the wild – presume that they can add borsh-js compatible schemas to a global store via SOLANA_SCHEMA.set(). I'd have to build a glue layer between the borsh-js schema and the borsh-ts schema.

One thing that I like about borsh-ts is that there is no central store for schemas; each object owns its own schema. If an object is not present in your app, you don't have to pay the cost of storing its schema.

One thing that I don't like about borsh-ts is that each object contains its serialization schema, whether or not your app ever serializes/deserializes it. If I only use PublicKey in my app but never Borsh serialize/deserialize it, then I've paid for the schema construction/storage for nothing. cc/ @marcus-pousette

It's a tricky balance. On the one hand you get an API like serialize(object) but the Object needs to schlep the schema around with it wherever it goes whether or not you require serialization capabilities, and on the other hand you get an API like serialize(schema, object) where you have to store the schema for every possible Object, whether your app encounters one of those objects or not.

marcus-pousette commented 2 years ago

@steveluscher Yes I see. I spent some time experimenting with a glue layer: It is not trivial to create since it has to support compatibility in both directions in practice. There are some features in borsh-ts that can not easily be translated into borsh-js schema, like the custom/functional field types and that both libraries have different compatibility with the borsh specification. Hence, it is a good idea if you stick with (your fork of) borsh-js if you still want to export schemas (though it might nevertheless still be a breaking change you change big number representation to bigint).

It's a tricky balance. On the one hand you get an API like serialize(object) but the Object needs to schlep the schema around with it wherever it goes whether or not you require serialization capabilities, and on the other hand you get an API like serialize(schema, object) where you have to store the schema for every possible Object, whether your app encounters one of those objects or not.

I spent some time thinking about that problem when working the project. My choice did boil down to that my main goal with borsh-ts was to make it as easy as possible to work with borsh in TypeScript (especially on large project with multiple libraries). I also like how borsh-rs feels, and wanted the experience to be as close as one can get it from a TypeScript setting. I guess some portability/separation of concerns was lost on the way due to this, but it has been worth it so far.

paulmillr commented 2 years ago

@steveluscher what's the plan here? Are you going to release the fork to npm any time soon?

steveluscher commented 2 years ago

This is a stretch goal for me, this month. Likely won't ship this until after Breakpoint. Current priorities: Breakpoint & the brand new Solana Mobile Wallet adapter.

gagdiez commented 10 months ago

the new version uses bigints and do not depend on BN.js