EOSIO / eosjs

General purpose library for the EOSIO blockchain.
http://eosio.github.io/eosjs
MIT License

[question] text-encoding needed other than for utf-8? #580

Closed: mvayngrib closed this issue 5 years ago

mvayngrib commented 5 years ago

I see the defaults are initialized with utf-8 in mind, and the example in the docs does the same. Is support for non-utf-8 encodings needed? If this is specified in the docs, a link would be really appreciated. Thanks!

based on the git blame, maybe @tbfleming can help :)

tbfleming commented 5 years ago

I recommend against other encodings.

mvayngrib commented 5 years ago

@tbfleming thanks for the quick response! Does that mean there isn't actually a hard restriction on, or validation of, utf-8, and that other eos clients may write strings in other encodings, which would break a utf-8-only TextDecoder? Do you know how the other implementations deal with this (eos-go, eos-...)?

mvayngrib commented 5 years ago

The reason I ask is that text-encoding is a 500 KB dependency (!), a huge chunk of our React Native bundle. If all of those encodings aren't actually needed, we can inject a smaller polyfill based on Buffer.from/buf.toString(). However, I want to be 100% sure that those encodings aren't actually needed...

edit: Buffer.from/buf.toString() based polyfill for utf-8-only support
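For reference, a minimal sketch of what such a utf-8-only polyfill could look like. The class names `Utf8TextEncoder` and `Utf8TextDecoder` are hypothetical, and it assumes a Node-compatible `Buffer` is available (in React Native that would typically come from the `buffer` package). It covers only `encode()` and `decode()`, not streaming or the `fatal`/`ignoreBOM` options of the real TextDecoder.

```typescript
import { Buffer } from "buffer";

// Hypothetical utf-8-only stand-in for TextEncoder.
class Utf8TextEncoder {
  readonly encoding = "utf-8";

  encode(input: string = ""): Uint8Array {
    // Copy into a plain Uint8Array so callers never see a Buffer subclass
    // or a view onto Buffer's shared allocation pool.
    return new Uint8Array(Buffer.from(input, "utf8"));
  }
}

// Hypothetical utf-8-only stand-in for TextDecoder.
class Utf8TextDecoder {
  readonly encoding = "utf-8";

  constructor(label: string = "utf-8") {
    const normalized = label.toLowerCase();
    if (normalized !== "utf-8" && normalized !== "utf8") {
      // Fail loudly instead of silently decoding with the wrong encoding.
      throw new RangeError(`Unsupported encoding: ${label}`);
    }
  }

  decode(input?: Uint8Array): string {
    if (!input) {
      return "";
    }
    return Buffer.from(input).toString("utf8");
  }
}

// Round trip of a non-ASCII memo.
const encoder = new Utf8TextEncoder();
const decoder = new Utf8TextDecoder();
const bytes = encoder.encode("h\u00e9llo");
console.log(bytes.length, decoder.decode(bytes));
```

Assuming the eosjs `Api` constructor takes `textEncoder` and `textDecoder` options as in the docs example, instances of these classes would be passed there in place of the ones from text-encoding.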

tbfleming commented 5 years ago

There's a lot of utf-8 data on chain. e.g. transfer memo fields have content in a wide variety of languages

mvayngrib commented 5 years ago

Right, the question is: if there's a memo (or other string field) with non-utf-8 data (is this allowed?), will it break the client?

mvayngrib commented 5 years ago

Some refs from the C++ implementation (I'm not sure how to read these):
https://github.com/EOSIO/eos/blob/master/libraries/chain/abi_serializer.cpp#L91
https://github.com/EOSIO/fc/blob/master/include/fc/io/raw.hpp

tbfleming commented 5 years ago

Nodeos, cleos, and most contracts are agnostic towards encoding. To them, a string is just a sequence of 8-bit bytes. Since both Linux and OSX use utf-8 as the default encoding, cleos gets utf-8 support for free. JavaScript has a more complicated story, since its string type holds a variant of utf-16.
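To illustrate the mismatch between JavaScript strings and the byte sequences the chain stores, here is a small sketch comparing utf-16 code unit counts with utf-8 byte counts. `utf8ByteLength` is a hypothetical helper name, and a Node-compatible `Buffer` is assumed.

```typescript
import { Buffer } from "buffer";

// Hypothetical helper: number of bytes a string occupies once encoded as utf-8,
// which is the form a byte-oriented consumer would see.
function utf8ByteLength(s: string): number {
  return Buffer.byteLength(s, "utf8");
}

// ASCII, a Latin-1 range letter, two CJK characters, and an emoji outside the BMP.
const samples: string[] = ["memo", "\u00e9", "\u65e5\u672c", "\ud83d\ude00"];

for (const s of samples) {
  // s.length counts utf-16 code units, not bytes and not user-visible characters.
  console.log(JSON.stringify(s), "utf-16 units:", s.length, "utf-8 bytes:", utf8ByteLength(s));
}
```

Only for pure ASCII do the two counts agree, which is why a JavaScript client needs an explicit conversion step while a byte-oriented tool does not.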

tbfleming commented 5 years ago

If you use a conversion which doesn't do utf-8, then non-ASCII characters will get mangled, since other clients encode strings as utf-8.
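A quick sketch of that mangling, using latin1 as a stand-in for "some other encoding" (an assumption chosen purely for illustration) and a Node-compatible `Buffer`:

```typescript
import { Buffer } from "buffer";

const memo = "caf\u00e9"; // "café"

// What a utf-8 client would put on chain: 63 61 66 c3 a9
const onChainBytes = Buffer.from(memo, "utf8");

// Reading those bytes back with the matching encoding restores the text.
const decodedAsUtf8 = onChainBytes.toString("utf8");

// Reading them as latin1 turns the two-byte sequence for "é" into two characters.
const decodedAsLatin1 = onChainBytes.toString("latin1");

// The reverse mistake: bytes written as latin1 (63 61 66 e9) are not valid utf-8,
// so a utf-8 decoder substitutes the replacement character U+FFFD.
const misread = Buffer.from(memo, "latin1").toString("utf8");

// Pure ASCII is unaffected because both encodings agree on it.
const asciiOnly = Buffer.from("transfer 42", "utf8").toString("latin1");

console.log({ decodedAsUtf8, decodedAsLatin1, misread, asciiOnly });
```

ASCII-only memos survive either way, so this kind of bug tends to show up only once real multilingual data arrives.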

mvayngrib commented 5 years ago

@tbfleming thanks!