eshaz / simple-yenc

Minimalist JavaScript binary string encoder / decoder with 1-2% overhead, compared to 33%-40% overhead for 6-bit encoding methods like Base64.
MIT License
17 stars 3 forks source link

deal with html overrides #2

Closed eyaler closed 2 years ago

eyaler commented 2 years ago

account for html char overrides to allow embedding stringified js template literals in html. see https://stackoverflow.com/a/10081375/664456 , https://https://html.spec.whatwg.org/multipage/parsing.html#table-charref-overrides

hi - i ran your test suite and it passes, however i am not sure how to add new test for the problem this patch solves. this is the first time i am running node.js and mocha so i am not sure about this, but afaiu we would need to run some browser emulation to test that the html override is resolved correctly. what do you think?

eyaler commented 2 years ago

thanks! to alleviate your concerns i can try to explore the option of not using a conversion table, but actually have it created dynamically by the browser for the codes 128-159.

eshaz commented 2 years ago

@eyaler That would be interesting if you can figure that out, but the conversion table should be fine. Reading the HTML standard document, it seemed like this will not change for other single byte encodings, for HTML at least.

Also, I released this change in version 0.1.0. I used a Map rather than an array to select the index for better performance and also added a test.

eyaler commented 2 years ago

@eshaz thanks! i see you shifted by 1 and added to 127. was there an issue with 128 or is this just your preference? also if you are using map you can add 65533 which is the override for nul. i know that nul is not allowed, but if you are interested i can open an issue/pr to optionally allow nul and LF, which i am using in my stringified use case. so Y-ENCode? :) the only thing that needs escaping is CR.

eshaz commented 2 years ago

i see you shifted by 1 and added to 127. was there an issue with 128 or is this just your preference?

I did that so the if statement could check for truthyness rather than undefined explicitly when determining when to add the value from the map or use the actual character code.

also if you are using map you can add 65533 which is the override for nul. i know that nul is not allowed, but if you are interested i can open an issue/pr to optionally allow nul and LF, which i am using in my stringified use case. so Y-ENCode? :) the only thing that needs escaping is CR.

Another PR is always welcome. I would ask that the existing test cases pass though before it can be merged. Technically, not escaping NULL and LF would break compatibility with any yEnc strings encoded with another tool assuming they follow the standard noted here: http://www.yenc.org/general.htm

eyaler commented 2 years ago

@eshaz If interested, I ended up implementing my ideas here: https://github.com/eyaler/ztml