aiq / basexx

A Lua library which provides base2(bitfield), base16(hex), base32(crockford/rfc), base64(rfc/url), base85(z85) decoding and encoding.
MIT License

Base64 decoding fails on Unicode surrogate pairs #2

Closed platinummonkey closed 8 years ago

platinummonkey commented 8 years ago

Base64 decoding (and probably the other decoders) fails on encoded strings that contain Unicode surrogate pairs. An example:

-- lua
local basexx = require "basexx"
print(basexx.from_base64("eyJpc3MiOiJleGFtcGxlLmNvbSIsInN1YiI6InRlc3QiLCJleHAiOjAsImlhdCI6MCwiY29tcGFueSI6Ik15X0F3ZXNvbWUtQ29tcGFueSA_KDEpWzJdezN9ICYgU29ucyJ9="))
./basexx.lua:94: attempt to perform arithmetic on local 'index' (a nil value)
stack traceback:
    ./basexx.lua:94: in function <./basexx.lua:88>
    (tail call): ?
    stdin:1: in main chunk
    [C]: ?

Even though the string contains Unicode surrogate pairs, I would still expect at least the following output:

# python
import base64
base64.b64decode("eyJpc3MiOiJleGFtcGxlLmNvbSIsInN1YiI6InRlc3QiLCJleHAiOjAsImlhdCI6MCwiY29tcGFueSI6Ik15X0F3ZXNvbWUtQ29tcGFueSA_KDEpWzJdezN9ICYgU29ucyJ9=")
{"iss":"example.com","sub":"test","exp":0,"iat":0,"company":"My_Awesome-Company \n\x0cJV\xcc\x97^\xcc\xdfH\t\x88\x14\xdb\xdb\x9c\xc8\x9f'

platinummonkey commented 8 years ago

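Never mind! Ha, this string uses the URL-safe base64 (url64) encoding scheme, not standard base64; the `_` in it is part of the URL-safe alphabet. (Doh!)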
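
For anyone landing here with the same error, a minimal sketch of decoding the string above as URL-safe base64, in Python like the comparison earlier in the thread. The helper name `decode_url64` is made up for illustration. The URL-safe alphabet uses `-` and `_` where standard base64 uses `+` and `/`, which is why a standard decoder stumbles on the `_`. The sample is already 132 characters before its trailing `=`, so that `=` is surplus; the helper strips any padding and re-adds the correct amount rather than relying on the decoder to tolerate it.

```python
import base64
import json


def decode_url64(text):
    """Decode URL-safe base64, tolerating missing or surplus '=' padding.

    Hypothetical helper for illustration; not part of basexx.
    """
    stripped = text.rstrip("=")
    # Pad back up to a multiple of four characters.
    padded = stripped + "=" * (-len(stripped) % 4)
    return base64.urlsafe_b64decode(padded)


# The string from the report, including its trailing '='.
token = (
    "eyJpc3MiOiJleGFtcGxlLmNvbSIsInN1YiI6InRlc3QiLCJleHAiOjAsImlhdCI6MCwi"
    "Y29tcGFueSI6Ik15X0F3ZXNvbWUtQ29tcGFueSA_KDEpWzJdezN9ICYgU29ucyJ9="
)

payload = json.loads(decode_url64(token).decode("utf-8"))
print(payload["company"])  # My_Awesome-Company ?(1)[2]{3} & Sons
```

The decoded payload is plain ASCII JSON, so no surrogate pairs are involved. On the Lua side, basexx's URL-safe decoder (`from_url64`, as far as I know) should be the counterpart to `from_base64` here; I have not checked how it treats the surplus `=`.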