aklomp / base64

Fast Base64 stream encoder/decoder in C99, with SIMD acceleration
BSD 2-Clause "Simplified" License

Base64url support? #66

Open jrch2k20 opened 4 years ago

jrch2k20 commented 4 years ago

Hi aklomp, I was searching for a nice Base64 library and ended up here, but my low-level C is a bit rusty, so I would like to ask: could this library be used to handle the base64url variant? Or could you at least give some tips on how to add it? (I would post patches back if I managed it.)

Thank you very much for your time.

aklomp commented 4 years ago

Hi! Currently this library only supports the standard Base64 alphabet.

It would probably be feasible, but difficult, to add support for additional alphabets. It could be non-trivial to add that support to the SIMD codecs, because they do arithmetic on the raw character values. Here's an encoder example and here's a decoder.

My intuition is that it would probably be possible to find an arithmetic-based solution that works with the alternative alphabet since the differences are small. However, it would require duplicating a lot of code, and adding some sort of user-visible flag to the API to indicate which alphabet to use. Maybe it could be a compile-time flag to not incur runtime penalties or complexities.

gfoidl commented 4 years ago

> since the differences are small. However, it would require duplicating a lot of code

Yeah, it's possible. I've done this for C#, but as @aklomp says, there's a lot of duplication, and the nice tricks applied to standard Base64 don't work as nicely with base64url, especially on the decoding side for input validation.

jrch2k20 commented 4 years ago

Thank you very much for your time.

Yeah, I see your point. It will probably be easier to handle the base64url-to-Base64 translation externally, on the C++17 side of things, since the chunks are small and this library already gives me a nice speedup, so I have some wiggle room.
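For readers wondering what such an external translation might look like, here is a minimal sketch in C. It is not part of this library: `b64url_to_b64` is a hypothetical helper name, and the sketch assumes the caller supplies an output buffer rounded up to a multiple of four bytes. It maps `-` to `+` and `_` to `/`, then restores the `=` padding that base64url input often omits. It does no validation, so standard `+` and `/` characters pass through unchanged and malformed lengths are left for the decoder to reject.

```c
#include <stddef.h>

/* Hypothetical helper, not part of aklomp/base64.
 * Copies `len` base64url characters from `src` to `dst`, mapping
 * '-' to '+' and '_' to '/', then appends '=' until the output
 * length is a multiple of 4.
 * `dst` must have room for ((len + 3) / 4) * 4 bytes.
 * Returns the number of bytes written. Performs no validation. */
static size_t b64url_to_b64(const char *src, size_t len, char *dst)
{
    size_t i;

    for (i = 0; i < len; i++) {
        char c = src[i];

        if (c == '-')
            c = '+';
        else if (c == '_')
            c = '/';
        dst[i] = c;
    }

    /* Restore the padding that base64url input may have dropped. */
    while (i % 4 != 0)
        dst[i++] = '=';

    return i;
}
```

The translated buffer can then be handed to the library's regular decoder, which only ever sees the standard alphabet.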

mayeut commented 4 years ago

There's an example of translation in https://github.com/mayeut/pybase64/blob/1e2f3ec63549085f06b3118671818edb969c1e3d/pybase64/_pybase64.c#L71

The translation is done in place for encoding and out of place for decoding. (Warning: the decoding translation is deliberately not safe, in order to mimic Python's behavior; cf. the inline comment.)
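As an illustration of the in-place idea on the encoding side, the following is a rough sketch and not the pybase64 code linked above. `b64_to_b64url_inplace` is a hypothetical name. Because the two alphabets differ in only two characters and both encodings have the same length, the standard output can be rewritten in the same buffer. Stripping the trailing `=` is an assumption made here; remove the final loop if padding should be kept.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from pybase64 or aklomp/base64.
 * Rewrites standard Base64 text in `buf` to the base64url alphabet
 * in place ('+' becomes '-', '/' becomes '_').
 * Returns the length with trailing '=' padding excluded; the padding
 * bytes themselves are left in the buffer beyond that length. */
static size_t b64_to_b64url_inplace(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '+')
            buf[i] = '-';
        else if (buf[i] == '/')
            buf[i] = '_';
    }

    /* Assumption: the caller wants unpadded base64url output. */
    while (len > 0 && buf[len - 1] == '=')
        len--;

    return len;
}
```

This would be called on the output buffer right after encoding with the standard alphabet, using the encoder's reported output length as `len`.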

jrch2k20 commented 4 years ago

Well, for now a very simple std implementation seems to do the job, with very small assembly output:

https://godbolt.org/z/8J2zKG

ashundi-tibco commented 3 years ago

What is being asked is to just accept base64url format when decoding. Replacing characters 62 and 63 is best done at https://github.com/aklomp/base64/blob/master/lib/tables/tables.c Supporting encoding would require more work...
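To make the table idea concrete, here is a standalone sketch. It does not reproduce the layout of the tables in this repository: `b64_build_lenient_table` and `B64_INVALID` are hypothetical names, and the sketch assumes a plain 256-entry lookup indexed by input byte. The table accepts both alphabets by mapping `-` to 62 and `_` to 63 alongside `+` and `/`.

```c
#include <stdint.h>
#include <string.h>

#define B64_INVALID 0xFF /* hypothetical marker for non-alphabet bytes */

/* Hypothetical table builder, not the layout used in lib/tables.
 * Fills `table` so that table[c] is the 6-bit value of character c,
 * or B64_INVALID if c belongs to neither alphabet. */
static void b64_build_lenient_table(uint8_t table[256])
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    memset(table, B64_INVALID, 256);

    for (int i = 0; i < 64; i++)
        table[(unsigned char)alphabet[i]] = (uint8_t)i;

    /* Also accept the base64url spellings of values 62 and 63. */
    table[(unsigned char)'-'] = 62;
    table[(unsigned char)'_'] = 63;
}
```

A change like this would only affect table-driven decoding. As noted earlier in the thread, the SIMD codecs do arithmetic on the raw character values, so they would need separate work.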