haskell-foundation / foundation


Should base64 encoding be in the String module? #387

Open tekul opened 7 years ago

tekul commented 7 years ago

The current APIs are of type String -> String, but it seems that in most cases base64 encoding would be used on primitive byte arrays rather than UTF-8 encoded strings. I would expect something like

toBase64 :: UArray Word8 -> String
fromBase64 :: String -> Either SomeError (UArray Word8)

Also, I'm not sure exposing the URL and OpenBSD versions is necessary there, as these are specific to crypto use cases. I'm just looking at writing slightly higher-level code for Argon2, which uses yet another encoding variant (standard table, but unpadded). So before I look at porting the code from memory, it'd be nice to clear up what the APIs should look like.

I think what I'd like to have is something like toBase64Internal, but with the scary Addr# argument hidden behind some Base64Scheme type, plus a similar decoding function going the other way. All combinations would then be available to anyone who really wanted to use them, and we could get rid of functions like toBase64UrlUnpadded and retain just toBase64 and fromBase64 for the most common standard case.
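To make the Base64Scheme idea concrete, here is a minimal sketch of the encoding half. Everything in it is an assumption for illustration: the record fields, the scheme values, and encodeWith are hypothetical names, not foundation's API, and plain lists of Word8 stand in for UArray Word8 so that the example depends only on base. Decoding is left out.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- Hypothetical scheme type: bundles the alphabet and the padding rule,
-- so callers never see a raw table address.
data Base64Scheme = Base64Scheme
  { schemeAlphabet :: String      -- exactly 64 ASCII characters
  , schemePadding  :: Maybe Char  -- Nothing for unpadded variants
  }

standard, standardUnpadded, urlUnpadded :: Base64Scheme
standard =
  Base64Scheme (['A'..'Z'] ++ ['a'..'z'] ++ ['0'..'9'] ++ "+/") (Just '=')
-- Standard table without padding (the Argon2-style variant mentioned above).
standardUnpadded = standard { schemePadding = Nothing }
urlUnpadded =
  Base64Scheme (['A'..'Z'] ++ ['a'..'z'] ++ ['0'..'9'] ++ "-_") Nothing

-- Encode bytes with the given scheme, three input bytes at a time.
encodeWith :: Base64Scheme -> [Word8] -> String
encodeWith (Base64Scheme alphabet pad) = go
  where
    -- Look up one 6-bit value; a list index is slow but keeps the sketch short.
    sym :: Int -> Char
    sym i = alphabet !! (i .&. 0x3f)

    go (a:b:c:rest) =
      let n = (fromIntegral a `shiftL` 16)
                .|. (fromIntegral b `shiftL` 8)
                .|. fromIntegral c :: Int
      in sym (n `shiftR` 18) : sym (n `shiftR` 12)
           : sym (n `shiftR` 6) : sym n : go rest
    go [a, b] =
      let n = (fromIntegral a `shiftL` 16)
                .|. (fromIntegral b `shiftL` 8) :: Int
      in [sym (n `shiftR` 18), sym (n `shiftR` 12), sym (n `shiftR` 6)]
           ++ padding 1
    go [a] =
      let n = fromIntegral a `shiftL` 16 :: Int
      in [sym (n `shiftR` 18), sym (n `shiftR` 12)] ++ padding 2
    go [] = []

    -- Emit k padding characters, or none when the scheme is unpadded.
    padding k = maybe [] (replicate k) pad
```

With this shape, toBase64 would just be encodeWith standard, and the less common variants stay reachable without each needing its own exported function.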

vincenthz commented 7 years ago

Just a quick comment until I come back: it's not a simple thing type/API-wise, it's quite overloaded; the same goes for Base16/hexadecimal. One thing that would be nice is to use the AsciiString type. This type should be able to convert to String or UArray cheaply, without making the user go through a conversion that is not supposed to fail (because Base64 or Base16 output is ASCII, and therefore valid UTF-8) but could fail API-wise (in general, converting a byte array to a String can fail).
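A rough sketch of why an ASCII-only type helps here. The Ascii newtype and every function below are hypothetical stand-ins written against base only; they are not foundation's AsciiString API. The point is that the only fallible step is checking arbitrary bytes, while an encoder can build the value directly and both conversions out of it are total.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical ASCII-only wrapper. Invariant: every byte is below 0x80.
newtype Ascii = Ascii [Word8]
  deriving (Eq, Show)

-- Checking arbitrary bytes is the one place where failure is possible.
fromBytes :: [Word8] -> Maybe Ascii
fromBytes bs
  | all (< 0x80) bs = Just (Ascii bs)
  | otherwise       = Nothing

-- Total: ASCII bytes are always valid text, so no Either or Maybe here.
asciiToString :: Ascii -> String
asciiToString (Ascii bs) = map (chr . fromIntegral) bs

-- Total: the bytes are already there.
asciiToBytes :: Ascii -> [Word8]
asciiToBytes (Ascii bs) = bs

-- A Base16 encoder can return Ascii directly, because its output
-- alphabet is ASCII by construction.
hexEncode :: [Word8] -> Ascii
hexEncode = Ascii . concatMap byteToHex
  where
    byteToHex w = [digit (w `shiftR` 4), digit (w .&. 0x0f)]
    digit n
      | n < 10    = 0x30 + n  -- '0'..'9'
      | otherwise = 0x57 + n  -- 'a'..'f'
```

A Base64 encoder could return the same type, leaving it to the caller to pick asciiToString or asciiToBytes with no error handling on either path.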

vincenthz commented 6 years ago

Sorry, forgot to come back here. I struggle to find a good way to expose this, but having it strictly in the String module is not ideal; clearly base64 works on String because a String has a byte representation.

It would be nice to find a way to express that base64, at least for encoding, preserves the ASCII-ness property, which makes it friendly to String.