tuupola / base62

Base62 encoder and decoder for arbitrary data
MIT License
193 stars 19 forks

Which base62 algorithm does this implement? #29

Open coolaj86 opened 2 years ago

coolaj86 commented 2 years ago

My understanding is that there is no formal spec for base62, but that the "glowfall" implementation (despite its lack of stars) has become the de facto standard (the one used across the most repos).

Does this follow that spec? Or a different one? Or create a new one?

tuupola commented 2 years ago

I have not heard of glowfall before. This library implements a mathematical, byte-by-byte base conversion of arbitrary data. There is really only one way to do it. Standards such as base85, which have extra rules for compressing spaces etc., are not pure base conversions.

By default this library uses 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz as the character set. It is the same one that the GNU Multiple Precision Arithmetic Library (GMP) uses. GMP has been around since the early 1990s. I personally like this character set the most, since the encoded base62 strings preserve the sort order of the encoded values.
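
To make "pure base conversion" concrete, here is a minimal Python sketch of the idea. It is not this library's code. The function names `base62_encode` and `base62_decode` are made up for illustration, and the sketch assumes the input is read as one big-endian integer. Leading zero bytes are not preserved here; how a real implementation handles them is a separate design choice.

```python
# Character set quoted above: digits, then upper case, then lower case.
GMP_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(data: bytes, alphabet: str = GMP_ALPHABET) -> str:
    """Read data as one big-endian integer and write it in base 62."""
    number = int.from_bytes(data, "big")
    digits = []
    while number > 0:
        number, remainder = divmod(number, len(alphabet))
        digits.append(alphabet[remainder])
    # Digits are produced least significant first, so reverse them.
    # Limitation of this sketch: leading zero bytes are lost.
    return "".join(reversed(digits))


def base62_decode(text: str, alphabet: str = GMP_ALPHABET) -> bytes:
    """Inverse of base62_encode for inputs without leading zero bytes."""
    number = 0
    for char in text:
        number = number * len(alphabet) + alphabet.index(char)
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big")


if __name__ == "__main__":
    print(base62_encode(b"\xff\xff\xff\xff"))  # 4gfFC3
    print(base62_decode("4gfFC3"))             # b'\xff\xff\xff\xff'
```

Because the characters of this set appear in ASCII order, two encoded strings of the same length compare the same way as the numbers they encode. For example, `b"\x10\x00"` encodes to `144` and `b"\x20\x00"` to `288`.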

coolaj86 commented 2 years ago

I think I was mistaken. The GMP, GnuPG, and Saltpack implementations seem to be the most recognized.

Would you mind giving example output for your implementation compared to these references?

Raw   : "Hello, 世界"
Base64: SGVsbG8sIOS4lueVjA (18 chars)
Base62: 1wJfrzvdbuFbL65vcS (18 chars)

Raw   : "Hello World"
Base64: SGVsbG8gV29ybGQ (15 chars)
Base62: 73XpUgyMwkGr29M (15 chars)

Raw   : [ 0, 0, 0, 0, 255, 255, 255, 255 ]
Base64: AAAAAP____8 (11 chars)
Base62: 000004gfFC3 (11 chars)

Raw   : [ 255, 255, 255, 255, 0, 0, 0, 0 ]
Base64: _____wAAAAA (11 chars)
Base62: LygHZwPV2MC (11 chars)
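
For anyone who wants to check the two byte-array rows above, here is a minimal Python sketch that reproduces them. It rests on two assumptions that are not stated in the thread: the bytes are read as one big-endian integer using the 0-9A-Za-z character set, and the result is left-padded with `0` to ceil(8n / log2(62)) characters for n input bytes. The padding rule is inferred from the output lengths shown, and `base62_encode_padded` is a hypothetical name.

```python
import math

GMP_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode_padded(data: bytes) -> str:
    """Big-endian integer conversion, left-padded to a fixed width."""
    # Assumed width rule: enough base62 digits to hold 8 * len(data) bits.
    width = math.ceil(len(data) * 8 / math.log2(62))
    number = int.from_bytes(data, "big")
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(GMP_ALPHABET[remainder])
    # Reverse to most significant first, then pad with the zero digit.
    return "".join(reversed(digits)).rjust(width, GMP_ALPHABET[0])


if __name__ == "__main__":
    print(base62_encode_padded(bytes([0, 0, 0, 0, 255, 255, 255, 255])))  # 000004gfFC3
    print(base62_encode_padded(bytes([255, 255, 255, 255, 0, 0, 0, 0])))  # LygHZwPV2MC
```

With this width rule, an 8-byte input always yields 11 characters, an 11-byte input 15, and a 13-byte input 18, which matches the lengths listed above.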