zanaptak / BinaryToTextEncoding

A binary-to-text encoder/decoder library for .NET and Fable. Provides base 16, base 32, base 46, base 64, and base 91 codecs. Supports custom character sets.
MIT License

Zanaptak.BinaryToTextEncoding

Output example

Example of a random 16-byte array (same size as a GUID) encoded in each base:

Encoded bits per character

The base values in this library have been chosen because they can encode an integral number of bits as either 1 or 2 characters, making the conversion relatively efficient since groups of bits can be directly converted using lookup arrays.
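
The arithmetic behind this choice can be checked directly. The sketch below is not part of the library; `BitsPerCharacter` and `WholeBits` are hypothetical names used only for illustration. It computes the largest whole number of bits that fits in a group of 1 character (for the power-of-two bases) or 2 characters (for base 46 and base 91).

```csharp
using System;

static class BitsPerCharacter
{
    // Largest n such that 2^n <= radix^chars, i.e. the whole number of bits
    // that a group of `chars` characters in the given base can carry.
    public static int WholeBits(int radix, int chars)
    {
        long combinations = 1;
        for (int i = 0; i < chars; i++) combinations *= radix;

        int bits = 0;
        while ((1L << (bits + 1)) <= combinations) bits++;
        return bits;
    }

    public static void Main()
    {
        foreach (int radix in new[] { 16, 32, 46, 64, 91 })
        {
            // Power-of-two bases map bits onto a single character;
            // the other bases are grouped as character pairs.
            bool powerOfTwo = (radix & (radix - 1)) == 0;
            int chars = powerOfTwo ? 1 : 2;
            int bits = WholeBits(radix, chars);
            Console.WriteLine($"base {radix}: {bits} bits per {chars} character(s)");
        }
    }
}
```

Base 16, 32, and 64 carry 4, 5, and 6 bits in a single character. Base 46 fits 11 bits in a pair (46 × 46 = 2116, just above 2^11 = 2048) and base 91 fits 13 bits in a pair (91 × 91 = 8281, just above 2^13 = 8192), which works out to 5.5 and 6.5 bits per character.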

Usage

Add the NuGet package to your project:

dotnet add package Zanaptak.BinaryToTextEncoding

C#

using System;
using Zanaptak.BinaryToTextEncoding;

// Default codec
var originalBytes = new byte[] { 1, 2, 3 };
var encodedString = Base32.Default.Encode(originalBytes);
var decodedBytes = Base32.Default.Decode(encodedString);

// Custom character set
var customBase32 = new Base32("BCDFHJKMNPQRSTXZbcdfhjkmnpqrstxz");
var customOriginalBytes = new byte[] { 4, 5, 6 };
var customEncodedString = customBase32.Encode(customOriginalBytes);
var customDecodedBytes = customBase32.Decode(customEncodedString);

// Wrap output
var randomBytes = new byte[100];
new System.Random(12345).NextBytes(randomBytes);
Console.WriteLine(Base91.Default.Encode(randomBytes, 48));
//  Output:
//  r]g^oP{ZKd1>}lC{C*P){O96SL8z%0TW,4BfEof}%!b@a#:6
//  nN<c#=}80|srYHUy6$XP}4x945a~,ItFPS;U%a^<DMA]@m|#
//  12tC]*5+BoT-4Th,oVR9wvIv;Iym

F#

open Zanaptak.BinaryToTextEncoding

// Default codec
let originalBytes = [| 1uy; 2uy; 3uy |]
let encodedString = Base32.Default.Encode originalBytes
let decodedBytes = Base32.Default.Decode encodedString

// Custom character set
let customBase32 = Base32("BCDFHJKMNPQRSTXZbcdfhjkmnpqrstxz")
let customOriginalBytes = [| 4uy; 5uy; 6uy |]
let customEncodedString = customBase32.Encode customOriginalBytes
let customDecodedBytes = customBase32.Decode customEncodedString

// Wrap output
let randomBytes = Array.create 100 0uy
System.Random(12345).NextBytes(randomBytes)
printfn "%s" (Base91.Default.Encode(randomBytes, 48))
//  Output:
//  r]g^oP{ZKd1>}lC{C*P){O96SL8z%0TW,4BfEof}%!b@a#:6
//  nN<c#=}80|srYHUy6$XP}4x945a~,ItFPS;U%a^<DMA]@m|#
//  12tC]*5+BoT-4Th,oVR9wvIv;Iym

Notes

Built-in character sets

Base16
StandardCharacterSet (Default): Standard hexadecimal notation, ASCII-sortable
    0123456789ABCDEF
ConsonantsCharacterSet: Excludes numbers, vowels, and some confusable letters, ASCII-sortable
    BCDFHJKMNPQRSTXZ

Base32
StandardCharacterSet (Default): RFC 4648 section 6
    ABCDEFGHIJKLMNOPQRSTUVWXYZ234567
HexExtendedCharacterSet: RFC 4648 section 7, ASCII-sortable
    0123456789ABCDEFGHIJKLMNOPQRSTUV
ConsonantsCharacterSet: Excludes numbers, vowels, and some confusable letters, ASCII-sortable
    BCDFHJKMNPQRSTXZbcdfhjkmnpqrstxz

Base46
SortableCharacterSet (Default): Excludes vowels and some confusable characters, ASCII-sortable
    234567BCDFGHJKMNPQRSTVWXYZbcdfghjkmnpqrstvwxyz
LettersCharacterSet: Excludes numbers and some confusable letters, ASCII-sortable
    ABCDEFGHJKMNPQRSTUVWXYZabcdefghjkmnpqrstuvwxyz

Base64
StandardCharacterSet (Default): RFC 4648 section 4
    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
UrlSafeCharacterSet: RFC 4648 section 5
    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_
UnixCryptCharacterSet: Unix crypt password hashes, ASCII-sortable
    ./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

Base91
SortableQuotableCharacterSet (Default): Excludes " ' \ characters, ASCII-sortable
    !#$%&()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~

Base91Legacy
LegacyCharacterSet (Default): Original 'basE91' character set
    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%&()*+,./:;<=>?@[]^_`{|}~"

Legacy 'basE91' compatibility

This library provides two base 91 implementations: Base91 and Base91Legacy. They are not compatible; the encoded output of one cannot be decoded by the other.

The main Base91 algorithm works like the other BaseXX algorithms in the library. It uses constant-width encoding (each 2-character pair encodes exactly 13 bits) in big-endian order (most-significant character first, representing the most-significant bits of the most-significant byte). The default character set is in ASCII order so that encoded strings sort in the same order as their input, and it excludes the characters ", ', and \ to make the output easier to quote in programming languages.
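
To make the constant-width, most-significant-first mapping concrete, here is a minimal sketch of how a single 13-bit group could be turned into a character pair and back. It is an illustration under stated assumptions, not the library's implementation: `Base91PairSketch`, `EncodeGroup`, and `DecodeGroup` are hypothetical names, the alphabet is rebuilt from the description above (printable ASCII minus ", ', and \), and how the library handles leftover bits at the end of the input is not covered.

```csharp
using System;
using System.Linq;

static class Base91PairSketch
{
    // Printable ASCII (codes 33..126) minus " ' \ in ASCII order: 91 characters.
    static readonly char[] Alphabet = Enumerable.Range(33, 94)
        .Select(i => (char)i)
        .Where(c => c != '"' && c != '\'' && c != '\\')
        .ToArray();

    // Maps a 13-bit value (0..8191) to two characters, most-significant first.
    public static string EncodeGroup(int value13)
    {
        if (value13 < 0 || value13 > 0x1FFF)
            throw new ArgumentOutOfRangeException(nameof(value13));
        return new string(new[] { Alphabet[value13 / 91], Alphabet[value13 % 91] });
    }

    // Reverses EncodeGroup for a valid two-character pair.
    public static int DecodeGroup(string pair)
    {
        int high = Array.IndexOf(Alphabet, pair[0]);
        int low = Array.IndexOf(Alphabet, pair[1]);
        if (pair.Length != 2 || high < 0 || low < 0)
            throw new ArgumentException("Not a valid pair.", nameof(pair));
        return high * 91 + low;
    }
}
```

Because the alphabet is in ASCII order and the high character comes first, a larger 13-bit value always produces a pair that compares greater under ordinal string comparison, which is what keeps encoded output sortable.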

Base91Legacy is based on the previously existing basE91 algorithm. It encodes with a variable-width mechanism (some 2-character pairs can encode 14 bits instead of 13) which can result in slightly smaller encoded strings. Each two-character pair in the output is swapped compared to the main algorithm (least-significant char of the pair first), so sorting by string is not meaningful regardless of character set. Its default character set includes the " character, making it inconvenient to use in some programming languages and data formats such as JSON.
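
For comparison, the variable-width mechanism can be sketched from the publicly documented basE91 encoding loop. This is an illustrative reimplementation, not the code behind Base91Legacy; `BasE91Sketch` and its `Encode` method are hypothetical names, and the alphabet is copied from the legacy character set listed above.

```csharp
using System;
using System.Text;

static class BasE91Sketch
{
    // Legacy 'basE91' character set (91 characters, ends with the " character).
    const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789" +
        "!#$%&()*+,./:;<=>?@[]^_`{|}~\"";

    public static string Encode(byte[] data)
    {
        var output = new StringBuilder();
        uint queue = 0;  // pending bits, least-significant bits are consumed first
        int count = 0;   // number of pending bits in the queue

        foreach (byte b in data)
        {
            queue |= (uint)b << count;
            count += 8;
            if (count > 13)
            {
                uint value = queue & 0x1FFF;       // take 13 bits
                if (value > 88)
                {
                    queue >>= 13;
                    count -= 13;
                }
                else
                {
                    // Low 13 bits <= 88 means the 14-bit value is at most
                    // 8192 + 88 = 8280, which still fits below 91 * 91 = 8281.
                    value = queue & 0x3FFF;        // take 14 bits instead
                    queue >>= 14;
                    count -= 14;
                }
                // Least-significant character of the pair is written first.
                output.Append(Alphabet[(int)(value % 91)]);
                output.Append(Alphabet[(int)(value / 91)]);
            }
        }

        // Flush any remaining bits as one or two characters.
        if (count > 0)
        {
            output.Append(Alphabet[(int)(queue % 91)]);
            if (count > 7 || queue > 90)
                output.Append(Alphabet[(int)(queue / 91)]);
        }
        return output.ToString();
    }
}
```

With this sketch, the ASCII bytes of "test" encode to "fPNKd". The `value % 91` character is appended before the `value / 91` character, which is the pair swap that makes string sorting meaningless for this format.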

Benchmarks

See the benchmark project.