pjkundert / ezpwd-reed-solomon

Reed-Solomon & BCH encoding and decoding, in C++, Javascript & Python
https://hardconsulting.com/products/13-reed-solomon
Other
99 stars 21 forks source link

-- coding: utf-8 --

+TITLE: EZPWD Reed-Solomon

+EXPORT_FILE_NAME: README.pdf

+STARTUP: org-startup-with-inline-images inlineimages

+OPTIONS: ^:nil # Disable sub/superscripting with bare ; {...} still works

+OPTIONS: toc:nil

+LATEX_HEADER: \usepackage[margin=1.0in]{geometry}

+BEGIN_ABSTRACT

Implements C++ classes for compile-time optimized Reed-Solomon and BCH codecs.

+END_ABSTRACT

+TOC: headlines 3

** Installing

EZPWD Reed-Solomon has no compile-time dependencies, other than the basic C library.

*** Nix Support

Installation via Nix is supported via:
: nix-build -A ezpwd-reed-solomon

** Performance

The Reed-Solomon codec's performance is often a critical determinant in product performance, and often influences expensive decisions such as CPU selection for embedded applications.

The =ezpwd::RS<...>= codec corrects data in-place to avoid unnecessary copying, using shared data tables between RS instances with compatible size, parity and Galois field parameters. The use of C++ template parameters for size-related Reed Solomon parameters enables the compiler to emit optimal code for all internal table lookups, sometimes dramatically improving performance.

*** 25-40% Faster Than Competing R-S Codecs

EZPWD Reed-Solomon performs R-S correction operations (using g++ on an Intel
i7) about 40% faster faster than Phil Karn's general case code, and about
25% faster than Phil's optimized code for 8-bit/CCSDS symbols.  It also
tests at about 25% faster than the Shifra C++ Reed-Solomon library.  It
appears that, for certain applications at least, EZPWD Reed-Solomon may be
the fastest Open Source general-purpose C++ Reed-Solomon library presently
available.

** Availability

Source code is hosted at [[https://github.com/pjkundert/ezpwd-reed-solomon]]. It is heavily tested under both g++ (version 4.8 to 5.1) and clang++, with full optimization enabled. Get details at [[http://hardconsulting.com/products/13-reed-solomon][Hard Consulting Corporation (hardconsulting.com/products)]], and learn about competing geolocation encoding and why [[http://hardconsulting.com/research/15-geolocation-encoding][EZCOD Geolocation Encoding]] (hardconsulting.com/research)]] is necessary.

*** C++ Build Requires Header Source Only

Carefully implemented as C++11 standard inline code, to require no C++ object-code or library
build procedure.  As a result, integration into existing C++ code-bases and build systems is
extremely simple.

Checkout =https://github.com/pjkundert/ezpwd-reed-solomon.git=, and add the following to your
C++ build to gain access to the various =#include <ezpwd/...>= include files in your C++ code:
: CXXFLAGS += -I ezpwd-reed-solomon/c++

If you plan to =#include <ezpwd/bch>=, also add:
: CXXFLAGS += -I ezpwd-reed-solomon/c++/ezpwd/bch_include
and for non-Linux systems where linux/errno.h is unavailable, also add:
: CXXFLAGS += -I ezpwd-reed-solomon/c++/ezpwd/bch_errnums

| Module       | Include       | Description                | Requires                        |
|--------------+---------------+----------------------------+---------------------------------|
| Reed-Solomon | <ezpwd/rs>    | Reed-Solomon codec         | -I c++                          |
| BCH          | <ezpwd/bch>   | Using BCH codec            | -I c++ -I c++/ezpwd/bch_include |
|              |               | on non-Linux systems, add: | -I c++/ezpwd/bch_errnums        |
| EZCOD        | <ezpwd/ezcod> | EZCOD geolocation codec    | -I c++                          |

*** Javascript Hosted by MaxCDN

Production minified Javascript is served by MaxCDN, using a URL like:

[[ https://cdn.rawgit.com/pjkundert/ezpwd-reed-solomon/v1.8.0/js/ezpwd/ezcod.js]].

For example, to use the EZCOD position encoding API in your website, add this at the end of your
website template, before your application.js (see Asynchronous Loading, below, for =async=
loading and =jQuery= integration):
#+BEGIN_SRC HTML
<script type="text/javascript"
        src="https://github.com/pjkundert/ezpwd-reed-solomon/raw/master//cdn.rawgit.com/pjkundert/ezpwd-reed-solomon/v1.8.0/js/ezpwd/ezcod.js"
        srctest="static/js/ezcod.js"> <!-- access a sym-link to locally built version if desired -->
</script>
#+END_SRC

Note the current <VERSION> number encoded in the URI as =.../v<VERSION>/...".

Your application may permanently access any current or historical version of
the EZPWD Reed-Solomon Javascript which has a Tag in the official Git repo,
by encoding the <VERSION> number in the URL: 
: //cdn.rawgit.com/.../v<VERSION>/...

*** Python Interface using Swig

A high-performance Python 2/3 interface is provided, using [[http://www.swig.org/][Swig]] (requires at
least Swig Version 3.0.5 for C++11 support).  Since the Python module is
generated from the C++ interface, it must be generated and compiled using
the appropriate target OS, Python and C++ compiler implementation.

Therefore, there is no =pip install ezpwd_reed_solomon= available for the
Python bindings; they must be built and installed from the
ezpwd-reed-solomon source, on the target platform, using the platform's C++
compiler with C++11 support.

To build and install the =ezpwd_reed_solomon= package, obtain the source
from https://github.com/pjkundert/ezpwd-reed-solomon, and build with the
version of Python (2 or 3) you wish to support:
: $ cd ezpwd-reed-solomon/swig/python
: $ python setup.py install
or:
: $ cd ezpwd-reed-solomon
: $ make swig-python

Presently, only =ezpwd_reed_solomon.ezcod= is available; see the [[EZCOD:
Location Code API]] section for Python API details.

**** Swig 3.0.5+

 To install the Python API, you'll need a modern Swig.

 On Mac, use homebrew:
 : $ brew install swig

 On Linux Debian or Ubuntu Linux systems, you should be able to use
 something like this (other Linux variants should have similar package
 installation facilities):
 : $ apt-get -u install autoconf autogen libpcre3-dev bison yodl
 : $ git clone git@github.com:swig/swig.git
 : $ cd swig
 : $ autoconf
 : $ ./autogen.sh
 : $ ./configure && make && make install

** Licensing

All ezpwd-reed-solomon Reed-Solomon API code is available under both GPLv3 and Commercial licenses. Phil's original Reed-Solomon code is LGPL, so my Reed-Solomon implementation in =.../c++/ezpwd/rs_base= (which uses Phil's, with some improvements and conversion to C++) is available under the terms of the LGPL. However, my =ezpwd::RS<...>= implementation (found in =.../c++/ezpwd/rs=) may be used either under GPLv3+ or Commercial licenses, but not under LGPL.

The BCH implementation is based on Ivan Djelic's excellent implementation, also used in the Linux Kernel -- but licensed GPLv2+ (see: [[https://github.com/Parrot-Developers/bch]]). This means that off of ezpwd-reed-solomon's BCH APIs must be licensed GPLv2 or (at our option) any newer GPL version; we choose to license our implementation GPLv3+.

*** GPLv3+ Licensing

If your application complies with the terms of the GPLv3, then you can use
EZPWD Reed-Solomon based APIs without cost.  All users of your software
(eg. an installed application) or "software as a service" (eg. a website)
must have access to all of the software source code so they can freely
modify, rebuild and run the software.  Any modifications to underlying GPLv3
software (ie. EZPWD) must also be made available.

*** Commercial Licensing and Pricing

If you use any of the EZPWD Reed-Solomon based APIs in your product but you don't wish to make
your product's or website's source code available, then a Commercial license from [[http://hardconsulting.com/products/13-reed-solomon][Hard
Consulting Corporation (hardconsulting.com)]] is available.  Annual support (for either Commercial
or GPL projects is also available).  The pricing breakdown is as follows (in USD$):

|------------+-------+---------+---------------------------------|
| Users avg. | Price | Support | Included application assistance |
| (monthly)  |  USD$ | USD$/yr |                                 |
|------------+-------+---------+---------------------------------|
| <1K        |   100 |      25 | Interesting project? ask... :)  |
| 1K-1M      |  1000 |     250 | Up to 4 hour                    |
| >1M        | 10000 |    2500 | Up to 8 hours                   |
|------------+-------+---------+---------------------------------|

Use of the EZCOD robust geolocation encoding module of EZPWD Reed-Solomon is
free, forever, for any application.  It is available under both GPLv3 and
free Commercial licenses, and may even be re-implemented freely in any
language, so long as it remains compatible (includes the Reed-Solomon error
correction, and equivalent encoding and decoding of Latitude and Longitude
coordinates).

Call us at +1-780-970-8148 or email us at info@hardconsulting.com to discuss
your application.

** Enhancements

Several enhancements have been made to Phil's implementation.

*** Rejects impossible error position

Phil's version allows the R-S decode to compute and return error positions
with the unused portion of the Reed-Solomon codeword.  We reject these
solutions, as they provide indication of a failure.

The supplied data and parity may not employ the full potential codeword size
for a given Reed-Solomon codec.  For example, and RS(31,29) codec is able to
decode a codeword of 5-bit symbols containing up to 31 data and parity
symbols; in this case, 2 parity symbols (31-29 == 2).

If we supply (say) 9 data symbols and 2 parity symbols, the remaining 20
symbols of unused capacity are effectively filled with zeros for the
Reed-Solomon encode and decode operations.

If we decode such a codeword, and the R-S Galois field solution indicates an
error positioned in the first 20 symbols of the codeword (an impossible
situation), we reject the codeword and return an error.

*** Shared data tables w/ no locking required

Instead of re-computing all of the required data tables used by the
Reed-Solomon computations, every instance of RS<CAPACITY,*> with compatible
Galois polynomial parameters shares a common set of tables.  Furthermore,
every instance of RS<CAPACITY,PAYLOAD> w/ compatible Galias polynomial
parameters shares the tables specific to the computed number of parity
symbols.

The initialization of these tables is intrinsically thread-safe.

** ezpwd::RS<...>: C++ Reed-Solomon API

C++ implementation of Reed-Solomon codec. Fully implemented as inline code, in C++ header files. Highly performant, in both C++ and Javascript.

+BEGIN_SRC C++

include <ezpwd/rs>

// Reed Solomon codec w/ 255 symbols, up to 251 data, 4 parity symbols ezpwd::RS<255,251> rs;

std::vector data;

// ... fill data with up to 251 bytes ...

rs.encode( data ); // Adds 4 Reed-Solomon parity symbols (255-251 == 4)

// ... later, after data is possibly corrupted ...

int fixed = rs.decode( data ); // Correct errors, and data.resize( data.size() - rs.nroots() ); // Discard the 4 R-S parity symbols

+END_SRC

See =rssimple.C= for some basic examples. Note that =std::vector data= is adequate for Reed-Solomon "symbols" of up to 8 bits (eg. =ezpwd::RS<32,...>=, =ezpwd::RS<255,...>=). If you use =ezpwd::RS<511,...>= to =ezpwd::RS<65534,...>= (9-bit to 16-bit Reed-Solomon symbols), you'll need to provide vectors of =uint16_t= data to contain the larger symbols.

*** Constructing an RS(SIZE,PAYLOAD) Instance

When you decide on an N-bit symbol, how do you decide on and create an
instance of a Reed-Solomon codec (coder/decoder) appropriate for your data
payload?

Chose your R-S Codeword symbol bit size and hence your R-S
Codeword =SIZE=.  Then decide how many erroneous/missing symbols you need to be
able to correct for and hence your number of =PARITY= symbols required.

Now you have =SIZE=, and =PAYLOAD= is =SIZE-PARITY=.

Finally, break your data into chunks of at most =LOAD= (chunks of size =<
LOAD= will be internally considered to be padded with NUL/0 symbols; you
don't need to provide exactly =LOAD=-sized chunks).

For example, for 8-bit symbols, =SIZE= = 2^8-1 = 255, and for 4 symbols of
=PARITY=, =LOAD= = 255-4 = 251.  Therefore, the notation for the
Reed-Solomon codec is RS(255,251), and the C++ declaration for such a R-S
codec is:
#+BEGIN_SRC C++
ezpwd::RS<255,251> rs; // Up to 251 symbols data load; adds 4 symbols parity
#+END_SRC

**** Codeword =SIZE= is always 2^N-1

 For example, 8-bit symbols always use an RS(255,255-PARITY) codec.  For 5-bit
 symbols (or, to correct only the bottom 5 bits of a larger symbol), you
 would use an RS(31,31-PARITY) codec.

**** Codeword =PARITY= may be from 1 to =SIZE=-1

 You may specify an R-S codec specifying a codeword with as little as 1
 symbol of data payload and the remainder R-S parity, to as little as 1
 symbol of parity and the remainder data payload.

*** Encoding Your Data w/ =PARITY= R-S Parity Symbols

The encode method can add symbols to a =std::string= or =std::vector<T>=
(where =T= is =uint8_t= or =uint16_t=) container:
#+BEGIN_SRC C++
std::string data( "Hello, world!" )
rs.encode( data ); // Add the 4 R-S parity symbols to data
#+END_SRC

Alternatively, you can keep the parity separate, and not interfere with the
original data (the container is not resized):
#+BEGIN_SRC C++
std::string data( "Hello, world!" );
std::string parity;
rs.encode( data, parity ); // resize and place rs.nroots() parity symbols
#+END_SRC

Or, if you provide a fixed-size =std::array<size_t,T>=, it will presume that the
space for parity must already there at the end:
#+BEGIN_SRC C++
std::array<17,uint8_t> data(
    'H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd', '!', // 13
    'x', 'x', 'x', 'x' );                                            // 4
rs.encode( data ); // Place the 4 R-S parity symbols at end of data
#+END_SRC

Or, pass pairs of =uint8_t= or =uint16_t= iterators into any container or
buffer you desire:
#+BEGIN_SRC C++
std::vector<uint8_t> data( 255 );
// Fill data with 251 bytes of payload, eg:
for ( uint8_t i = 0; i < 251; ++i )
    data[i] = i;
// Append 4 symbols of R-S parity, using pairs of iterators
rs.encode( std::make_pair( data.cbegin(),       data.cbegin() + 251 ),
           std::make_pair( data.begin()  + 251, data.begin()  + 255 ))
#+END_SRC

*** Decoding Data w/ Corrupt/Missing Symbols

Once your data payload+parity is received, it may contain unknown erroneous
symbols (called "errors"), or known missing symbols (called "erasures").
Erasures are easier to correct (because we know their location), to they
only consume one R-S parity symbol to correct.  Unknown errors, however, are
lost in both position and value, so they each consume 2 R-S parity symbols
to correct.

If the R-S algorithm can correct any errors and erasures  present and
recover a valid R-S "codeword", it will report a positive value:
#+BEGIN_SRC C++
int correct = rs.decode( data );
if ( correct >= 0 )
    std::cout << "Recovered data w/ " << correct << " errors" << std::endl
else
    std::cout << "Failed to recover data; " << rs << " overwhelmed" << std::endl;
#+END_SRC

If desired, you can pass erasure positions, and get back recovered error
positions (remember that =erasures= symbols reported missing might not
actually be incorrect, so might not be reported back in =position=!):
#+BEGIN_SRC C++
std::vector<int> erasures = { 1 }; // Report second symbol missing
std::vector<int> position; // And get back corrected symbols here
int correct = rs.decode( data, erasures, &position );
#+END_SRC

*** Discard The =PARITY= R-S Parity Symbols

In all cases where =rs.encode()= has added symbols to a resizable
=std::string= or =std::vector<T>= container, it is your responsibility to
remove them after =rs.decode()= finishes.  The =rs.nroots()= method reports
the number of parity symbols.

** Build Dependencies

To support building on non-Linux platforms, [[https://github.com/pjkudert/bch.git][we have added a linux/errno.h]] file to Djelic's original version.

To use =ezpwd/bch= in your C++ code, you must add the following to your C++ compilation (=CXXFLAGS=): : -I c++/ezpwd/bch_include

This includes all of the "shim" include files required to compile Djelic's Linux kernel BCH implementation in a non-Kernel environment.

** ezpwd::...bch: C++ BCH API

A C++ implementation in many ways similar to the ezpwd::RS<...> is provided. There are 3 classes (=ezpwd::bch_base=, =ezpwd::bch<...>= and =ezpwd::BCH<...>=), but the recommended one is =ezpwd::BCH<...>=.

Creating a BCH codec w/ precisely the desired codeword size, payload and bit-error correction capacity (constructor throws exception if no match BCH codec is available): : #include <ezpwd/bch> : ezpwd::BCH<255,239,2> bch_codec; // By Codeword, Payload and Correction capacities, exactly

Encoding into a container of uint8_t: : std::vector codeword = { 0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF } // 8 data : bch_codec.encode( codeword ) // + 2 parity added

Decoding (and correcting errors) : int corrections = bch_codec.decode( codeword ); : assert( corrections > 0 ); // fail if BCH decode failed : codeword.resize( codeword.size() - bch_codec.ecc_bytes() ); // discard parity

The =encoded= and =decoded= methods return a copy of the supplied std::string or =vector=/=array= container of uint8_t, optionally with a =std::vector= of error positions filled in. The =encoded= adds the parity; =decoded= corrects the errors, optionally filling in the positions.

Enhance some raw data w/ BCH parity: : std::vector data = { 0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF } // 8 data : std::vector codeword = bch_codec.encoded( data ); // 8 data + 2 parity

Introduce an error into the parity-enhanced BCH codeword, and ensure that the recovered error positions matches the expected number and location of the error introduced: : std::vector erroneous = codeword; : erroneous[1] ^= 1<<3; // introduce an error in the 4rd bit of the 2nd byte; the 12th bit (bit index 11) : std::vector positions; : std::vector corrected = bch_codec.decoded( erroneous, positions ); : assert( corrected == codeword && positions.size() == 1 && positions[0] == 11 );

*** Classic Djelic Linux Kernel API The stock Linux Kernel C API is retained as-is, and is made available in the =ezpwd::= C++ namespace. Initializing a BCH codec:

+BEGIN_SRC C++

#include <ezpwd/bch_base>
// Allocate a BCH codec w/ 2^8-1 == 255 bit codeword size, and 2 bits of correction capacity.
// This results in a BCH( 255, 239, 2) codec: 255-bit codeword, 239-bit data payload capacity,
// hence 255-239 == 16 bits of parity.
ezpwd::bch_control *bch = ezpwd::init_bch( 8, 2, 0 );
#+END_SRC

Run =bch_test= to see all available BCH codec.

Encoding parity bits on the end of an existing message is performed something like this:

#+BEGIN_SRC C++
std::array<uint8_t,10>      codeword= {
    0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF,     // data
    0, 0 };                         // parity, initialized to 0
unsigned int len = 8;
uint8_t *data = &codeword[0];
uint8_t *parity = &codeword[len];
ezpwd::encode_bch( bch, data, len, parity );
#+END_SRC

Decoding and correcting using the convenience API that receives error locations and applies
correction(s) to supplied data:
#+BEGIN_SRC C++
int corrections = correct_bch( bch, data, len, parity );
#+END_SRC

Of course, the stock Linux Kernel API is available; it does not correct in-place, and the caller
must perform the bit-error corrections at the error locations detected by the API:
#+BEGIN_SRC C++
unsigned int errloc[2]; // must be at least bch->t in size
int corrections = decode_bch( bch, data, len, parity, 0, 0, errloc );
for ( int n = 0; n < corrections; ++n )
    if ( errloc[n] < 8*len )
        data[errloc[n]/8] ^= 1 << (errloc[n] % 8);
#+END_SRC

See =bchsimple.C= and =bch_test.C= for some basic examples, and =bch_itron.C= for a more advanced
implementation in a real protocol.

** Javascript Library: js/ezpwd/rskey.js

: rskey_encode( , data, [ sep ] ) -- encode data to RSKEY : rskey_decode( , key ) -- decode RSKEY

PARITY of 2-5 is supported, with a maximum capacity of 31-PARITY bytes of base-32 encoded data (raw data expands by the factor ( * 8 + 4 ) / 5 when base-32 encoded). With PARITY 2, the maximum capacity is 18 bytes; with PARITY 5, 16 bytes.

The =data= may be an ArrayBuffer of byte-length <= ==. If a string is supplied, it may be a hex string beginning with '0x...' (all subsequent pairs of hex digits are used; any data beyond that is ignored). Otherwise, the string is decoded as utf-8 (of course, this means that you can't supply a utf-8 string that starts with '0x'...).

The optional =sep= parameter (default 5) is the cluster size to separate the RSKEY into; 0 specifies no separators.

Load the rskey.js Javascript into your project (see Asynchronous Loading, below, for =async= loading and =jQuery= integration):

+BEGIN_SRC HTML

<script type="text/javascript" src="//cdn.rawgit.com/pjkundert/ezpwd-reed-solomon/v1.8.0/js/ezpwd/rskey.js" srctest="static/js/rskey.js">

+END_SRC

Use rskey.js's API to encode your data into an easily human readable key. Call the =rskey__encode= API (with PARITY 2-5), specify the number of bytes of data to encode in the RSKEY's payload, and provide some data to encode (as a hex string "0x3344...", or as a utf-8 string):

+BEGIN_SRC Javascript

rskey_5_encode( 12, "Mag.1ckπ" ); "9MGNE-BHHCD-MVY00-00000-MVRFN"

+END_SRC

Later, you can decode it -- even if the user adds an error or two (the 'X', below), or skips a few symbols (if some were unreadable, as indicated by an '_', or the last few are simply not yet entered). Each error consumes 2 parity symbols, each erasure or missing symbol uses 1, therefore 1 error + 2 erasures results in 20% of parity remaining for validation:

+BEGIN_SRC Javascript

rskey_5_decode( 12, "9MGNE-BHHCD-MVY00-00000-MVRFN" ) {confidence: 100, data: ArrayBuffer, utf8: "Mag.1ckπ", hex: "0x4D61672E31636BCF80000000"} rskey_5_decode( 12, "9MGNE-BHHCD-MVY00-00X00-MVR" ) // 1 error, 2 not yet entered {confidence: 20, data: ArrayBuffer, utf8: "Mag.1ckπ", hex: "0x4D61672E31636BCF80000000"} rskey_5_decode( 12, "9_GNE-BHHD-MVY00-00X00-MVRFN" ) // 1 error, 2 unreadable w/ '' {confidence: 20, data: ArrayBuffer, utf8: "Mag.1ckπ", hex: "0x4D61672E31636BCF80000000"}

+END_SRC

If you have raw numeric data (eg. record IDs, data HMACs, etc), use the ArrayBuffer interface. You can supply any type of raw data, up to the capacity of the RSKEY (12 bytes, in this case). Then, even if errors are introduced on entry, they will be recovered if the parity is sufficient, and the returned Object's .data property will be an ArrayBuffer containing the original binary data, which you can used a TypedArray to access:

+BEGIN_SRC Javascript

ia = new Int32Array([0x31323334, 0x41424344, 0x51525354]) [825373492, 1094861636, 1364349780] rskey_5_encode( 12, ia.buffer ) // raw capacity is 12 bytes, w/ 5 parity "6GRK4-CA48D-142M2-KA98G-V2MYP" dec=rskey_5_decode( 12, "6GRK4-CA48D-142M2-KA98G-V2XXP" ) // XX are errors {confidence: 20, data: ArrayBuffer, utf8: "4321DCBATSRQ", hex: "0x343332314443424154535251"} new Int32Array( dec.data ) // recover original data [825373492, 1094861636, 1364349780]

+END_SRC

** RSKEY Demo: http://rskey.hardconsulting.com

Try changing the Parity, Data Size and Data. Try changing the Key by entering some _ (indicating a missing/invalid symbol). These are called Erasures in Reed-Solomon terms, and we can recover one Erasure with each Parity symbol. Try changing some Key values to incorrect values. These Reed-Solomon Errors each require 2 Parity symbols to detect and correct.

You can also access the Console (right click, select Inspect Element, click on "Console"), and enter the above =rskey_=... API example code.

** Example Node.JS: Encrypted Gift Card Codes

Lets say you have an online Widget business, and generate gift cards. You average about 5000 unique visitors/month over the year, with a peak of 25000 around Christmas. You want to make your gift card redemption more reliable and secure, and less painful for your clients.

Your RSKEY license cost would be $100, plus a $25/yr support subscription, and you would have access to an hour of time with a support developer to help you apply the js/ezpwd/rskey.js API to your website's gift card generation and redemption pages.

You decide to associate each gift card with the buyer's account (so you and the gift-card giver can know when the card is redeemed). So, each gift card RSKEY needs to contain:

*** Client Website RSKEY Implementation

On the client website, you would use something like this (see Asynchronous Loading, below, for

=async= loading and =jQuery= integration):

+BEGIN_SRC Javascript

<script type="text/javascript" src="//cdn.rawgit.com/pjkundert/ezpwd-reed-solomon/v1.8.0/js/ezpwd/rskey.js" srctest="static/js/rskey.js">

+END_SRC

To encode a position of center of the Taj Mahal dome to 3m accuracy (9 position symbols, the default) and 20mm accuracy (12 symbols), and with 3 parity symbols (5-nines confidence):

+BEGIN_SRC Javascript

ezcod_3_12_encode( 27.175036985, 78.042124565 ) // default: 3m (9 symbols) "MMF BBF GC1.2U2" ezcod_3_12_encode( 27.175036985, 78.042124565, 12 ) // 20mm (12 symbols) "MMF BBF GC1 A16.1VD"

+END_SRC

Later, if the EZCOD is entered, errors and erasures are transparently corrected, up to the capacity of the Reed-Solomon encoded parity:

+BEGIN_SRC Javascript

ezcod_3_12_decode( "MMF BBF GC1 A16.1VD" ) Object {confidence: 100, latitude: 27.17503683641553, longitude: 78.04212455637753, accuracy: 0.020401379521588606} ezcod_3_12_decode( "MMF BBF GC1 A16.1" ) // missing some parity Object {confidence: 34, latitude: 27.17503683641553, longitude: 78.04212455637753, accuracy: 0.020401379521588606} ezcod_3_12_decode( "mmf-bbf-Xc1-a16.1vd" ) // An error Object {confidence: 34, latitude: 27.17503683641553, longitude: 78.04212455637753, accuracy: 0.020401379521588606}

+END_SRC

Try it at [[http://ezcod.com][ezcod.com]]. Switch to "EZCOD 3:12", and enter "mmf-bbf-Xc1-a16.1vd" as the EZCOD. You will see a computed accuracy of 20.4mm, and observe that the 'X' (error) is corrected to "G". (The website defaults to 9 digits of precision, so it will re-encode the position, discarding the extra precision.)

*** Asynchronous Loading

Emscripten-generated code must have its run-time initialized before it can be called. If you get Javascript resources normally, they will load asynchronously, but be run in the order you load them so the Emscripten run-time will be safely initialized before your applivation's Javascript runs.

If you load other Javascript libraries like jQuery and your application.js, and you load ezcod.js asynchronously, you must ensure that they do not use any Emscripten libraries (such as ezcod.js) until their run-time initialization is complete. Our Emscripten-based libraries are completely self-contained, so you can use the =

<script type="text/javascript" async src="//cdn.rawgit.com/pjkundert/ezpwd-reed-solomon/v1.8.0/js/ezpwd/ezcod.js" srctest="static/js/ezcod.js"> <script defer onload="jquery_loaded()" src="//ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"> <script defer src="js/app.js">

+END_SRC

** Robustness

All symbols after the initial 9 are Reed-Solomon code symbols. Each R-S symbol can recover one known erasure; every two R-S symbols can detect and correct one other erroneous symbol. If any R-S symbols remains unused in excess of all erasures and errors, then the entire sequence can be confirmed as an R-S "codeword", and its validity is assured, to a certainty probability of: : P(1-1/2^(5*excess))

For example, with one R-S symbol remaining, the probability that the EZCOD is correct is: : P(1-1/2^5) == .969 If two excess R-S symbols exist, then the probability rises to: : P(1-1/2^10) == P(1-1/1024) == 0.999 With 3, it's: : P(1-1/2^15) == P(1-1/32768) == 0.99997

Therefore, if extremely robust positions are required, select an EZCOD with 3 parity symbols, yielding almost 5-nines reliability in transmitting accurate position information -- even if it must be written down, recited or entered by a human.

** Precision

To identify the location of something within +/- 10 feet (3m) is simple: you must specify the Latitude (-90,90) to within 1 part in 4,194,304 (2^22) and Longitude (-180,180) to within 1 part in 8,388,608 (2^23).

The default 10-symbol EZCOD transmits 22 bits of Latitude and 23 bits of Longitude in 9 symbols of position data (the 10th is a parity symbol). The EZCOD API can encode up to 12 symbols of position data (29 bits of Latitude, and 31 bits of Longitude), yielding a maximum precision capability of +/- 20 millimeters.

Since the earth's circumference at the equator is ~40,075,000m, each part in both vertical and horizontal directions is 40,075,000 / 8,388,608 == 4.777m. If you can specify a rectangle having sides of length equal to one part in the vertical and horizontal direction, then at the equator, you have a square that is 4.777m on a side. So, if we know which square some geographical coordinate lies within, it is at most sqrt( 2 * (4.777/2)^2 ) == 3.378m distant from the center of the square.

As you travel north or south, the circumference of the Longitude lines decreases, as absolute Latitude increases. The average radius of the earth is ~6,371,000m. At 53 degrees North, the circumference of the earth along a line of fixed Latitude is: : 2 pi radius cos( Latitude ) : 2 3.1415926534 6,371,000m 0.60181502315 : 24,090,760m

Thus, each part along the vertical axis is still 4.777m, but each horizontal part is: : 24,090,760 / 8,388,608 == 2.872m.

Now the point within each rectangle is at most: : sqrt( (4.777/2)^2 + (2.872/2)^2 ) == 2.787m distant from the center of the rectangle.

Thus, with 9 symbols of position data, the precision of such a Latitude/Longitude encoding is at worst +/- 3.378m at the equator, at best +/-2.389m at the poles, and has an average error of less than +/-3m.

** EZCOD Demo: http://ezcod.com

To see EZCOD in action, visit [[http://ezcod.com][ezcod.com]]. Try entering: : R3U 1JU QUY.0 to see where my daughter Amarissa was born.

You can also access the Console (right click, select Inspect Element, click on "Console"), and enter the above =rskey_=... API example code.

*** EZCOD REST API Demo

A self-hosted website like [[ezcod.com]] with an EZCOD converstion REST API can
be made available on [[http://localhost:8000]] by installing the Python
=ezpwd_reed_solomon= module and running =examples/ezcod_api/server.py=.  On
a Mac, the complete process for this is:
: $ git clone https://github.com/pjkundert/ezpwd-reed-solomon.git
: $ brew install swig
: $ make -C ezpwd-reed-solomon/swig/python install
: $ pip install web.py
: $ cd ezpwd-reed-solomon/examples/ezcod_api
: $ ./server.py --prefix api --bind localhost:8000

| Argument                | Description                                         |
|-------------------------+-----------------------------------------------------|
| =--bind <iface>:<port>= | Bind the web server to the given interface and port |
| =--analytics <id>=      | Issue Google Analytics code using the given ID      |
| =--prefix <path>=       | Host the REST API at the URL: <path>/<version>      |
| =--log <ffile>=         | Put logs into the given file                        |

The REST API URL always includes the version =v#.#.#=; for the above command
the API is hosted at: http://localhost:8000/api/v1.8.0.  To get the details
for an EZCOD, encode a request with the EZCOD as a query option.  For
example, visit this with a web browser:
http://localhost:8000/api/v1.8.0?ezcod=r3u08mpvt.d.  This will return the
decoded data as HTML.  To get it in JSON form, append =.json= to the API
requests path: http://localhost:8000/api/v1.8.0.json?ezcod=r3u08mpvt.d.

This demo application supports GET query options and POST form variables (or
body JSON of the form ={...}= or =[{...},...]= with object properties)
matching:

| Keyword     | Description                                |
|-------------+--------------------------------------------|
| =ezcod=     | An EZCOD 3:10/11/12                        |
| =latlon=    | A "Lat,Lon" pair as a string               |
| =latitude=  | A geographic Latitude in degrees           |
| =longitude= | A geographic Longitude in degrees          |
| =precision= | The number of symbols of geolocation data  |
| =parity=    | The number of desired EZCOD parity symbols |

For example, to get the details of an EZCOD using =wget=:
: $ wget -S --header='Content-Type: application/json'         \
:     -qO - --post-data '{"ezcod":"r3u08mpvt.d", "parity":3}' \
:     http://localhost:8000/api/v1.8.0
: HTTP/1.1 200 OK
:  Cache-Control: no-cache
:  Content-Type: application/json
:  Transfer-Encoding: chunked
:  Date: Wed, 03 May 2017 12:31:38 GMT
:  Server: localhost
: {
:    "accuracy": 0.0,
:    "certainty": 1.0,
:    "confidence": 100,
:    "ezcod": "R3U 08M PVT.JKG",
:    "latitude": 53.55553865432739,
:    "latitude_error": 0.0,
:    "longitude": -113.87387037277222,
:    "longitude_error": 0.0,
:    "precision": 9
: }

You can supply single objects, or a list:
: ... --post-data '[{"ezcod":"r3u08mpvt.d"},{"latlon:" "53.5,-113.8"}'

** =ezpwd_reed_solomon.BCH=

The BCH module provides several BCH codec classes. Presently, all are implemented using Ivan Djelic's implementation, straight from the Linux kernel.

The basic API (provided in =bch_base=) is specified in terms of Galois field order 'm' (from 5-9), and bit-error correction capacity 't'. A BCH codec is provided (if possible) with the specified capacity. The BCH codeword size is 2^m-1, and the number of ECC bits required to achieve bit-error correction capacity 't' is computed. Thus, the resultant =codec='s data payload capacity could be computed using: : 2 ** m - 1 - codec.ecc_bits()

Class Description -------------------------+------------------------------------------------ =bch_base= Obtain a BCH codec by specifying M (Galois order; codeword size == 2^M-1) and T (bit-error correction capacity)
=bch255= Specify a pre-defined codeword size
- =bch_255_1= (2^M-1), and bit-error capacity T
- =bch_255_2=
- =bch_255_3=
- =bch_255_4=
- =bch_255_5=
- =bch_255_6=
- =bch_255_7=
- =bch_255_8=
=BCH255-= Specific BCH codec by fully specifying
- =BCH_255_191_8= the Codeword, Payload and T bit-error capacity
- =BCH_255_199_7=
- =BCH_255_207_6=
- =BCH_255_215_5=
- =BCH_255_223_4=
- =BCH_255_231_3=
- =BCH_255_239_2=
- =BCH_255_247_1=

In the future, the BCH.BCH... version may be re-implemented using C++ templates, to provide optimizations available due to the predetermined fixed size of internal tables. Therefore, it is recommended that the fixed BCH... version be used if possible.

All codecs provide:

| Method | Description | |---------------+------------------------------------------------------------------------| | =t()= | The bit-error correction capacity | | =ecc_bits()= | The number of BCH parity bits | | =ecc_bytes()= | The number of BCH parity bytes | | =encoded()= | Return the BCH encoded data, w/ ECC bytes appended | | =decoded()= | Return the BCH decoded and corrected data (not discarding ECC bytes) |

Here's some sample code illustrating some simple use-cases for the =ezpwd_reed_solomon.BCH= module:

+LATEX: {\footnotesize

+BEGIN_SRC ipython :session :exports both

 from ezpwd_reed_solomon import BCH

 def flip( data, bit ):
   if isinstance( data, str ):
     return data[:bit // 8] + chr( ord( data[bit // 8] ) ^ 1 << bit % 8) + data[bit // 8 + 1:]
   elif isinstance( data, tuple ):
     return data[:bit // 8] + (data[bit // 8] ^ 1 << bit % 8,) + data[bit // 8 + 1:]
   elif isinstance( data, list ):
     return data[:bit // 8] + [data[bit // 8] ^ 1 << bit % 8,] + data[bit // 8 + 1:]
   raise RuntimeError( "Unhandled sequence: %r" % data )

 flexi16 = BCH.bch_base( 8, 2 )

 ori = 'abc'
 enc = flexi16.encoded( ori )

 # Add some bit-errors
 err = flip( enc, 14 )
 err = flip( err, 7 )
 #err = flip( err, 21 ) # over bit-error correction capacity

 # Decode and test
 dec = flexi16.decoded( err )
 assert dec[:3] == ori

 # The error positions can be returned; a special BCH.error_position container type
 # must be used (due to the vagaries of the Swig-generated Python wrapper).
 data = [0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF]
 codeword = flexi16.encoded( data )

 erroneous = list( codeword )
 erroneous[1] ^= 1 << 3 # introduce an error in the 4rd bit of the 2nd byte; 12th bit (bit index 11)
 positions = BCH.error_position()
 corrected = flexi16.decoded( erroneous, positions )
 assert corrected == codeword and len( positions ) == 1 and positions[0] == 11, \
   "'codeword:  %r'\n'erroneous: %r'\n'corrected: %r'\n'positions: %r'" % (
     codeword, erroneous, corrected, list( positions ))
 "'codeword:  %r'\n'erroneous: %r'\n'corrected: %r'\n'positions: %r'" % (
   codeword, erroneous, corrected, list( positions ))

+END_SRC

+RESULTS:

: 'codeword: (1, 35, 69, 103, 137, 171, 205, 239, 203, 187)' : 'erroneous: [1, 43, 69, 103, 137, 171, 205, 239, 203, 187]' : 'corrected: (1, 35, 69, 103, 137, 171, 205, 239, 203, 187)' : 'positions: [11]'

+LATEX: }

** =ezpwd_reed_solomon.ezcod=

Classes are provided to produce three variants of EZCOD by default: 3m (9 symbols) of location accuracy, plus 1, 2 or 3 Reed-Solomon parity symbols. They are named =ezcod_3_10=, =ezcod_3_11= and =ezcod_3_12=, respectively, indicating the default 3m accuracy, and the total number of symbols.

+BEGIN_EXAMPLE

$ python

from ezpwd_reed_solomon import ezcod

+END_EXAMPLE

The API supports the following classes, methods and attributes:

*** =ezcod3{10,11,12}(""|[lat,[lon,...]])=

Creates an <ezcod> instance containing the specified geolocation (defaults
to latitude 0.0, longitude 0.0, '.' separator and chunk 3).  If a string is
supplied, it is decoded (if possible; an Exception is raised if the
provided EZCOD is invalid).
#+BEGIN_EXAMPLE
>>> EZCOD = ezcod.ezcod_3_12( 53.5, -113.8 )
>>> print repr( EZCOD )
<R3U 06B MJ3.JXR (100%)  ==  +53.5000000000, -113.8000000000 +/-   0.00mm>
#+END_EXAMPLE

*** =ezcod3{10,11,12}.encode([precision])=

Encodes the current =ezcod_3_{10,11,12}='s =.latitude= and =.longitude= to
the given number of symbols of precision (default: 9, or 3m).  The accuracy
may be anywhere from 1 to 12 (20mm accuracy) symbols.
#+BEGIN_EXAMPLE
>>> print EZCOD.encode( 12 )
R3U 06B MJ3 EDD.K56
#+END_EXAMPLE

*** =ezcod3{10,11,12}.decode("")=

Any variant of =ezcod_3_{10,11,12}= can decode a valid EZCOD with the
expected amount (or more) parity, so long as it contains a '.' or '!' to
separate the position and R-S parity symbols.

The percentage certainty is returned -- the proportion of expected R-S
parity symbols that remain unused after error detection and correction.  A
value of 0 indicates that the EZCOD's R-S decoding did not fail, but no
parity symbols remain in excess to verify its validity.
#+BEGIN_EXAMPLE
>>> print EZCOD.decode( "r3u 06b mj3 edd.k56" )
100
>>> EZCOD.latitude
53.49999999627471
>>> print EZCOD.decode( "r3u O6b m_3 edd.k56" )
67
>>> print EZCOD.decode( "r3u O6b mX3 edd.k56" )
34
>>> print repr( EZCOD )
<R3U 06B MJ3.JXR ( 34%)  ==  +53.4999999963, -113.8000000734 +/-   19.4mm>
>>>
#+END_EXAMPLE

If any symbols are unknown, replace them with either =_= or =?= to
indicate that they are erasures (and consume only a single symbol of R-S
parity to correct).  Any undetected erroneous symbol corrected by the R-S
codec consumes 2 parity symbols.  A failure to decode (too many errors or
erasures) will raise a =RuntimeError= exception:
#+BEGIN_EXAMPLE
>>> EZCOD.decode( "r3u 06b mj3 __d.__6" )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    ...
RuntimeError: ezpwd::ezcod::decode: Error correction failed; too many erasures
>>> EZCOD.decode( "r3u 06b mj3 eXd.__6" )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    ...
RuntimeError: ezpwd::ezcod::decode: Error correction failed; R-S decode failed
#+END_EXAMPLE

If an EZCOD codec expecting fewer R-S parity symbols (eg. an EZCOD 3:10
codec) is used to decode an EZCOD with more parity (eg. an EZCOD 3:12 code
w/ 3 parity), it will only decode with the "strength" of the shorter codec.

For example, even though an EZCOD 3:12 offers almost 5-nines probability of
correctness (1-1/32^3 == P(.99997)), if you use an EZCOD 3:10 codec to
decode it, it will only use one of the R-S parity symbols, and thus only be
able to correct 1 erasure (instead of 1 error and 1 erasure).  Furthermore,
it will only be able to provide 1-nines probability of correctness (1-1/32
== P(.96875))
#+BEGIN_EXAMPLE
>>> ezcod.ezcod_3_12().decode("r3u 06b mj3 edd.k56")
100
>>> ezcod.ezcod_3_10().decode("r3u 06b mj3 edd.k56")
100
>>> ezcod.ezcod_3_12().decode("r3u 06b mj3 ed_.k56") # even though 3 parity available
67
>>> ezcod.ezcod_3_10().decode("r3u 06b mj3 ed_.k56") # all codec parity capacity used!
0
>>> ezcod.ezcod_3_12().decode("r3u 06b mj3 e__.k56")
34
>>> ezcod.ezcod_3_10().decode("r3u 06b mj3 e__.k56")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    ...
RuntimeError: reed-solomon: number of erasures exceeds capacity (number of roots)
#+END_EXAMPLE

*** =ezcod3{10,11,12}= Instance Attributes

The following attributes are available in each =ezcod_3_{10,11,12}=
instance:

| Attribute         | Type  | Range         | Description                              |
|-------------------+-------+---------------+------------------------------------------|
| =latitude=        | float | [-90,90]      | Geographical position in degrees         |
| =longitude=       | float | [-180,180]    | ''                                       |
| =latitude_error=  | float | [0,inf]       | Axis error in meters                     |
| =longitude_error= | float | [0,inf]       | ''                                       |
| =accuracy=        | float | [0,inf]       | Average of error ellipse axes in meters  |
| =precision=       | int   | [1,12]        | Desired number of location symbols       |
| =confidence=      | int   | [0,100]       | Percentage of parity symbols remaining   |
| =certainty=       | float | [0.0,1.0]     | Certainty that EZCOD decoded was correct |
| =chunk=           | int   | [0,6]         | Spaces every 'chunk' position symbols    |
| =separator=       | char  | '.', '!', ' ' | =SEP_NONE/DEFAULT/DOT/BANG/SPACE=        |
| =space=           | char  | ' ', '-'      | =CHK_NONE/DEFAULT/SPACE/DASH=            |
| =SEP_NONE=        | char  | '\xff'        | Output no position/parity separator      |
| =SEP_DEFAULT=     | char  | '\x00'        | Output no position/parity separator      |
| =SEP_DOT=         | char  | '.' (default) | Output a '.' position/parity separator   |
| =SEP_BANG=        | char  | '!'           | Output a '!' position/parity separator   |
| =SEP_SPACE=       | char  | ' '           | Output a ' ' position/parity separator   |
| =CHK_NONE=        | char  | '\xff'        | Output no space between chunks           |
| =CHK_DEFAULT=     | char  | '\x00'        | Output the default between chunks        |
| =CHK_SPACE=       | char  | ' ' (default) | Output a ' ' space between chunks        |
| =CHK_DASH=        | char  | '-'           | Output a '-' space between chunks        |

It is recommended to use either =SEP_DOT= (default) or =SEP_BANG= (avoid
=SEP_NONE=) for =separator=, so that the EZCOD parser can unambiguously
determine the total EZCOD size, and the number of parity symbols to expect.

** Javascript Library: js/ezpwd/rspwd.js

* Toward (O(Nlog(N)))

Moving toward faster algorithms is somewhat impeded by patent risk. However, there are [[https://news.ycombinator.com/item?id=14290617][some possible approaches]]. Here is an interesting Apache 2 licensed (allowing Commercial use and [[https://www.apache.org/licenses/GPL-compatibility.html][GPLv3 compatibility]]): [[https://github.com/Bulat-Ziganshin/FastECC][FastECC Reed-Solomon encoder by Bulat-Ziganshin]]. It achieves good performance on encoding, but does not implement Erasure or Error detection/correction. The key paper describing the algorithm:

A newer paper is implemented in the BSD licensed (allowing Commercial use and GPLv3 compatibility) [[https://github.com/catid/leopard][Leopard Reed-Solomon en/decoder by Christopher A. Taylor]]: