Handle text encodings

The CA property defines the encoding used for SimpleText and Text type values.

The library currently has no special handling for text encodings: it simply reads the characters it receives from SGFC into std::string. This must be thought through. Some questions that need to be answered:

Does the library need to deal with character encodings at all?
Does it make sense to convert everything to a single well-defined encoding, preferably UTF-8?
How does the library deal with multi-byte character encodings such as UTF-16?
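
To make the second question more concrete: for single-byte encodings the conversion to UTF-8 is simple enough to do by hand. The following is only a minimal sketch, assuming the CA property says ISO-8859-1 (the default according to the SGF specification). The function name latin1ToUtf8 is hypothetical and not part of the library. Any other encoding would need a real conversion library such as the ones listed further down.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of libsgfcplusplus: converts a byte string
// that is known to be ISO-8859-1 (Latin-1) into UTF-8.
std::string latin1ToUtf8(const std::string& latin1)
{
  std::string utf8;
  utf8.reserve(latin1.size());

  for (char c : latin1)
  {
    const unsigned char byte = static_cast<unsigned char>(c);
    if (byte < 0x80)
    {
      // ASCII range: identical in Latin-1 and UTF-8.
      utf8.push_back(static_cast<char>(byte));
    }
    else
    {
      // Latin-1 bytes 0x80..0xFF map to code points U+0080..U+00FF,
      // which UTF-8 encodes as two bytes: 110xxxxx 10xxxxxx.
      utf8.push_back(static_cast<char>(0xC0 | (byte >> 6)));
      utf8.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
    }
  }

  // Every input byte produces one or two output bytes.
  assert(utf8.size() >= latin1.size());
  return utf8;
}
```

For example, the Latin-1 byte 0xE9 ("é") becomes the two UTF-8 bytes 0xC3 0xA9, while plain ASCII text passes through unchanged.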

The status quo is this:

The library cannot handle multi-byte character encodings such as UTF-16; it treats everything as 1-byte characters.
The library does not interpret property values, except for whitespace and backslashes. SGFC does slightly more, as it also scans character sequences for ":" characters.
The library client is responsible for taking the c_str() of property values and interpreting it according to the encoding defined by the CA property.
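
To illustrate why the status quo breaks down for UTF-16, here is a small sketch with hypothetical helper functions (none of them exist in the library). It assumes the text is stored as little-endian UTF-16 bytes. Two problems show up: ASCII characters carry a 0x00 byte, so anything that relies on c_str() being NUL-terminated sees a truncated string, and some code units contain bytes that look like "]" or "\" to a byte-wise scanner.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper for illustration only: serializes UTF-16 code units
// as little-endian bytes.
std::string toUtf16LeBytes(const std::u16string& text)
{
  std::string bytes;
  for (char16_t unit : text)
  {
    bytes.push_back(static_cast<char>(unit & 0xFF));         // low byte first
    bytes.push_back(static_cast<char>((unit >> 8) & 0xFF));  // then high byte
  }
  return bytes;
}

// Length a client sees when it treats c_str() as a NUL-terminated string.
std::size_t lengthSeenThroughCStr(const std::string& bytes)
{
  const std::size_t length = std::strlen(bytes.c_str());
  // strlen stops at the first NUL, so it can never exceed the real size.
  assert(length <= bytes.size());
  return length;
}

// True if a byte-wise scanner would find what looks like the SGF
// property value terminator "]" or the escape character "\".
bool containsSgfSpecialByte(const std::string& bytes)
{
  return bytes.find(']') != std::string::npos
      || bytes.find('\\') != std::string::npos;
}
```

With these helpers, toUtf16LeBytes(u"Go") yields 4 bytes, but lengthSeenThroughCStr reports only 1 because the second byte is 0x00. The single code unit U+4E5D serializes to the bytes 0x5D 0x4E, and 0x5D is the same byte as "]", so containsSgfSpecialByte returns true even though the text contains no bracket.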

Libraries that can deal with encoded strings:

ICU: http://site.icu-project.org/
Boost.Locale: http://www.boost.org/doc/libs/1_53_0/libs/locale/doc/html/index.html

Other references:

https://www.cprogramming.com/tutorial/unicode.html

herzbube / libsgfcplusplus

Handle text encodings #20