I have found a problem when attempting to store data that contains non-ASCII characters, for example "Décor" with its accented é.
The issue seems to be rooted in a mismatch in how the string length is calculated when passing from managed C# to unmanaged LevelDB.
The call to get the expected length uses the UTF-8 encoding, but the actual marshaled call passes the string in the current ANSI code page. In the ANSI code page (for a US system) the special é takes only a single byte, but in UTF-8 it takes 2 bytes.
The end result is that the record is saved with trailing garbage: one extra character for each special character in the input.
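
To make the byte arithmetic concrete, here is a rough sketch in Python (chosen only because the arithmetic is language-neutral; this is not the wrapper's code). It assumes the ANSI code page on a US system is Windows-1252, and it uses a made-up filler value to stand in for whatever bytes happen to follow the marshaled string in memory. The helper name `simulate_store` is hypothetical.

```python
def simulate_store(text, trailing=b"\x00\xcd\xcd\xcd"):
    """Model a store call whose length comes from UTF-8 but whose bytes are ANSI.

    `trailing` is an assumed stand-in for the memory after the marshaled
    string (a NUL terminator plus arbitrary filler); real contents will vary.
    """
    declared_length = len(text.encode("utf-8"))   # length the caller computes
    ansi_bytes = text.encode("cp1252")            # bytes actually marshaled
    native_memory = ansi_bytes + trailing
    stored = native_memory[:declared_length]      # native side trusts the length
    extra = declared_length - len(ansi_bytes)     # bytes read past the string
    return declared_length, len(ansi_bytes), extra, stored


# "D\u00e9cor" is "Décor": one accented character, so one extra byte is stored.
declared, actual, extra, stored = simulate_store("D\u00e9cor")
print(declared, actual, extra, stored)
```

In this model the over-read grows by one byte per accented character, which matches the symptom described above. Computing the length and the bytes from the same encoding (for instance, encoding to UTF-8 once and passing that byte array with its own length) would remove the mismatch, though I have not verified how the wrapper should best do this.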