svenvc / ston

STON - Smalltalk Object Notation - A lightweight text-based, human-readable data interchange format for class-based object-oriented languages like Smalltalk.
MIT License

Symbols containing non-byte characters are not correctly serialized #25

Closed blairmcg closed 5 years ago

blairmcg commented 5 years ago

If a Symbol contains characters outside the byte character set (code points above 255), STON does not serialize it correctly.

Some examples in Pharo 7:

ston := String streamContents: [:s | STON put: #'€' onStream: s]. "'#€', should be '#''€'''"
STON fromString: ston. "STONReaderError: At character 1: unexpected input"

Or with a character from outside the Basic Multilingual Plane:

ston := String streamContents: [:s | STON put: (Character value: 16r1F42C) asSymbol onStream: s]. "'#🐬'"
STON fromString: ston.

The issue is in STONWriter>>isSimpleSymbol:, which effectively ignores characters with code points > 255:

STONWriter new isSimpleSymbol: (Character value: 16r1F42C) asSymbol "true"
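
For illustration, here is a minimal sketch of a stricter check that treats any non-ASCII character as "not simple", so such symbols would be written in quoted form. It is a hypothetical standalone block, not the actual STONWriter code or the eventual fix, and it assumes that simple symbols are limited to ASCII letters, digits, and the characters - _ . / (the name isSimple is made up for this example):

```smalltalk
"Hypothetical standalone check, not the actual STONWriter>>isSimpleSymbol: code.
 Assumption: a simple symbol holds only ASCII letters, digits and - _ . /"
| isSimple |
isSimple := [ :symbol |
	symbol notEmpty and: [
		symbol allSatisfy: [ :each |
			"Reject anything outside ASCII first, then test the allowed set"
			each asInteger < 128 and: [
				each isAlphaNumeric or: [ '-_./' includes: each ] ] ] ] ].

isSimple value: #foo.                                    "true"
isSimple value: #'€'.                                    "false"
isSimple value: (Character value: 16r1F42C) asSymbol.    "false"
```

With a check along these lines, both symbols from the examples above would take the quoted path instead of being written bare after the #.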

svenvc commented 5 years ago

Hi,

I am only now seeing this issue, sorry. Thanks for the excellent bug report.

I will fix this ASAP.

Sven

svenvc commented 5 years ago

https://github.com/svenvc/ston/commit/049da2fad16df1a470da5fd8c4f748562250b7f5 should fix this issue.

Thanks again for the report.