svenvc / ston

STON - Smalltalk Object Notation - A lightweight text-based, human-readable data interchange format for class-based object-oriented languages like Smalltalk.
MIT License

Serializing & materializing a Bag with duplicate elements fails #22

Rinzwind closed this issue 5 years ago

Rinzwind commented 5 years ago

The following variation on STONWriteReadTests>>#testCollections signals an error:

testCollections2
    | collections |
    collections := STON listClass withAll: {
        Bag withAll: { 1@2. 3@4. 1@2. 3@4. 1@2 } }.
    self serializeAndMaterialize: collections

STONReaderError: At character 37: 'Inconsistent reference resolution'

Tested with the version of STON as included in Pharo 7 (“Pharo-7.0.0+rc1.build.117.sha.65868b2ab36e77ebb1db5f750725205e92999116 (64 Bit)”).

svenvc commented 5 years ago

Yes, I can confirm the problem. Here is an even shorter version:

STON toString: (Bag withAll: { 1@2. 1@2. 1@2 }).
STON fromString: 'Bag [ Point [ 1, 2 ], @2, @2 ]'.

I know the cause, I just have to think a bit about it.
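For readers unfamiliar with the @2 notation in the string above: a STON reference points back to an object by its position in the order in which objects were encountered. The following is a minimal sketch, assuming a Pharo image with STON loaded, that uses a plain list instead of a Bag; the variable name list is only illustrative. The list itself is object 1 and the Point is object 2, so both @2 entries should resolve to the very same Point instance.

```smalltalk
| list |
"Read a plain list in which the Point is written once and referenced twice"
list := STON fromString: '[ Point [ 1, 2 ], @2, @2 ]'.

"All three entries are expected to be the identical object, not just equal ones"
list first == list second.
list first == list last.
```

Evaluating the last two expressions in a playground is a quick way to see whether references were resolved to shared objects rather than copies.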

svenvc commented 5 years ago

https://github.com/svenvc/ston/commit/6a894a483ff3c7d963e914c96338660c9ecad278

should solve your issue.

Thanks again for reporting it.

Rinzwind commented 5 years ago

Thanks! That seems to work. Two observations though:

An alternative solution I had in mind is to postpone adding the elements to the Bag until after reference resolution. When creating the Bag, put its elements in an OrderedCollection that serves as a “placeholder”:

Bag class>>fromSton: stonReader

    | elementsPlaceholder collection |
    elementsPlaceholder := OrderedCollection new.
    collection := self with: elementsPlaceholder.
    stonReader parseListDo: [ :each |
        elementsPlaceholder add: each ].
    ^ collection

Then in #stonProcessSubObjects:, add the elements to the Bag after resolving their references:

Bag>>stonProcessSubObjects: block

    | elementsPlaceholder |
    elementsPlaceholder := self anyOne.
    self removeAll.
    elementsPlaceholder do: [ :element |
        self add: (block value: element) ].

This also makes the #testCollections2 given above pass. As a sanity check, it is also able to read a self-referencing Bag: after b := STON fromString: 'Bag[@1]', the expression b anyOne == b evaluates to true. (I think this wouldn't be the case if the reference resolution were done in #fromSton:, which is what I initially had in mind.) What does not work yet is reading a Bag without any references, because in that case #stonProcessSubObjects: is simply not sent; I'm not sure yet how best to handle that.

svenvc commented 5 years ago

By design and by nature, a serialisation format like STON exposes the internal structure of objects. It is just for a limited number of classes that a reasonably abstract and more implementation independent representation can be chosen.

For this reason, I do not consider cross-platform or inter-version compatibility as real goals. But of course I will try to maintain it.

With that being said, I am considering using another custom representation of Bags, with the value->occurrences dictionary directly exposed, like

Bag { #a->2. #b->3 }

Would that work on GemStone?

BTW, I consider the old representation

Bag [ #a, #a, #b, #b, #b ]

as almost a bug, since it explodes the Bag's size.
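To make the size argument concrete, here is a small sketch using only standard Bag protocol (no STON involved); the variable name bag is illustrative. A list-style representation has to write one entry per element, while a counts-based representation needs only one entry per distinct element.

```smalltalk
| bag |
bag := Bag new.
bag add: #a withOccurrences: 2; add: #b withOccurrences: 3.

"Total number of elements: what a list-style representation writes out"
bag size. "5"

"Number of distinct elements: what a counts-based representation needs"
bag asSet size. "2"

"The per-element count that would be stored alongside each distinct element"
bag occurrencesOf: #b. "3"
```

With a count of 1000 instead of 3, the first number grows to 1002 while the second stays at 2, which is the explosion referred to above.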

Rinzwind commented 5 years ago

OK, having a custom representation for Bags that avoids the size explosion and works across Smalltalk dialects seems like a good idea. (The GemStone implementation of Bag has more and different instance variables than the Pharo one: Bag allInstVarNames returns #(#'_varyingSize' #'_numEntries' #'_indexedPaths' #'_levels' #'dict' #'size'). Here dict is like contents in Pharo and size is the size of the Bag; the other variables are inherited from UnorderedCollection.)

svenvc commented 5 years ago

OK, done.

https://github.com/svenvc/ston/commit/c8cfdda181482b24b97189de83c54cf2cbcbf152

I also updated https://github.com/svenvc/ston/blob/master/ston-spec.md with a subsection about collections and Bag.

There are some more unit tests too.
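As an illustration only (this is not taken from the commit or its tests), a round trip of the Bag from the original report could be checked along these lines, assuming a Pharo image with a STON version that includes the change. The variable names bag and copy are illustrative, and the exact serialized text is deliberately not shown here.

```smalltalk
| bag copy |
"The Bag with duplicate Points from the original report"
bag := Bag withAll: { 1@2. 3@4. 1@2. 3@4. 1@2 }.

"Serialize to a STON string and materialize it again"
copy := STON fromString: (STON toString: bag).

"If the round trip preserves the counts, these should match the original"
copy size = bag size.
(copy occurrencesOf: 1@2) = (bag occurrencesOf: 1@2).
(copy occurrencesOf: 3@4) = (bag occurrencesOf: 3@4).
```

Comparing occurrence counts rather than the serialized string keeps the check independent of whichever textual representation the writer uses.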

Rinzwind commented 5 years ago

Ok, thanks! I think we can consider this issue as closed.