dhhoang / IonDotnet

Amazon Ion ( http://amzn.github.io/ion-docs/ ) library for dotnet
MIT License
12 stars 4 forks source link

Ion writer flush/finish behavior #12

Open dhhoang opened 6 years ago

dhhoang commented 6 years ago

Related to #6 . We need to implement IonWriter behaviors regarding flush() and finish() to adhere to Ion specs.
I want to make sure we're on the same page so please correct me if I'm wrong.
The way I understand it now flush() is supposed to make the data available to the reader end while finish() will mark the end of a datagram/message. Local symbol tables are to survive between flush()es but not between finish()es. If the writer writes more data after calling finish(), the local symbol tables are essentially erased and IVM is written to mark a new datagram. If that's the case we'd need to test this specific behavior (survival of local tables between flush()es and finish()es. We would also need to make sure that finish() actually does call flush() to flush pending data (if any). Furthermore, this behavior should be consistent between different writers (text and binary).

@rickardp

rickardp commented 6 years ago

This is also how I understand the spec (I assume you mean IonWriter, as IonReader should just handle either). I think the current implementation of Finish() is not correct (even without the local symbol table append feature). The currently failing Boolean test is an example of this. Should fixing this be handled in PR #6 as well?

dhhoang commented 6 years ago

@rickardp Yes that's my typing mistake, it should be IonWriter.

dhhoang commented 6 years ago

@rickardp If you look at my latest commits finish() now calls flush() in addition to marking the end of the datagram (which I guess is the way it's supposed to be). flush() will now cause a new local symbol table to be appended if neccessary (that is, if the writing encounters some previously unknown symbols). I guess we could do it your way with a new SymbolState or use a flushed_max_id field like ion-c.

rickardp commented 6 years ago

Great work! It now looks like it follows spec. As for the SymbolState vs max flushed ID, I did not need the info on which symbol was flushed, that’s why I introduced the new state. I would go for whatever makes the code most readable if they are equivalent. I think the C writer needs this to know where to flush from, but this info is already in the inner writer in our case.