Closed justing-bq closed 4 years ago
When will an annotation ever be nil? If we don’t know the text form of a binary-encoded annotation, would we be better off returning either ${id} or an error? Nil seems strange (as opposed to field name, where it at least makes sense as “this value has no field name because it’s not in a struct”)
When will an annotation ever be nil?
When we read an undefined or unknown symbol ID. It's a poor substitute for returning a highly detailed error value that indicates which annotation couldn't be resolved and why.
Change reader.go::Annotations() return type from []string to []*string
This is one option, but I'm not sure it's the right one. I'm starting to lean more towards not offering string
APIs for symbol resolution use cases (field names, annotations, symbols) at all. We run into this leaky abstraction every time. Returning a SymbolToken
everywhere may be slightly more cumbersome, but at least it's not lossy.
At the risk of opening a whole can of worms... :)
I think this ultimately comes down to the fact that an Ion parser has two reasonably-distinct classes of users. On one hand, you've got "end users" who are reading Ion data with a more-or-less-known-at-build-time schema for use in their program. They pretty much never hit these kinds of edge cases, and just want an ergonomic way to access the data. On the other hand you've got "middleware" that's handling data with little to no knowledge of the schema, and wants to be able to losslessly process and pass along any valid-per-the-spec Ion data no matter how weird it is.
A Reader API that uses string
s to represent symbols, and treats things like empty-strings and invalid symbol IDs as edge cases, optimizes for the former type of user. An API that uses SymbolToken
s and *string
s optimizes for the latter. Neither approach is objectively better for everyone.
The key question for me, then, is do we have "end users" who are going to want to use the Reader/Writer APIs directly, or will they be fine going with the encoding/json
-style Marshal/Unmarshal APIs? I'm coming at this from the perspective of an "end user" with previous experience using the reader/writer APIs in other languages, so I was trying to keep them nice to use from that perspective; it's entirely possible that's a mistake though, and they'll only ever be directly used by folks in the latter group?
(Which I guess is a long way of saying yeah I could be +1 for switching the Reader/Writer APIs to use SymbolToken
everywhere. :))
My expectation is that the vast majority of users will be using marshall
/unmarshall
. I'd prefer for the Reader
/Writer
to have thorough and consistent contracts even if they're a bit unwieldy; anyone who wants a string
-centric Reader
that represents corner cases with ""
could easily make a layer on top of the more tedious official Reader
. @therapon can add his thoughts here too. We'll be discussing this later today.
I agree with the statement that there are likely 2 categories of users with one category--the one using marshall/unmarshall--being the majority of cases. With the second category that will need finer control and lower-level integration there is a cost associated with the lower level API. I am also of the opinion that the contracts should be clear and guide towards correct usage. While ergonomics are important I am willing to sacrifice a little convenience for correctness in the lower level APIs.
On that note, I think making clear distinctions of Ion data (null, empty etc.) is important. The SymbolToken
abstraction I think is our best bet to start with.
Here is the proposed SymbolToken
abstraction:
On the Reader
API:
Annotations() []string -> Annotations() []*SymbolToken
FieldName() *string -> FieldName() (*SymbolToken, error)
FieldNameSymbol() (*SymbolToken, error) -> Remove
SymbolValue() (*SymbolToken, error) -> No change
StringValue() -> Return error if type is Symbol
If symbol
is null, return nil, nil
If symbol
was unresolved, return nil, err
If symbol
was resolved, return SymbolToken, nil
Am I missing anything?
If symbol is null, return nil, nil If symbol was unresolved, return nil, err If symbol was resolved, return SymbolToken, nil
Am I missing anything?
I think we need to accommodate the case where the symbol was legal (not > MaxId
) but we couldn't resolve it due to a missing table or an explicit definition of null
. In that case you'd return SymbolToken, nil
, but the SymbolToken
would have Text=nil
.
Annotations() []string -> Annotations() []*SymbolToken
A value cannot have null.symbol
annotations, so what would a nil
appearing mid-slice indicate here? What do we think about returning ([]SymbolToken, error)
instead? If any of the annotations are > MaxId
, err
would be set.
FieldName() *string -> FieldName() (*SymbolToken, error)
FieldNameSymbol() (*SymbolToken, error) -> Remove
SymbolValue() (*SymbolToken, error) -> No change
StringValue() -> Return error if type is Symbol
These look right to me. @therapon?
Having #158 merged, I believe this is not relevant anymore and can be closed.
Change reader.go::Annotations() return type from []string to []*string so we can distinguish between "" and nil.
Discussed in https://github.com/amzn/ion-go/pull/146.