amazon-ion / ion-go

A Go implementation of Amazon Ion.
https://amazon-ion.github.io/ion-docs/
Apache License 2.0
175 stars 31 forks source link

Remove the Digest API and Replace with a Loader Interface #22

Open almann opened 4 years ago

almann commented 4 years ago

We currently have a Digest type that is similar to Datagram in ion-java. In other implementations, such as ion-python, we use a native container type to represent the stream of Ion values.

I am inclined to remove this type as it does not seem to provide value on top of []Value and the name is particularly confusing in the context of Ion hash.

I do think we should have some kind of Loader interface to encapsulate this once we have a Scanner interface with #12.

E.g.

type Loader interface {
  Load(s Scanner) (Value, error)
}

The Load method would return an error if their was some kind of parsing error or an end of stream was encountered.

It would have to be further specified if the Load method implies iterating to the next value, or if it operates on the current value pointed to by the Scanner. My recommendation would be to make it operate on the current value for consistency with other implementations.

To make this API more ergonomic, we could consider providing functions that make Loader easier to use.

E.g.

type ValueHandler func(v Value, e error) error

// StreamTo is a functional for-each, if the handler returns a non-nil error, it stops iteration
// This returns the first error from loading/scanning or the handler.
func StreamTo(l Loader, s Scanner, ValueHandler) error {...}

// LoadSlice loads a number of Values from a Loader/Scanner
// Will not exceed the length of p and returns the length of items loaded
func LoadSlice(l Loader, s Scanner, p []Value) (int32, error) {...}

Alternatively, we could do something with channels, but the above could be reasonably sufficient.

mrutter-amzn commented 4 years ago

Digest was primarily a placeholder for a Datagram-alike. I would suggest making something like it that is a type alias for []Value to provide a little bit of future flexibility.

Why does ValueHandler take an error? It's for handling values, not for handling lexing/parsing failures. If it was a StreamHandler that might make more sense.

I don't like having both Loader and Scanner passed in. StreamTo and LoadSlice shouldn't have to care about the Scanner, they should worry about having the Loader give them the next Value and dealing with errors. Similar to not caring about what is backing an io.Reader, I don't want to have to care about what is backing the Loader.

It makes sense for interface methods such as Read to take the slice that it is to read into so that the same buffer of many elements can be re-used repeatedly. Is it expected that a []Value would have many elements and that LoadSlice would be called many times to hydrate it?

As an aside, it's good to avoid using l as a variable name. Or is that 1? Or is it I?