Closed warpfork closed 5 years ago
By the same logic, MapBuilder.Insert(k, v ipld.Node) error
also should have a variant taking primitive args for convenience: MapBuilder.Insert2(k string, v ipld.Node) error
. Similar naming questions apply there, then.
Is it really necessary to have TraverseBySegment(segment IntOrString)
? Or conversely, doesn't that one serve the same purpose as TraverseByString
and TraverseByIndex
combined? Seems like redundancy, or at the least, an API documentation smell.
Can you explain TraverseByNode
and Traverse
a bit more? They seem like different beasts to the other 3, perhaps inverting the control somehow? Maybe they deserve a separate name?
Re "traverse", it's the word I've been using for my single-node traversal algorithms, it makes sense to me. Although I do tend to have things called "traversals" that I operate on so that might be a difference for you - a traversal is made up of individual steps, each executed separately. Unless you're interested on coming up with a concrete Traversal
object that performs the traversal by interactive steps, then I don't see a major problem with the language here, it's just that your multi-step traversal operations are made up of of atomised traversals.
TraverseBySegment
does seem awfully redundant.
The main use of it is that Selectors return PathSegment
(aka IntOrString
) from some of their methods.
If we didn't have a PathSegment
type over there, then Selectors would have a bunch of very redundant looking methods instead. So I'm afraid it's "six of one, half a dozen of the other" -- we have something that looks a bit ugly in one place or the other and we just get to choose which. Given such a choice, I figure I'd rather put it in the place where it offers more reuse: which means making PathSegment
available in the main interfaces, rather than sequestering it off in the traversal/selector packages.
TraverseByNode
and Traverse
are different beasts but they aren't inversions at least :)
Imagine you have some types like:
type SnazzyKey struct {
frob String
noz String
} representation stringjoin {
join ":"
}
type SomeValue struct {
# ...
}
type SnazzyMap {SnazzyKey:SomeValue}
If someone uses TraverseByString
on a SnazzyMap
, the implementation has to parse that string into a SnazzyKey
. That's fine (it's certainly well-defined!) but if you already have a SnazzyKey
instance in your hands, converting it into a string just to feed it into TraverseByString
, which just parses it again... well!
Instead, by handing your SnazzyKey
(which of course implements Node
) to a TraverseByNode
method, we can do a lot less unnecessary work. The TraverseByNode
method just does a quick cast to see that it's handling a typed.Node
(cheap), asks it for its schema.Type
(very cheap), checks that its equal to its own expected key type (very cheap, literally a pointer equality check), and then goes "yep" and does the lookup.
If working with codegenerated implementations, then Traverse
with the $GeneratedTypeNameHere
-- it would come out func (*SnazzyMap) Traverse(SnazzyKey) SomeValue
in this example -- even gets to skip some of the casts and schema.Type
equality checks, since its effectively hoisted those into being the Go compiler's problem, so it's faster and cheaper yet. (Also, since it's codegen, it can return the concrete generated type too, instead of being limited by interfaces which must return a bland ipld.Node
-- which makes using it in application logic significantly more friendly to the programmer.)
re: the naming prefix overall: I wonder if just plain Get
wouldn't be better.
Terse is good. It's a common operation.
'Get' is familiar and common in many communities when describing handling maps.
This seems to look okay:
GetByString(string)
GetByIndex(int)
GetBySegment(PathSegment)
GetByNode(Node)
Get(ConcreteGeneratedType)
Alternatives to *By*
might be *Via*
, or *With*
:
GetViaString(string)
GetViaIndex(int)
GetViaSegment(PathSegment)
GetViaNode(Node)
Get(ConcreteGeneratedType)
:thumbsup: it's reminiscent of a standard map
operation in most languages which is basically what it is in an abstract sense. Also I'd prefer By
. I suspect that a newcomer would grok Get
more than Traverse
.
At the moment TraverseIndex
and TraverseString
exist but these others don't, they are just TODO items or am I looking in the wrong place?
Indeed. TODO items.
I'll probably act on this soon -- in the next couple of days. These are pretty foundational shakeups, and although fortunately they'll all be "sed refactors" in complexity, they nonetheless show up everywhere, so it's desirable to get them done before there's too many users.
In particular, it will be really good to get these done before graphsync and its transitive users start using these interfaces heavily... however, (ironically?) I'm waiting for some of the other Selectors branches to land before going ahead one way or the other.
LookupBy*
(or even just Lookup*
) might be another alternative to GetBy*
to consider.
(I started actually trying these, and when I started seeing examples... n.GetByIndex(0).GetByIndex(0).AsString()
really stutters, in my mind. Whereas -- though I can't put my finger on why -- n.LookupIndex(0).LookupIndex(0).AsString()
seems to read just fine.)
Fully worked:
LookupStr(string) (Node, error)
LookupIndex(int) (Node, error)
LookupSegment(PathSegment) (Node, error)
LookupNode(Node) (Node, error)
Lookup($GeneratedTypeKey) ($GeneratedTypeValue, error)
There's one more awkward bit in this set, though. LookupString
and LookupIndex
and LookupSegment
all sound fine; but LookupNode
sounds odd. I think this is because "Node" is also of course the return type, which adds some awkwardness to the vibe. LookupByNode
doesn't have the same weird vibe, but a combo-breaker of the pattern would be also bad. So, a different idea entirely for name of LookupNode
might make this set nicer. (LookupTyped
? Not entirely accurate, because it certainly wouldn't require typed.Node
as a param type; but also, it's the reason it exists and one would probably prefer LookupString
in any other usecase. Other ideas, maybe? Perhaps changing tack and giving the generated code a longer name, and saving Lookup
for the method working on Node?)
Okay, one more take:
LookupString(string) (Node, error)
LookupIndex(int) (Node, error)
LookupSegment(PathSegment) (Node, error)
Lookup(Node) (Node, error)
Get($GeneratedTypeKey) $GeneratedTypeValue
This solves the weird vibe by replacing LookupNode
with just plain Lookup
(abandoning the earlier thought that "we should reserve the shortest name for the codegen'd path, which is where all the happiest UX will happen anyway")...
And then it also solves the codegen'd name question by taking it a different direction. Notice the return tuple also is different for that one: it doesn't need to return a distinct error value. There's no need for errors like ErrInvalidKind
on a codegen'd method like that; and we can always represent "not found" unambiguously (either as a native nil (because even if the map value has the nullable
modifier, those return a null Node, not a native nil), or as ipld.Undefined
(though I learn towards the earlier option; Undef
seems best if reserved for describing struct fields)). So, this structural difference in the method signature gives us justification for a different name pattern entirely (like jumping over to Get
or something). I've put Get
here; but codegen details are questions for "later" anyway, so we can schedule this for a revisit when it comes up again.
Retracted LookupStr
shortening in favor of LookupString
. It seems un-gopherly to try to shave three letters in an exported interface.
I think I'm actually completely satisfied with this one.
(Some of the "By" phrasing might still show up in other places -- e.g. MapBuilder.InsertByString
-- since in those cases, well, you can see how eliding the "By" might confuse things. That's a shortcut method we haven't actually added yet, though.)
Thanks to @dmarby for some conversation outside of the githubs which helped refine this!
(... much later ...)
One more reneg on this has now occurred. :see_no_evil:
Eliding the "By" from the name turned out to be confusing, and I now regard that as having been a mistake. Also, using "Get" for codegen and "Lookup" for everything else turned out to be confusing. Both mistakes are being rectified.
The final result is:
LookupByString(string) (Node, error)
LookupByIndex(int) (Node, error)
LookupBySegment(PathSegment) (Node, error)
LookupByNode(Node) (Node, error)
Lookup($GeneratedTypeKey) $GeneratedTypeValue
These will become the exported interfaces in v0.5
, which should be tagged presently.
(Happily, all the changes are just renames. Semantics stay the same!)
The
ipld.Node
interface currently has a methodTraverseField(key string) (ipld.Node, error)
. This has been sufficient to get going, but there are several issues with this: it's verbose; it always takes a bare string (which is sometimes not going to be ideal); and the name "field" is usually only used in the context of structs, but we're also using it here in describing maps, which is thus awkward. Should we rename this? Furthermore, should we have several methods here?It would seem we want all of these methods at some point, and since Go eschews parametric polymorphism, they all have to have distinct names:
TraverseByString(fieldname string)
-- for use with mapsTraverseByIndex(idx int)
-- for use with listsTraverseBySegment(segment IntOrString)
-- for use when you've got pathsegments (selectors do this rather than the above two)TraverseByNode(node Node)
-- for generic use that's not segment-powered (retainstyped.Node
reference if applicable: preferable if you've got a struct-represented-as-string as a map key, or anything that has aValidate()
method on it that would need to be reapplied if we forgot about associated type info).Traverse(x ConcreteGeneratedType)
-- for use when codegen is in flight (similar advantages to TraverseByNode).If that above fivesome goes through, the first four would on the Node interface now (contrast currently only two) (the fifth is only for codegen).
Questions:
I've been thinking about this (but pushing it off) for a while now, but if something is to be done here, it should probably be a clean break soon, before we start getting too many additional things depending on this API.