Allow "fan-in" for named types in dig graph

willhug commented 7 years ago

I've been talking to @abhinav about this, In order to support use cases where a dig parameter depends on multiple inputs of the same type we need the ability to aggregate dig types together.

So if we have the following outputs:

  Output1 MyType `name:"output1"`
  Output2 MyType `name:"output2"`

I'd want to be able to grab all the "MyType"s in a single attribute.

Solutions

1: Add another tag to indicate that the type should be a fan-in collection:

type In struct {
  dig.In
  Outputs map[string]MyType `collection:"true"` // tag name TBD
}

The string key in the Outputs variable will be the name of the attributes that were injected into the graph.

2: Add a `dig.Key` (or similar) type for map keys. Any map of these types will be recognized as a fan-in collection.

type In struct {
  dig.In
  Outputs map[dig.Key]MyType
}

The dig.Key would be a dig struct where we could embed information on the type. Currently that would only be the "name", but we could add more information later.

Use cases

YARPC Middleware

Currently we don't have a good answer for middleware to ensure that we can take multiple middleware inputs and effectively construct a middleware graph without forcing the yarpcfx module to have a hard dependency on every middleware library (which gets out of control). With this system we'd be able to grab all middleware that was inserted into the graph. Since this is a map, the yarpcfx module would need to have an ordered list of middleware names it was expecting beforehand (all names should be optional) and it can construct the middleware chain itself. Of course this would only support names that yarpcfx explicitly expects, but that is necessary for the middleware ordering.

YARPC Specs:

The yarpc config options allow inserting custom Transport/PeerList/Updater specs into the yarpc config builder. Having support for this would allow us to detach the yarpc fx module from the implementations of each transport. For instance, a user could inject a "grpcfx" module into the graph which provided a "TransportSpec" for grpc that integrated seamlessly into the yarpc configuration.

HelloGrayson commented 7 years ago

The point of the map is to ensure the end-user doesn't attempt to rely on ordering right?

This feature is normally called service tags in other container implementations.

I think I would rather call them "group", since I've seen "tags" be confused with "named".

type In struct{
  dig.In
  Middleware map[dig.Group]yarpc.Middleware
}

HelloGrayson commented 7 years ago

@willhug re:yarpc needs ordered middleware - wouldnt the collection be unordered intentionally? How would yarpc know which order to apply the middleware?

This brings up the same convo I had with @bayesianmind about this -

Either YARPC explicitly orders the middleware and lets users install before or after OR middleware themselves should have a key, ex authn or stats, and be able to declare which keys they should be installed before or after.

willhug commented 7 years ago

Indeed, putting this info into a map ensures that people won't get confused about ordering.

Totally down for using dig.Group.

The collection is unordered, but each of the middlewares already has a "key" from the name field, yarpc would have an internal understanding of ordering in some form of list:

var middlewareOrder := ['stats', 'authn'...]

willhug commented 7 years ago

To clarify, for the middleware case, this solves the "core" middleware ordering, we've already got some designs about how to integrate custom middleware beyond that.

willhug commented 7 years ago

It's also conceivable that if users did inject custom middleware into this system, they could create explicit ordering for the middleware through a config for the name that yarpcfx could read:

middlewareOrder:
  - stats
  - authn
  - myMiddleware

bayesianmind commented 7 years ago

Asking customers to specify the ordering quickly turns into a nightmare. Incorrect ordering can lead to subtle, hard to debug problems. Users are the least knowledgable about what the order should be and giving them that decision just increases the chances for error and your oncall load from customers with difficult problems arising from ordering. Likewise, YARPC does /not/ want to be in the business of having to maintain a 'master ordering list' for all possible middlewares out there. Instead, the best solution is for middleware authors to tell us their constraints. They know the best about ordering constraints and in the event they forget to put a key constraint you can fix every user of that middleware by pushing an update adding the constraint. No updates needed from users!! Middleware defines their constraints like so: "I am middleware 'authz'. I must come after 'authn' and before 'proceedures','routing'." In YARPC we then order the middlewares in any order we want that obeys all the constraints by using topological sort.

akshayjshah commented 7 years ago

Let's separate discussion of ordered collections (e.g. middleware) to #128 - it's quite a bit more complex. Here, let's focus on order-independent collections, like transport.Procedure and YARPC specs.

Re: naming, Symfony calls these service tags. Dagger and Guice call them multibindings. It doesn't seem like there's any universal naming convention across DI ecosystems.

In Go, I'd prefer to model our collections as slices with struct tags rather than maps. Today, the order of elements in the slice would be explicitly randomized. If we come up with a good implementation for #128, we can add constraints via additional struct tags without introducing a completely different API.

What do we think about this API:

type Params struct {
  dig.In

  AllProcs [][]transport.Procedure `group:"foo"`
}

type Result {
  dig.Out

  SomeProcs []transport.Procedure `group:"foo"`
}

abhinav commented 7 years ago

@akshayjshah The proposed API looks good to me

bayesianmind commented 7 years ago

I think there are two different features this brings up: the first is 'multibindings', the ability to have a set built up by multiple provides.

The second feature is the ability to get all the named instances for a particular type. For that, we'd want to know the named type. Sample usecase: get all myservice.Interface types, check their name, and verify there is clientConfig for them.

@willhug's map API satisfies both asks, but gives up named groups. Need to guarantee unique names to make it work. @akshayjshah's grouped slice API allows different groupings of the same type. I find this API more elegant because of the unique names problem of the first proposal.

HelloGrayson commented 7 years ago

I like what @akshayjshah put up.

Now, after thinking on it more, I actually think the "tag" terminology fits better:

Procs []transport.Procedure `tag:"yarpc.procedures`

So here we're tagging the procedure as yarpc.procedures (note the prefix and dot separator).

This feels good in docs to: the yarpc module will collect all types tagged with yarpc.procedures.

bayesianmind commented 7 years ago

+1 tag is nice, I like that

akshayjshah commented 7 years ago

Sure, tag sounds good.

@breerly are you proposing yarpc.procedures as a convention, or are you proposing some logic that parses tag names and splits on dots? Just tagging the procedures with tag:"yarpcfx" seems sufficient to me, since the "procedures" bit is already part of the type.

HelloGrayson commented 7 years ago

It's a convention - the string is in a global space, so the dot notation is a good way to namespace.

These things tend to explode, so its good to have the convention up-front: https://symfony.com/doc/current/reference/dic_tags.html

You don't actually do anything with the dot (like parse it for example), it's just a string on both ends.

HelloGrayson commented 7 years ago

re: yarpc vs yarpcfx, i don't have a preference.

ghost commented 7 years ago

Can we squeeze everything out of name first? It would be really nice to have something like this:

type struct Params {
  dig.In

  NamedSlice    []T `name:"A,B"
  VariadicSlice []T `name:"A,..."
  NeatSlice     []T `name:"A"
}

where

NamedSlice is going to have only instances of T with the name A and B.
VariadicSlice is going to have all instances of T and A is going to be first.
NeatSlice is going to have instance of []T with the name A We can use _ or default tags for unnamed instances. Very similar procedure could be done for dig.Out, but we can skip it first.

akshayjshah commented 7 years ago

It's a convention - the string is in a global space, so the dot notation is a good way to namespace.

I'm not familiar with the internals of Symfony's container, but I'm having trouble understanding why we'd want to implement tags this way in a strongly-typed language. What are the benefits of a global namespace? Just like we currently do for names, I'd want us to ensure that only the two-tuple of (type, tag) must be unique.

akshayjshah commented 7 years ago

@alsamylkin Interesting, I like the idea of getting as much out of name as possible! I thought about this a bit before suggesting a new tag, and ran into some issues I couldn't think of a nice way around - maybe you've got some better ideas.

How should we handle forward-compatibility for a collection with only one element? If I only care about the io.Writer called foo today, but I might want more writers in my collection later, how can I express this in a semver-friendly way?

type Params struct {
  dig.In

 Writers []io.Writer `name:"foo"`
}

doesn't clearly indicate whether I'm looking for slice foo or instance foo.

Does handling the edge cases here require treating name differently in params and result structs? One really nice feature of the current tags are that their meaning is the same in both cases.

ghost commented 7 years ago

@akshayjshah I was thinking about about slices to have comas every time they are combined out smaller pieces. So to get a slice of one element we'll write something like that:

Back  []T `name:"once,"`
Front []T `name:",once"`

Where Front and Back are going to have only one element. Using only single comma could be too subtle and lead to mistakes. Type dependency should protect us: two elements T and []T with the same name should be a rare case. Note, that this is only needed for input params for constructors/invokes, not for the output of constructors.

We also can have expansion via ... after a name:

type Params struct {
  dig.In

  // Slice that using expansion of an instance of []T named A, then instance of T named B then everything else.
  Expand        []T `name:"A...,B,..." 

  // Combine an unnamed instance of []T in the graph with B
  ExpandDefault []T `name:"_...,B"

  // Combine a default instance of T in the first place with expansion of a default instance of []T
  DefaultJumbo  []T `name:"_,_..."    
}

Another possibility is to use a golang syntax for the slices:

type Params struct{
  dig.In

  Named    []T `name:"[]{A,B}"    // Slice with 2 elements of type T in order A, B
  Variadic []T `name:"[]{A,...}"  // Slice with all elements with type T, where A is first and then the rest
  Neat     []T `name:"A"          // Slice type with the name A(backward compatible) 
  Single   []T `name:"[]{A}"      // Slice of a single element T named A
  Default  []T `name:"[]{_}"      // Slice of a single unnamed/default element T
  Expand   []T `name:"[]{A...,B,...}" // Slice that using expansion of an instance of []T named A, then instance of T named B then everything else.
}

This resembles go slices and is bullet proof from mistyping/misunderstanding of field tags. We can go even further and combine all elements in a slice, without writing any line of code for a dig.Out struct:

type struct Result {
  dig.Out

   // Slice with the name J that consist of element B and everything else..
  Jumbo []T `name:"J,[]{B,...}"`
}

akshayjshah commented 7 years ago

Interesting. To me, creating a mini-language like this introduces an undesirable level of complexity; I'd much rather just use a separate tag.

Let's see what others think.

HelloGrayson commented 7 years ago

I think I agree - tags seems within reach and will work for most cases; the mini-lang might explode in complexity.

ghost commented 7 years ago

Oh, the minilang is needed only if comas are not enough :)

HelloGrayson commented 7 years ago

Parsing is a slippery slope - I'd be happy to introduce a new concept to avoid that :D

uber-go / dig