solo-io / gloo

The Feature-rich, Kubernetes-native, Next-Generation API Gateway Built on Envoy
https://docs.solo.io/
Apache License 2.0
4.07k stars 437 forks source link

Stale Reads Overwrite across namespaces in Consul #6343

Open tmartinsolo opened 2 years ago

tmartinsolo commented 2 years ago

Version

No response

Is your feature request related to a problem? Please describe.

https://github.com/solo-io/gloo/blob/4f133dd2be0875463754fecc84c1eced7d4202fd/projects/gloo/pkg/plugins/consul/plugin.go#L77

This should look like (psuedo-code kinda, i know options doesn't exist) instances, _, err := p.client.Service(spec.GetServiceName(), "", &api.QueryOptions{Datacenter: dc, RequireConsistent: options.RequireConsistent, UseCache: options.UseCache, AllowStale, options.AllowStale})

Some of the options in that instruct link are not very useful because the client has sensible overrides (e.g.: Token, Datacenter) and some are almost never used and should only be overwritten on specific endpoints for very good reasons (like WaitHash/Near).

Describe the solution you'd like

(https://solo-io-corp.slack.com/archives/C031NRKP1UL/p1650045977749209) https://github.com/hashicorp/consul/blob/main/api/api.go#L101 if you wanna get fancy for enterprise folks, supporting Namespace would be a big win too. api.go type QueryOptions struct {

plugin.go instances, _, err := p.client.Service(spec.GetServiceName(), "", &api.QueryOptions{Datacenter: dc, RequireConsistent: true})

would this belong under consulUpstreamDiscoverySettings maybe - like u.consulUpstreamDiscoverySettings.RequireConsistent?

Anywhere that calls like p.client.Health or p.client.Services will want to pass the same api.QueryOptions block. Typically when doing this you instantiate the api.QueryOptions struct and just pass it around.

Some of the options in that struct linked above are not very useful because the client has sensible overrides (e.g.: Token, Datacenter) and some are almost never used and should only be overwritten on specific endpoints for very good reasons (like WaitHash/Near)

if we could include hitting the .Services() api and then allowing filtering by tag (v1/catalog/services) "foo": [ "app_metrics", "product=foo", "traefik.frontend.entryPoints=http", "traefik.frontend.passHostHeader=true", "traefik.frontend.rule=Host:asdf", "traefik.enable=true", "traefik.tags=http", "traefik.backend.loadbalancer.method=wrr", "traefik.backend.buffering.retryExpression=IsNetworkError() && Attempts() <= 2" ], "bar": [ "traefik.frontend.entryPoints=http", "traefik.frontend.passHostHeader=true", "traefik.frontend.rule=Host:asdf", "traefik.backend.loadbalancer.method=wrr", "traefik.backend.buffering.retryExpression=IsNetworkError() && Attempts() <= 2", "app_metrics", "traefik.enable=true", "traefik.tags=http", "product=bar" ], "baz": [ "asdf", "st" ], this is how some customers set traefik rules today, and having something similar to being able to filter using something similar to traefik.enable=true would be useful.

this is from curl https://consul/v1/catalog/services

https://github.com/Roblox/traefik/blob/v1.7_roblox/provider/consulcatalog/consul_catalog.go#L504-L517 this is the function they use at the moment

https://github.com/Roblox/traefik/blob/v1.7_roblox/provider/consulcatalog/consul_catalog.go#L217-L226 and here's how/where it's called https://github.com/Roblox/traefik/blob/v1.7_roblox/provider/consulcatalog/consul_catalog.go

if you have a filtering system in-place for other areas (like filtering k8s services) that has a more generic system please of course use that.

Describe alternatives you've considered

The other option which would potentially flip how you use consul upside down would be to allow opt-in services -- so like having a catalog of 'opted-in' services in solo's data store which is fetched/used every poll period, so something like glooctl enable service pet-store maybe?

Additional Context

No response

github-actions[bot] commented 3 months ago

This issue has been marked as stale because of no activity in the last 180 days. It will be closed in the next 180 days unless it is tagged "no stalebot" or other activity occurs.