hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Provide the minimal connect setup in non-connect endpoints. #9744

Open pierrecdn opened 3 years ago

pierrecdn commented 3 years ago

Feature Description

Provide the minimal connect setup in non-connect endpoints.

Use Case(s)

While integrating Consul Connect natively into a load balancer, I'm struggling with the existing endpoints.

When watching endpoints to provision all services, including connect services, I have two options:

I was wondering why all the connect config is actually removed from the newly registered service.

To me, the connect endpoint gives a "connect-first" approach, where, for example, Service.Port is the connect one.

  "Service": {
      "Kind": "connect-proxy",
      "ID": "foo-bar-sidecar-proxy",
      "Service": "foo-bar-sidecar-proxy",
      "Port": 21001,

But still you have the downstream information available:

   "Proxy": {
        "DestinationServiceName": "foo-bar",
        "DestinationServiceID": "foo-bar",
        "LocalServiceAddress": "127.0.0.1",
        "LocalServicePort": 80,
        "MeshGateway": {},
        "Expose": {}
      },

Proposal

Can't we have the opposite as well with /v1/health/service? Such as:

   "Connect": {
        "ProxyServiceName": "foo-bar-sidecar-proxy",
        "ProxyServiceID": "foo-bar-sidecar-proxy",
        "ProxyServiceAddress": "192.0.2.1",
        "ProxyServicePort": 21001,
      },

This would allow an event consumer to guess that a connect equivalent exists, and to grab its IP+port directly. Maybe other properties would be interesting as well, but I've no use-case for now to identify which of them would make sense.
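To illustrate how an event consumer might use such a block, here is a minimal Python sketch. It assumes the proposed "Connect" object would sit under "Service" in each /v1/health/service entry; that placement, and the helper name `pick_endpoint`, are hypothetical and not part of any existing Consul API. The fallback to the node address when Service.Address is empty follows Consul's usual convention.

```python
from typing import Any, Dict, Tuple


def pick_endpoint(entry: Dict[str, Any]) -> Tuple[str, int, bool]:
    """Return (address, port, via_connect) for one /v1/health/service entry."""
    service = entry["Service"]
    # Hypothetical: the proposed "Connect" block, assumed to sit under "Service".
    connect = service.get("Connect")
    if connect:
        return connect["ProxyServiceAddress"], connect["ProxyServicePort"], True
    # Plain service: an empty Service.Address means "use the node address".
    address = service.get("Address") or entry["Node"]["Address"]
    return address, service["Port"], False


plain = {
    "Node": {"Address": "192.0.2.1"},
    "Service": {"ID": "foo-bar", "Service": "foo-bar", "Address": "", "Port": 80},
}
meshed = {
    "Node": {"Address": "192.0.2.1"},
    "Service": {
        "ID": "foo-bar", "Service": "foo-bar", "Address": "", "Port": 80,
        "Connect": {
            "ProxyServiceName": "foo-bar-sidecar-proxy",
            "ProxyServiceID": "foo-bar-sidecar-proxy",
            "ProxyServiceAddress": "192.0.2.1",
            "ProxyServicePort": 21001,
        },
    },
}

print(pick_endpoint(plain))   # ('192.0.2.1', 80, False)
print(pick_endpoint(meshed))  # ('192.0.2.1', 21001, True)
```

With a block like this, a single watch on /v1/health/service would be enough to route to either the plain port or the sidecar port.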

pierrecdn commented 3 years ago

Mentioning @banks here, since I think you have the whole context (and found you by git-blaming).

rboyer commented 3 years ago

Hi @pierrecdn,

I had a chance to dig into various approaches for implementing what this issue describes in a way that is compatible with the constraints that exist within Consul, and I have a PR up with a proposed implementation: https://github.com/hashicorp/consul/pull/9959

For this to work, I'm inferring that the reason that you (and traefik) need something like this is to manage a front-proxy/ingress/north-south-proxy setup that bridges traffic into the private network and needs to do that bridging differently for plain services vs services within the service mesh.

Given that reinterpretation of the problem, the gist here is that rather than trying to join up the proxies (etc) with the underlying services and figure out how to return that in a digestible format that works in all cases, we instead make a special endpoint that returns either the v1/health/service results or the v1/connect/service results for a given service. That way you only have to make a request to the new endpoint, and the results immediately tell you which flavor of service you have to route traffic to.
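As a rough client-side approximation of that either-or behavior, the Python sketch below queries the existing /v1/health/connect/<name> endpoint first and falls back to /v1/health/service/<name>. The function names, the injected `fetch` callable, and the rule that mesh results take precedence are all assumptions made for illustration; they do not describe the implementation in the PR.

```python
import json
import urllib.request
from typing import Any, Callable, List, Tuple

Fetch = Callable[[str], List[Any]]


def http_fetch(path: str, base: str = "http://127.0.0.1:8500") -> List[Any]:
    """GET a Consul HTTP API path and decode the JSON list it returns."""
    with urllib.request.urlopen(base + path) as resp:
        return json.load(resp)


def service_or_connect(name: str, fetch: Fetch = http_fetch) -> Tuple[str, List[Any]]:
    """Return ("connect", entries) if mesh proxies exist, else ("service", entries)."""
    # Assumption: mesh results take precedence whenever any are registered.
    connect_entries = fetch(f"/v1/health/connect/{name}")
    if connect_entries:
        return "connect", connect_entries
    return "service", fetch(f"/v1/health/service/{name}")


def fake_fetch(path: str) -> List[Any]:
    """In-memory stand-in for a Consul agent, used only for this example."""
    data = {
        "/v1/health/connect/foo-bar": [{"Service": {"Kind": "connect-proxy", "Port": 21001}}],
        "/v1/health/service/foo-bar": [{"Service": {"Port": 80}}],
        "/v1/health/service/plain": [{"Service": {"Port": 8080}}],
    }
    return data.get(path, [])


print(service_or_connect("foo-bar", fake_fetch)[0])  # connect
print(service_or_connect("plain", fake_fetch)[0])    # service
```

Unlike a single server-side endpoint, this version needs two requests for plain services, and the two reads are not atomic.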

Please take a look and see if this proposed implementation (or something like it) would actually work for your circumstances.

pierrecdn commented 3 years ago

Hi @rboyer, Sorry for the delay.

For this to work, I'm inferring that the reason that you (and traefik) need something like this is to manage a front-proxy/ingress/north-south-proxy setup that bridges traffic into the private network and needs to do that bridging differently for plain services vs services within the service mesh.

I don't really know about traefik, but in my case I simply want to consume a unified view for an in-house control plane. For this I especially need to know which IP and port end users (or other systems) should connect to. And when the connect setup is in place, I didn't have any other solution than to consume both endpoints, aggregate the results, etc., with all the associated constraints. So, knowing that there was a notion of "upstream" between the proxy and the service, I started thinking about a notion of "downstream" between the service and the proxy to represent the thing.

I'm fine with this approach if it's more consistent for you, as it should allow me to do the exact same!
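For reference, the "consume both, aggregate" workaround mentioned above might look roughly like the following Python sketch. It joins plain service entries with connect-proxy entries through Proxy.DestinationServiceID on the same node. The function name `unified_view` and the shape of the returned dictionary are hypothetical, and the inputs are assumed to be already-decoded responses from the two health endpoints.

```python
from typing import Any, Dict, List, Tuple

View = Dict[Tuple[str, str], Dict[str, Any]]


def unified_view(services: List[Dict[str, Any]], proxies: List[Dict[str, Any]]) -> View:
    """Join service entries with their sidecar proxies, keyed by (node, service ID)."""
    view: View = {}
    for entry in services:
        svc = entry["Service"]
        key = (entry["Node"]["Node"], svc["ID"])
        view[key] = {"port": svc["Port"], "connect_port": None}
    for entry in proxies:
        svc = entry["Service"]
        # The proxy points back at its service through DestinationServiceID.
        key = (entry["Node"]["Node"], svc["Proxy"]["DestinationServiceID"])
        if key in view:
            view[key]["connect_port"] = svc["Port"]
    return view


services = [
    {"Node": {"Node": "node-1"}, "Service": {"ID": "foo-bar", "Port": 80}},
    {"Node": {"Node": "node-2"}, "Service": {"ID": "foo-bar", "Port": 80}},
]
proxies = [
    {
        "Node": {"Node": "node-1"},
        "Service": {
            "Kind": "connect-proxy",
            "ID": "foo-bar-sidecar-proxy",
            "Port": 21001,
            "Proxy": {"DestinationServiceID": "foo-bar"},
        },
    },
]

view = unified_view(services, proxies)
print(view[("node-1", "foo-bar")])  # {'port': 80, 'connect_port': 21001}
print(view[("node-2", "foo-bar")])  # {'port': 80, 'connect_port': None}
```

Because the two inputs come from separate watches, a consumer also has to handle the window where one side has updated and the other has not.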

pierrecdn commented 3 years ago

Hi @rboyer, When I read your answer last time, I thought this was ready to be integrated into 1.10. Sadly it's not :cry: What is missing for #9959 to be reviewed?

pierrecdn commented 3 years ago

@rboyer @banks ping, what is happening here?

pierrecdn commented 3 years ago

@rboyer @banks gentle ping, do we have something blocking here?

pierrecdn commented 2 years ago

Is this project still alive and maintained? Should we consider moving to something else right now?

It's really annoying to never get any feedback. The feature apparently had some interest, someone from HashiCorp took ownership of a PR, and then the whole thing has been dying without any context or explanation for 7 months now.

Amier3 commented 2 years ago

Hey @pierrecdn ,

Apologies for the extremely late response. In trying to balance different priorities along with the engineering team moving around, it seems like this issue fell through the cracks.

I was able to get in touch with the engineer who originally took ownership of the PR, and I got some context on why this was delayed for so long. It looks like shortly after the PR was opened, streaming was refactored. This meant that our team would have to re-think the approach and then redo the entire code. This is the point where we should've communicated that to you, but unfortunately we did not.

I understand and agree that the lack of communication from us is frustrating. That's why we've recently hired technical community managers (I'm the technical community manager for Consul) to facilitate the interactions between the engineering team and the community, so issues and PRs like this don't get left unresolved.

Hopefully we'll be able to gain back your trust, and we're willing to work on this issue if given the time to reassess a solution after the holidays. If you'd like to talk further about our efforts to improve the open source community, feel free to respond here or email me at amier.chery@hashicorp.com

pierrecdn commented 2 years ago

Hi @Amier3, Thanks for this response. We're building and running a large service platform around Consul at Criteo. We hope such a change can be addressed soon to fix the consistency issues we experience with the existing endpoints. We can participate in this effort, by the way, but only HashiCorp has the whole picture of all changes in progress and their design implications.

Amier3 commented 2 years ago

@pierrecdn

That sounds like a really cool application of Consul! Most of our engineers are getting back from holidays today, so we'll start the conversation on this issue today and see how we should approach it this time. We'd definitely welcome your participation, and I'll let you know how we can collaborate on this when we get a better view of the requirements.

EDIT: I removed a request to post your Consul story on our forum after learning we've already done a case study on Criteo haha