hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Defining a Service Splitter causes ACLS to be ignored - Consul 1.8.0 #8229

Open idrennanvmware opened 4 years ago

idrennanvmware commented 4 years ago

Overview of the Issue

Once a service splitter is defined, ACLs are no longer respected.

Reproduction Steps

  1. Make sure that Consul is set up with ACLs enabled and default deny.

  2. Set up ACL intentions to allow the services that are about to be deployed (attachment: intentions.txt).

  3. Run the attached Nomad job against a cluster with ACLs enabled and default deny (attachment: fake-service.nomad.txt).

  4. The service should function as expected when you hit the endpoint, and the full call chain should show.

  5. Change any of the ACLs configured above to "deny".

  6. Verify the URL now has an error in the call chain on the component you set to deny. Set the ACL back to allow and the component should function again.

  7. Add splitters and resolvers (attachments: resolvers.txt, splitters.txt).

  8. Verify the URL still shows the full call chain (it should).

  9. Change the ACL back to deny (as tested in step 5).

  10. Hit the URL. The ACL is not being applied: the call that was denied after step 5, and should be denied here too, still succeeds.

  11. Remove the splitter via the CLI, for example:
    ./consul config delete -kind service-splitter -name fake-service-database

  12. Verify the URL again. ACLs are now being respected again.

I am unsure what it is about splitters that causes this behavior, but we only see it when adding splitting to our services.

blake commented 4 years ago

Hi @idrennanvmware,

Thank you for providing such a detailed description of the problem.

I have not yet had an opportunity to try to reproduce this issue. However, I suspect you may be running into issue hashicorp/consul#6454. Envoy makes use of HTTP connection pooling. The previously authorized connections have probably not yet expired from the pool and are being re-used, which is why it appears traffic is still being allowed by the intention.

Could you test whether connections are properly denied by the intention if you restart the source or destination proxy, or wait some amount of time between requests so that the connection can time out?
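To illustrate the pooling explanation above, here is a toy model in Python. It is not Consul or Envoy code: `ConnTimeAuthServer`, `request`, and `demo` are made-up names, and the only behavior modelled is the assumption that authorization is checked once, when a connection is accepted. Under that assumption, a connection opened before the deny keeps working, while a fresh connection is refused.

```python
import socket
import threading


class ConnTimeAuthServer:
    """Toy server (hypothetical): checks `allowed` only when accepting a connection."""

    def __init__(self):
        self.allowed = True
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._sock.bind(("127.0.0.1", 0))  # any free local port
        self._sock.listen()
        self.port = self._sock.getsockname()[1]
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self):
        while True:
            try:
                conn, _ = self._sock.accept()
            except OSError:
                return
            if not self.allowed:
                conn.close()  # deny applies to new connections only
                continue
            threading.Thread(target=self._serve, args=(conn,), daemon=True).start()

    def _serve(self, conn):
        # Once accepted, the connection is never re-authorized.
        with conn:
            while True:
                if not conn.recv(1024):
                    return
                conn.sendall(b"ok")


def request(conn):
    """Send one request on an open connection; return True if it was answered."""
    try:
        conn.sendall(b"ping")
        return conn.recv(1024) == b"ok"
    except OSError:
        return False


def demo():
    server = ConnTimeAuthServer()
    addr = ("127.0.0.1", server.port)

    pooled = socket.create_connection(addr, timeout=2)
    before = request(pooled)   # authorized when the connection was opened

    server.allowed = False     # stand-in for flipping the intention to deny
    reused = request(pooled)   # existing connection is not re-checked

    fresh = socket.create_connection(addr, timeout=2)
    fresh_ok = request(fresh)  # new connection is checked and dropped

    pooled.close()
    fresh.close()
    return before, reused, fresh_ok


if __name__ == "__main__":
    print(demo())
```

In this model `demo()` should come back as `(True, True, False)`: the reused connection still succeeds after the deny, which matches the symptom in the report, and only a new connection sees the change.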

idrennanvmware commented 4 years ago

Hi @blake - thanks for the fast response. Looks like you may be correct.

I let the system sit for a few minutes (approximately 5) and the ACL change still did not show.

When I restarted the proxy - the intentions were applied correctly.

One thing I did notice. In my case I blocked Backend->Database.

Restarting the DB proxy had no effect on the ACLs; traffic was still allowed when it should not have been. When I restarted the proxy on Backend, the deny took effect again.

idrennanvmware commented 4 years ago

@blake - did a little more testing around this. I've found that if I define no resolvers, defaults, or splitters for a mesh service, then ACL changes are applied instantly. I can switch them on or off at will and it shows immediately. It's only when I start applying the L7 pieces that this lags (or requires restarts).
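One way to put a number on the lag described here is to flip the intention to deny and then poll the endpoint until the change shows up. A minimal Python sketch follows; `seconds_until_denied` and `probe` are hypothetical names, and `probe` stands for whatever check you use (for example, a request to the fake-service URL that returns True while the call chain still succeeds).

```python
import time
from typing import Callable, Optional


def seconds_until_denied(
    probe: Callable[[], bool],
    timeout: float = 600.0,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll `probe` until it reports a denial.

    `probe` returns True while traffic is still allowed and False once it
    is denied. Returns the elapsed seconds until the first denial, or None
    if traffic was still allowed when `timeout` seconds had passed.
    """
    start = clock()
    while True:
        if not probe():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

Running this once with no resolvers, defaults, or splitters defined and once with them in place would give two numbers to compare. A `None` result means the deny never took effect within the timeout, which is the case that needed a proxy restart above.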