metalbear-co / mirrord

Connect your local process and your cloud environment, and run local code in cloud conditions.
https://mirrord.dev
MIT License
3.76k stars 103 forks source link

Support filtering IPs for outgoing traffic #702

Closed eyalb181 closed 1 year ago

eyalb181 commented 1 year ago

We should let users enable outgoing traffic, but then specify a list of IPs for which outgoing traffic would remain local.

Motivation

We want to let users mix and match resources to be used, locally and remote.

Implementation

outgoing config should support the field remote and local, each overriding the default behavior of the general setting. The value for these fields will be a list of items with the following format: [PROTOCOL://]IP/[subnet]:PORT HOST:PORT :PORT HOST

For example:

remote = ["udp://1.1.1.0/24:1337", "1.1.5.0/24", "tcp://google.com:53", "google.com", ":53"]
local = ["localhost"]

will lead to the following behavior:

  1. Any TCP connection to 1.1.1.0/24 subnet with destination port 1337 will go from the remote
  2. Any connection to 1.1.5.0/24 subnet will go from the remote
  3. Connection to google.com resolved DNS will go from the remote if TCP (need dynamic management here probably).
  4. Connection to any IP with any protocol with port 53 will go from remote
  5. Any connection to localhost or it's resolved IPs will be local.

Details

This issue should solve the following use cases:

  1. DNS Resolving
  2. TCP connections
  3. UDP

Warnings

This feature has some sharp edges that might hurt, those edges come to mind:

errietta commented 1 year ago

This would be great to have. We have an external API we use, and we have to allow list specific IPs. So we've allowlisted our k8s cluster. I want to be able to call that API, but the code will also write to the db, and I want to write to my local db not the remote one

mentos1386 commented 1 year ago

My usecase is https://www.prisma.io/ which starts an "engine" that listens on a random port (it seems to me). Which makes it impossible to use, as the "in cluster pod" might not be listening on exactly the same port 😦

aviramha commented 1 year ago

I have implemented https://github.com/metalbear-co/mirrord/pull/1154 until we get this done fully. Next version you'll be able to control if outbound sockets to localhost would go through the pod or locally, same for listening sockets.

We need to decide whether we leave this config after we add the full filtering functionality or let users use the new one. This issue doesn't cover Incoming so we should consider that too.

mentos1386 commented 1 year ago

Another usacese we have is:

We are running mirrord in docker container (A). We are running a database in a different docker container (B). We want A to be able to connect to B.

This is currently not possible. As A would route all traffic via K8S, and K8S can't access B, as B is on the developers laptop not in the K8S.

To allow us to have mixed remote and local resources that we can connect to.

meowjesty commented 1 year ago

Part of this was implemented in #1607!

Named hosts are coming in #1648.

meowjesty commented 1 year ago

I think part of DNS resolving should be done in the DNS feature itself, we could expand it to also be filter-like, and have dns: ["cluster-service". "foo.com"] resolve remotely, while everything else resolves locally.

Doing so would help in the case where the user has (outgoing) local: ["localhost"], and the app tries to connect("localhost"), but if they have DNS enabled, it resolves to some cluster address, thus breaking the request/traffic.

eyalb181 commented 1 year ago

Makes sense to me. Should we still let the user configure a hostname filter under outgoing, though? Shouldn't it be moved entirely to the dns configuration field?

t4lz commented 1 year ago

and have dns: ["cluster-service". "foo.com"] resolve remotely

I don't think we need an extra list, I think we can use the outgoing filter. DNS only matters for outgoing traffic, so all names you are connecting to remotely should be resolved remotely and all names you are connecting to locally should be resolved locally, right?

eyalb181 commented 1 year ago

Yup, my mistake, you're right

meowjesty commented 1 year ago

Putting this in the outgoing filter feels confusing to me, for example:

dns = on
local = ["service-a", "service-b"]

// service-a becomes 1.2.3.4:0
// service-b becomes 1.2.3.4:0
// service-c (which we want remote) becomes 1.2.3.4:0

dns = off
local = ["service-a", "service-b"]

// service-a becomes 127.0.0.1:0
// service-b becomes 127.0.0.1:0
// service-c (which we want remote) becomes 127.0.0.1:0

This is kinda what I expect, when I enable DNS resolving, I want things to be resolved on the agent. So changing the DNS behavior on the filter config like this:

dns = on
local = ["service-a", "service-b"]

// service-a becomes 127.0.01:0
// service-b becomes 127.0.0.1:0
// service-c (which we want remote) becomes 1.2.3.4:0

Feels a bit like the outgoing filter is going over its boundaries? Like, the 2 features enter in a bit of a contradiction (subverts expectations)? What do you guys think?

In my opinion, we should put DNS handling stuff in the DNS feature, the user would need something like:

dns.local = ["service-a", "service-b"]
local = ["service-a", "service-b"]

Is explicit better than implicit here?

eyalb181 commented 1 year ago

I think we definitely don't want the user to have to specify two lists.

  1. Note that in most cases, the dns field is going to be omitted entirely, so users aren't going to be like "but I asked for DNS to be done remotely".
  2. I'm not sure we need to support specifying outgoing requests for domain X to be done remotely, but DNS resolution for that same domain to be done locally, or vice versa

What I'd do is show a warning if the user specified DNS on/off explicitly, and also specified a hostname in the filter.

meowjesty commented 1 year ago

tal's suggestion link:

On every outgoing connection - we send out DNS queries (probably via the agent) for all the names in the outgoing filter. I think alternatively - we could avoid making any extra DNS queries and just check in the getaddrinfo hook if the queried name is in the filter - and resolve that IP in the filter if it is. That would mean the feature is slightly different - we would only filter out connections by name if the application looked them up by name, but this makes sense in my opinion.

eyalb181 commented 1 year ago

Agree with the last line, @aviramha wdyt? So getaddrinfo would read the filter and use it to decide whether to resolve locally or remotely?

aviramha commented 1 year ago

Sounds good

meowjesty commented 1 year ago

The filter should now be supporting almost everything listed in the issue, except for custom resolvers:

Go/Custom DNS resolving (using UDP instead of just libc's getaddrinfo etc) - harder to cover, need to do some sort of MITM for the resolving. Can be deferred for a follow up issue.

A little bit of stateful DNS was added, to make the filter behave nicely with a mix of DNS and other ip/port configs, not the full thing mentioned in the issue though (there is no socket<->dns state).

meowjesty commented 1 year ago

Closing this in favor of #1826