microsoft / mindaro

Bridge to Kubernetes - for Visual Studio and Visual Studio Code

Configure which services are available locally/publicly when using BtK #112

Open trenslow opened 3 years ago

trenslow commented 3 years ago

Is your feature request related to a problem? Please describe.
I think Bridge to Kubernetes would seriously improve my development workflow, and I'm really excited to use it! However, due to my development environment's configuration, I currently can't use it (as far as I can tell).

My problem arises from caveats in my development environment that conflict with both of the development configurations that BtK offers.

For the integrated configuration, I run into the issue that my CD mechanism (Flux) continuously applies a deployment configuration to my K8s environment. When BtK changes the deployment of the service I'm developing, the changes get automatically reverted to the configuration that Flux knows about. This subsequently breaks BtK and the whole development flow.

This led me to try the isolated configuration. For some background: external traffic to the service I'm developing flows as follows: ingress -> API -> my service. The isolated environment automatically clones the Ingress in my cluster, along with the API service that the Ingress sends traffic to. This service is normally protected by a WAF, which restricts traffic to the API. However, the newly cloned Ingress doesn't have this WAF, which exposes my development environment more than my company allows.

Describe the solution you'd like
I'd generally like more control over which services get cloned and/or which ones are exposed locally by rewriting /etc/hosts.

Describe alternatives you've considered
I see a couple of ways around this problem:

Additional context

The development cluster is running in Google Cloud, which requires that services protected by a WAF be linked to the Ingress with a Service of type NodePort, not a ClusterIP. This is why the WAF isn't applied to my API service: BtK clones its Service as a ClusterIP instead of a NodePort.

Either way, keep up the hard work on this extension! I consider it to be the future of microservice development!

lolodi commented 3 years ago

Hi @trenslow! Thanks for contacting us. We are happy to hear your excitement about Bridge :) I agree that in your scenario, because of Flux, Isolation would be the best way of using Bridge.

Bridge automatically makes all the services in the same namespace available locally; if other services are needed, you can use KubernetesLocalProcessConfig.yaml. I believe that if your API service is in the same namespace, it should already be mapped to a local IP by Bridge, but I'm not sure this will help you get requests routed correctly, and more importantly, keep the security of the WAF.
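For reference, a KubernetesLocalProcessConfig.yaml looks roughly like this; the service names here are placeholders, and the exact token syntax should be double-checked against our docs:

```yaml
# Sketch only; names are placeholders and the token syntax should be
# verified against the Bridge to Kubernetes documentation.
version: 0.1
env:
  # Make another service in the cluster available to the local process.
  - name: API_ENDPOINT
    value: $(services:my-api)
  # Make an endpoint outside the cluster reachable as well.
  - name: PAYMENTS_ENDPOINT
    value: $(externalEndpoints:payments.contoso.com:443)
```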

I created a work item to track the work required to support different types of services (NodePort vs ClusterIP) and WAF integration. We'll keep you posted once the fix gets deployed.

In the meantime, to get unblocked, I would suggest using Bridge without Isolation (to avoid opening up your APIs without the WAF) and using a different CD system that does not keep restoring deployments. That way, the only thing that would change in your deployment is the image running your service; the rest of the infrastructure would stay the same.

trenslow commented 3 years ago

Hi @lolodi, thanks for the prompt response. Unfortunately I'm not able to disable Flux, nor change our deployment approach.

If I understand correctly, the reason my API's service isn't exposed locally is that the Service is of type NodePort, correct? That's more out of curiosity than anything else. I'm also quite sure that the WAF would come along automatically, given how it's implemented in GKE (it's defined in a BackendConfig object, which is then referenced in the Service's annotations), so I'm not sure you need to implement anything on your side for WAF integration, at least on GCP.
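For illustration, this is roughly the GKE pattern I mean; the names are placeholders, not our exact manifests:

```yaml
# Placeholder names; this is the standard GKE pattern, not our exact manifests.
# The WAF (Cloud Armor) policy is referenced from a BackendConfig...
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: api-backendconfig
spec:
  securityPolicy:
    name: my-cloud-armor-policy    # the WAF policy
---
# ...and the NodePort Service picks up the BackendConfig via an annotation.
apiVersion: v1
kind: Service
metadata:
  name: api
  annotations:
    cloud.google.com/backend-config: '{"default": "api-backendconfig"}'
spec:
  type: NodePort                   # required for the GKE Ingress + WAF setup
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 8080
```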

If I use the KubernetesLocalProcessConfig.yaml, does BtK then only expose those services? Or is it more meant for services in other namespaces? If it just exposes the services defined in the config file and no others, I'd be unblocked. And I guess it would be a temporary solution for the NodePort problem too. If it still exposes the other services in the namespace I'm developing in, then I'm still stuck.

lolodi commented 3 years ago

Hi @trenslow, I believe there are a couple of things going on:

  1. You would like finer control over what is exposed on your local machine. The way Bridge works, it automatically lists all the services that are in the same namespace as the one being debugged and starts something similar to a port forward for every one of them, so that they are available to your code running locally on your dev box. The KubernetesLocalProcessConfig.yaml allows you to define additional services, e.g. services that are in a different namespace, or even services outside the cluster that need to be called from within the cluster. Usually, having all the services of the current namespace mapped locally is not a problem, because the "port forward" is available only on your machine and we are not opening up anything in the cluster.

  2. When Isolation is turned on, instead of replacing the container running in your pod with our agent, we deploy our agent in a brand new pod. We then clone (and modify) the ingresses and services so that, depending on the host prefix (or the kubernetes-route-as header), we can route requests coming into your cluster to the appropriate instance (see the sketch below), i.e. https://docs.microsoft.com/en-us/visualstudio/containers/overview-bridge-to-kubernetes?view=vs-2019#using-routing-capabilities-for-developing-in-isolation It looks like in your scenario your service is of type NodePort (in order for the WAF to work), which Bridge currently does not support; I created a work item to track this.
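To make the cloning concrete, here is a hypothetical sketch of what a cloned ingress rule could look like; the names and the exact host-rewriting scheme are assumptions, not our literal output:

```yaml
# Hypothetical illustration; names and the host scheme are assumptions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-api-ingress-cloned          # assumed name for the cloned copy
spec:
  rules:
    # The cloned rule answers on <isolation-prefix>.<original host>, so only
    # requests carrying the prefix (or the kubernetes-route-as header) reach
    # the debugged version; the original ingress keeps serving everyone else.
    - host: alice-1234.api.contoso.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api-cloned    # assumed name for the cloned Service
                port:
                  number: 80
```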

Once we support services of type NodePort, you should be unblocked: the cloned ingresses and services should work with the WAF, and traffic routing should work properly.

trenslow commented 3 years ago

Hey @lolodi, thanks for all your help on this one. I'll be looking out for BtK's support for NodePort Services!

One last small question: is there a way to easily clean up the workloads created by BtK if I choose to keep them running in the cluster after debugging is finished?

lolodi commented 3 years ago

Hi @trenslow, normally, after debugging, the workloads should be removed automatically and the networking restored to normal. This is done because there is no point in leaving the BtK workloads in the cluster if there is no local service running to redirect traffic to. There is an option that allows you to keep them running between debug sessions, but once you close the IDE (VS or VS Code), the BtK workloads are removed and the original state is restored automatically. Is your experience different from this?

trenslow commented 3 years ago

If I choose to let the workloads run between debug instances, the RoutingManager deployment and service stay in the cluster, along with the Ingress that was created.

I don't mind so much that the RoutingManager resources stick around, but I'm a bit uncomfortable leaving an extra entry point into the cluster hanging around. However, if the Service that the Ingress exposes gets cloned as type NodePort (and thus picks up the WAF), I'd be more comfortable having the Ingress lying around, especially since Ingresses can take time to spin up on GCP.

lolodi commented 3 years ago

This is correct. As of today, the additional cloned ingresses and services are already deleted once debugging is completed, but the RoutingManager stays running; we have a work item to track its removal once debugging is done.

The RoutingManager service is not exposed publicly, and when Bridge needs to talk to it, it just port-forwards to its pod using kubectl, so this should not add any vulnerability beyond the other pods you already have running in your cluster.

As for the NodePort service, this is good feedback; it's something we'll keep in mind once we add support for it.