PaloAltoNetworks / prisma-cloud-compute-operator


Console Deployment Fails in Some Environments due to Abbreviated Service FQDN #33

Open eshaanm25 opened 2 years ago

eshaanm25 commented 2 years ago

Describe the bug

API requests to the internal Prisma Console Service in the Ansible Playbooks for the Prisma Operator do not use the service's FQDN. In some enterprise environments, DNS resolution is not completed through kube-dns unless .svc is appended to the service name.

Expected behavior

When an installation of a Console (or Console + Defender) is made in an enterprise OpenShift environment, the operator should be able to reach out to the console's internal service and recognize that a console has started up, add admin users, inject license, etc.

Current behavior

When a Console (or Console + Defender) is installed in an enterprise OpenShift environment, the installation hangs at the Wait for Console to start up build step.

Some enterprise environments automatically add environment variables for HTTP_PROXY, HTTPS_PROXY, and NO_PROXY to Kubernetes workloads so traffic is routed through their enterprise proxy. To ensure that internal traffic destined for the kube-proxy is not routed through the enterprise proxy, NO_PROXY includes .svc and cluster.local. This means that hostnames ending in these suffixes are not routed through the enterprise proxy.

The Ansible Playbook, when pinging the internal console service through https://twistlock-console.{{ namespace }}:8083, does not call out that it is attempting to reach an internal service. Because that hostname does not end in .svc or cluster.local, it does not match NO_PROXY, so the traffic is routed by the rules defined by HTTP_PROXY and HTTPS_PROXY and subsequently fails.
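To illustrate the mismatch, here is a minimal Python sketch of suffix-based NO_PROXY matching. It is a simplified model, not the exact logic of Ansible or of any particular HTTP client (implementations differ in details). The function name `bypasses_proxy`, the namespace `twistlock`, and the NO_PROXY value are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit


def bypasses_proxy(url: str, no_proxy: str) -> bool:
    """Return True if the URL's host matches an entry in a NO_PROXY-style list.

    Simplified model: each comma-separated entry is treated as a domain
    suffix, and a leading dot is ignored.
    """
    host = (urlsplit(url).hostname or "").lower()
    for entry in no_proxy.split(","):
        suffix = entry.strip().lower().lstrip(".")
        if not suffix:
            continue
        # Match the exact name or any host ending in ".<suffix>".
        if host == suffix or host.endswith("." + suffix):
            return True
    return False


# Assumed values for illustration only.
NO_PROXY = ".svc,cluster.local"
short_url = "https://twistlock-console.twistlock:8083/api/v1/_ping"
svc_url = "https://twistlock-console.twistlock.svc:8083/api/v1/_ping"

print(bypasses_proxy(short_url, NO_PROXY))  # abbreviated name: goes to the proxy
print(bypasses_proxy(svc_url, NO_PROXY))    # .svc name: bypasses the proxy
```

Under this model, the abbreviated `<service>.<namespace>` name matches neither entry, so the request would be sent to the enterprise proxy, which cannot resolve a cluster-internal service.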

Possible solution

When defining Services in Ansible Playbooks for Console Deployment and Console+Defender Deployment, append .svc to URLs to denote that traffic must be routed through the internal kube-proxy rather than rules defined by the HTTP_PROXY/HTTPS_PROXY environment variables.

For example:

https://twistlock-console.{{ namespace }}:8083/api/v1/_ping -> https://twistlock-console.{{ namespace }}.svc:8083/api/v1/_ping
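The rewrite above can be sketched in Python as a small helper. This only demonstrates the URL transformation; the actual fix would be made in the playbook's URL templates. The helper name `with_svc_suffix` and the namespace `twistlock` are hypothetical, and the helper does not handle URLs that embed credentials.

```python
from urllib.parse import urlsplit, urlunsplit


def with_svc_suffix(url: str) -> str:
    """Append ".svc" to the host of an in-cluster service URL if it is missing."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Leave names alone if they already carry a cluster-internal suffix.
    if host.endswith(".svc") or host.endswith(".svc.cluster.local"):
        return url
    netloc = f"{host}.svc"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# "twistlock" stands in for the rendered {{ namespace }} value.
print(with_svc_suffix("https://twistlock-console.twistlock:8083/api/v1/_ping"))
```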

Steps to reproduce

  1. Set HTTP_PROXY and HTTPS_PROXY environment variables when deploying an operator image
  2. Observe that Console/Console + Defender Deployments fail at the Wait for Console to start up step

Context

We cannot properly deploy a Prisma Console with the Operator in its current state due to this issue. The playbook is unable to create an admin user and add our license key because the operator can't access the console through its internal service.

Your Environment

welcome-to-palo-alto-networks[bot] commented 2 years ago

:tada: Thanks for opening your first issue here! Welcome to the community!