Azure / azure-iot-operations

The official repo for Azure IoT Operations.
MIT License
20 stars 16 forks source link

[bug] aio-orc-api deployment does not use hosts proxy settings #31

Closed derbl4ck closed 5 months ago

derbl4ck commented 6 months ago

Describe the bug While deploying Azure IoT Operations on a clean installed AKS EE Instance within a corporate environment using a Proxy, the container symphony-api is not using the hosts (AKS EE Node) proxy settings. As a result the symphony-api container is not able to pull a helm chart, which is the reason why the deploy CLI-Command fails.

Steps to reproduce Run the az iot ops init CLI-command targeting a AKS EE Instance which is using a corporate proxy.

Example aksedge-config.json:

{
  "SchemaVersion": "1.9",
  "Version": "1.0",
  "DeploymentType": "SingleMachineCluster",
  "Network": {
    "NetworkPlugin": "calico",
    "Ip4AddressPrefix": "192.168.1.0/24",
    "InternetDisabled": false,
    "SkipDnsCheck": false,
    "Proxy": {
      "Http": "http://xxxxxx.sme.zscaler.net:80",
      "Https": "http://xxxxxx.sme.zscaler.net:80",
      "No": "localhost,127.0.0.0/8,192.168.0.0/16,172.17.0.0/16,10.42.0.0/16,10.43.0.0/16,10.96.0.0/12,10.244.0.0/16,.svc"
    }
  }
...
}

Hot fix To fix the Issue I added the environment variables NO_PROXY, HTTP_PROXY and HTTPS_PROXY to the symphony-api container and mounted the hostPath /etc/pki/ca-trust/source/anchors/ as volume to symphony-api container under /etc/pki/tls/certs/.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: aio-orc-api
  namespace: azure-iot-operations
spec:
  template:
    spec:
      volumes:
      …
        - name: hosts-trusted-certs
          hostPath:
            path: /etc/pki/ca-trust/source/anchors/
            type: Directory
      containers:
        - name: symphony-api
          image: mcr.microsoft.com/azureiotoperations/aio-orc-api:official-20231129.1
          ports:
            - name: https
              containerPort: 8443
              protocol: TCP
          ...
          env:
            …
            - name: HTTP_PROXY
              value: http://xxxxxx.sme.zscaler.net:80
            - name: HTTPS_PROXY
              value: http://xxxxxx.sme.zscaler.net:80
            - name: NO_PROXY
              value: >-
                localhost,127.0.0.0/8,192.168.0.0/16,172.17.0.0/16,10.42.0.0/16,10.43.0.0/16,10.96.0.0/12,10.244.0.0/16,.svc
          resources: {}
          volumeMounts:
            ...
            - name: hosts-trusted-certs
              readOnly: true
              mountPath: /etc/pki/tls/certs/

Next steps We should make sure that our helm chart is using the nodes proxy configuration and the symphony-api container trusts the proxy certificate as we cannot rely on the AKS EE proxy functionality as proved due to this bug.

Cluster Environment

Developer Environment

chgennar commented 6 months ago

Hi @derbl4ck, proxy support will be available in the near future. We plan to release this feature in the next couple of months.

derbl4ck commented 5 months ago

In the past, using an AKS EE behind a corporate proxy was only possible by manually adjusting the configuration of the deployment manifests and cluster.

Starting with the AKS EE Cluster itself, it is still not possible to pull container images until you add proxies certificate to the chain. This can be done by copying proxies .pem file or content to /etc/pki/ca-trust/source/anchors/ and run sudo update-ca-trust and sudo systemctl restart containerd. Since the Zscaler Root CAs are already added to Windows Host Certificate Chain (e.g. via Intune), those certificates should be automatically mounted to the AKS EE!

After deploying iot ops inside of the AKS EE there is the same issue again, e.g. in the "aio-orc-api"-deployment which need to have access to e.g. remote helm charts. This Issue can be fixed by mapping hosts certchain into the container and add proxy env's (as mentioned in aboves github issue).

With the january 2024 release (using AKS EE Version "AksEdge-K8s-1.26.6-1.5.203.0" and azure-iot-ops v0.3.0b1) I can confirm that the iot ops related part is fixed 🚀 (as discussed via email 😊). Still we should push the AKS EE related part to finally achieve "automatic support for corporate proxy certificates" forward to the AKS EE Team.

Moving this to https://github.com/Azure/AKS-Edge/issues/170. @chgennar please upvote the new issue.