ealasgarov closed this issue 1 year ago
I've now created a sample deployment: the pod is running and the ingress is available, but the app now complains about Azure authentication, and I need to figure out how to wire up that part. Should I manually create an app registration for this purpose and pass the client_id/secret as environment variables as well? Can you please help a bit with that part?
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openai
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: openai-webapp-configmap
  namespace: openai
data:
  AZURE_OPENAI_RESOURCE: ""
  AZURE_OPENAI_MODEL: "OpenAI_chatgpt35-turbo"
  AZURE_OPENAI_MODEL_NAME: "gpt-35-turbo"
  AZURE_OPENAI_KEY: "mykey"
  AZURE_OPENAI_TEMPERATURE: "0"
  AZURE_OPENAI_TOP_P: "1"
  AZURE_OPENAI_MAX_TOKENS: "800"
  AZURE_OPENAI_STOP_SEQUENCE: ""
  AZURE_OPENAI_SYSTEM_MESSAGE: ""
  AZURE_OPENAI_PREVIEW_API_VERSION: ""
  AZURE_OPENAI_STREAM: ""
  AZURE_SEARCH_SERVICE: ""
  AZURE_SEARCH_INDEX: ""
  AZURE_SEARCH_KEY: ""
  AZURE_SEARCH_USE_SEMANTIC_SEARCH: ""
  AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG: ""
  AZURE_SEARCH_TOP_K: ""
  AZURE_SEARCH_ENABLE_IN_DOMAIN: ""
  AZURE_SEARCH_CONTENT_COLUMNS: ""
  AZURE_SEARCH_FILENAME_COLUMN: ""
  AZURE_SEARCH_TITLE_COLUMN: ""
  AZURE_SEARCH_URL_COLUMN: ""
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: openai-webapp
  namespace: openai
  labels:
    app: openai-webapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: openai-webapp
      # azure.workload.identity/use: "true"
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: openai-webapp
        # azure.workload.identity/use: "true"
        prometheus.io/scrape: "true"
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
        - name: openai-webapp
          image: fruoccopublic.azurecr.io/sample-app-aoai-chatgpt:latest
          imagePullPolicy: Always
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          ports:
            - containerPort: 80
          livenessProbe:
            httpGet:
              path: /
              port: 80
            failureThreshold: 1
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 80
            failureThreshold: 1
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
          startupProbe:
            httpGet:
              path: /
              port: 80
            failureThreshold: 1
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
          env:
            - name: AZURE_OPENAI_RESOURCE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_RESOURCE
            - name: AZURE_OPENAI_MODEL
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_MODEL
            - name: AZURE_OPENAI_MODEL_NAME
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_MODEL_NAME
            - name: AZURE_OPENAI_KEY
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_KEY
            - name: AZURE_OPENAI_TEMPERATURE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_TEMPERATURE
            - name: AZURE_OPENAI_TOP_P
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_TOP_P
            - name: AZURE_OPENAI_MAX_TOKENS
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_MAX_TOKENS
            - name: AZURE_OPENAI_STOP_SEQUENCE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_STOP_SEQUENCE
            - name: AZURE_OPENAI_SYSTEM_MESSAGE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_SYSTEM_MESSAGE
            - name: AZURE_OPENAI_PREVIEW_API_VERSION
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_PREVIEW_API_VERSION
            - name: AZURE_OPENAI_STREAM
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_STREAM
            - name: AZURE_SEARCH_SERVICE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_SERVICE
            - name: AZURE_SEARCH_INDEX
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_INDEX
            - name: AZURE_SEARCH_KEY
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_KEY
            - name: AZURE_SEARCH_USE_SEMANTIC_SEARCH
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_USE_SEMANTIC_SEARCH
            - name: AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG
            - name: AZURE_SEARCH_TOP_K
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_TOP_K
            - name: AZURE_SEARCH_ENABLE_IN_DOMAIN
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_ENABLE_IN_DOMAIN
            - name: AZURE_SEARCH_CONTENT_COLUMNS
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_CONTENT_COLUMNS
            - name: AZURE_SEARCH_FILENAME_COLUMN
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_FILENAME_COLUMN
            - name: AZURE_SEARCH_TITLE_COLUMN
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_TITLE_COLUMN
            - name: AZURE_SEARCH_URL_COLUMN
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_URL_COLUMN
      tolerations:
        - effect: NoSchedule
          key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
        - effect: NoSchedule
          key: os_type
          operator: Equal
          value: linux
---
apiVersion: v1
kind: Service
metadata:
  name: openai-webapp
  namespace: openai
  labels:
    app: openai-webapp
spec:
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 80
  selector:
    app: openai-webapp
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: openai-webapp-ingress
  namespace: openai
  # annotations:
  #   kubernetes.io/ingress.class: openai
spec:
  ingressClassName: openai
  tls:
    - hosts:
        - webapp.openai.myportal.com
      secretName: tls-openai
  rules:
    - host: webapp.openai.myportal.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: openai-webapp
                port:
                  number: 80
```
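As a side note, the 22 individual `configMapKeyRef` entries in the container spec above could be collapsed into a single `envFrom` reference, which injects every key in the ConfigMap as an environment variable. A minimal sketch, assuming the same ConfigMap name as above:

```yaml
# Equivalent to listing each key with configMapKeyRef, assuming all keys
# in openai-webapp-configmap should become environment variables.
envFrom:
  - configMapRef:
      name: openai-webapp-configmap
  # Sensitive values such as AZURE_OPENAI_KEY and AZURE_SEARCH_KEY are
  # better kept in a Secret and pulled in the same way:
  # - secretRef:
  #     name: openai-webapp-secrets
```

This also means new keys added to the ConfigMap are picked up on the next pod restart without editing the Deployment.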
For now I've disabled the authentication, just to test whether the rest works. Here, https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/frontend/src/pages/chat/Chat.tsx, I amended lines 35-44 like this:
```typescript
const getUserInfoList = async () => {
  const userInfoList = await getUserInfo();
  // if (userInfoList.length === 0 && window.location.hostname !== "127.0.0.1") {
  //   setShowAuthMessage(true);
  // }
  // else {
  //   setShowAuthMessage(false);
  // }
  setShowAuthMessage(false);
}
```
Now the authentication prompt doesn't appear, but the chat still isn't working, due to a name resolution error. It's the same error both in the Web App (which I deploy from the Azure portal by clicking "Deploy to WebApp") and when I run it in Kubernetes.
The error:
Error communicating with OpenAI: HTTPSConnectionPool(host='my-cognitive-account-chatgpt4.openai.azure.com', port=443): Max retries exceeded with url: //openai/deployments/gpt-4_0314/chat/completions?api-version=2023-03-15-preview (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f28c71e7950>: Failed to resolve 'my-cognitive-account-chatgpt4.openai.azure.com' ([Errno -2] Name does not resolve)"))
Can you suggest why I'm getting that? My OpenAI account is located in the France Central region (because this is where GPT-4 is available), and the endpoint is: https://francecentral.api.cognitive.microsoft.com
AZURE_OPENAI_RESOURCE: "my-cognitive-account-chatgpt4" -- the name of the OpenAI resource (Cognitive Services account of type OpenAI)
AZURE_OPENAI_MODEL: "gpt-4_0314" -- deployment name
AZURE_OPENAI_MODEL_NAME: "gpt-4" -- model name
AZURE_OPENAI_KEY: "8dd5xxxxxxxxxxxxxxxxxxxxxxxx" -- key copied from the Azure portal
But how does it know where this resource is deployed, which subscription and which resource group?
Ahh, the problem was that I was missing a custom domain, so my endpoint didn't look like customdomain.openai.azure.com.
Now I just need to set up the same authentication mechanism (Azure AD) for my application running in AKS, as was done in the Web App. @pamelafox, can you help with that perhaps?
OK, I've solved the authentication problem by following these steps (a great article, by the way): https://kristhecodingunicorn.com/post/k8s_nginx_oauth/#configure-nginx-ingress-controller
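For anyone following along: the approach in that article puts oauth2-proxy (configured against an Azure AD app registration) in front of the application and wires it into the NGINX Ingress Controller via external-auth annotations. A minimal sketch of the relevant annotations, with placeholder hostnames; the exact values depend on where oauth2-proxy is exposed in your cluster:

```yaml
# Hypothetical fragment of the app's Ingress, assuming oauth2-proxy is
# reachable at oauth.myportal.com. NGINX forwards each request to auth-url
# first and redirects unauthenticated users to auth-signin.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/auth-url: "https://oauth.myportal.com/oauth2/auth"
    nginx.ingress.kubernetes.io/auth-signin: "https://oauth.myportal.com/oauth2/start?rd=$escaped_request_uri"
```

oauth2-proxy itself then needs the Azure AD tenant ID, client ID, and client secret of the app registration, plus a redirect URI registered on the app pointing at its `/oauth2/callback` endpoint.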
and yeah, i really enjoy talking to myself here... :)
@ealasgarov Sorry, we're playing whack-a-mole on OpenAI repository issue trackers right now. Here's a write-up of how I enabled AAD for this repo, if it helps:
For sample-app-aoai-chatGPT, I automated the process of creating an app registration and protecting the app service with that app with a combination of hooks and Bicep.
The hooks are declared here:
https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/azure.yaml
For the pre provision hook, auth_init.sh calls auth_init.py:
https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/scripts/auth_init.py
That script makes REST API calls to https://graph.microsoft.com/v1.0/applications in order to create a new app registration. It then sets AUTH_APP_ID, AUTH_CLIENT_ID, and AUTH_CLIENT_SECRET.
For the provisioning step, AUTH_CLIENT_ID and AUTH_CLIENT_SECRET are passed in main.parameters.json:
https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/infra/main.parameters.json
Those parameters get passed into the appservice module here:
That appservice.bicep module adds the identity provider here:
For the post provision hook, auth_update.sh calls auth_update.py:
https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/scripts/auth_update.py
That code makes a REST API call to update the redirect URIs of the registered application to include the deployed URL endpoint.
This all works great locally! However, it doesn't work in CI/CD, as the pipeline principal doesn't have the permission needed to create an app registration.
Thanks for the reply Pamela, much appreciated!
I guess I've now sorted everything out and things seem to work, including Azure authentication, except for one problem: each time after the first successful answer, I get this error when asking a second question:
Error
Requests to the Creates a completion for the chat message Operation under Azure OpenAI API version 2023-03-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 7 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.
I cannot replicate this in the OpenAI Studio Playground with the same deployment/model; there I can ask 10 questions one after another and everything works fine. Not sure why that is... But I will open a separate issue for that one.
IMO it would be great to just be able to pass the following environment variables, if possible:
AUTH_TENANT_ID
AUTH_CLIENT_ID
AUTH_CLIENT_SECRET
what do you guys think?
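If the sample ever supports these variables directly, on Kubernetes they would belong in a Secret rather than a ConfigMap. A sketch, assuming hypothetical variable names as proposed above:

```yaml
# Hypothetical Secret holding the Azure AD app registration credentials.
# stringData lets you write plain values; Kubernetes base64-encodes them.
apiVersion: v1
kind: Secret
metadata:
  name: openai-webapp-auth
  namespace: openai
stringData:
  AUTH_TENANT_ID: "<tenant-guid>"
  AUTH_CLIENT_ID: "<app-registration-client-id>"
  AUTH_CLIENT_SECRET: "<client-secret>"
```

The container spec could then pull all three in at once with `envFrom: [{secretRef: {name: openai-webapp-auth}}]`.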
@ealasgarov I am still having issues with this. Can you share your ingress and deployment files, please, if possible?
I'm still having issues deploying this in Kubernetes, even after following the article you shared.
@sarah-widder How about creating the same deployment in AKS? Should a separate solution be created for that, or can this repo be extended to include an AKS deployment as well? This should not be difficult to implement, given there's already a Dockerfile ready with all the dependencies. I'm not completely sure about the authentication layer, though; I need to look closer.