ealasgarov closed this issue 1 year ago
I've now created a sample deployment: the pod is running and the ingress is available, but the app now complains about Azure authentication, and I need to figure out how to wire up that part. Should I manually create an app registration for this purpose and pass the client_id/secret as environment variables as well? Can you please help a bit with that part?
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openai
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: openai-webapp-configmap
  namespace: openai
data:
  AZURE_OPENAI_RESOURCE: ""
  AZURE_OPENAI_MODEL: "OpenAI_chatgpt35-turbo"
  AZURE_OPENAI_MODEL_NAME: "gpt-35-turbo"
  AZURE_OPENAI_KEY: "mykey"
  AZURE_OPENAI_TEMPERATURE: "0"
  AZURE_OPENAI_TOP_P: "1"
  AZURE_OPENAI_MAX_TOKENS: "800"
  AZURE_OPENAI_STOP_SEQUENCE: ""
  AZURE_OPENAI_SYSTEM_MESSAGE: ""
  AZURE_OPENAI_PREVIEW_API_VERSION: ""
  AZURE_OPENAI_STREAM: ""
  AZURE_SEARCH_SERVICE: ""
  AZURE_SEARCH_INDEX: ""
  AZURE_SEARCH_KEY: ""
  AZURE_SEARCH_USE_SEMANTIC_SEARCH: ""
  AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG: ""
  AZURE_SEARCH_TOP_K: ""
  AZURE_SEARCH_ENABLE_IN_DOMAIN: ""
  AZURE_SEARCH_CONTENT_COLUMNS: ""
  AZURE_SEARCH_FILENAME_COLUMN: ""
  AZURE_SEARCH_TITLE_COLUMN: ""
  AZURE_SEARCH_URL_COLUMN: ""
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: openai-webapp
  namespace: openai
  labels:
    app: openai-webapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: openai-webapp
      # azure.workload.identity/use: "true"
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: openai-webapp
        # azure.workload.identity/use: "true"
        prometheus.io/scrape: "true"
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
        - name: openai-webapp
          image: fruoccopublic.azurecr.io/sample-app-aoai-chatgpt:latest
          imagePullPolicy: Always
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          ports:
            - containerPort: 80
          livenessProbe:
            httpGet:
              path: /
              port: 80
            failureThreshold: 1
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 80
            failureThreshold: 1
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
          startupProbe:
            httpGet:
              path: /
              port: 80
            failureThreshold: 1
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
          env:
            - name: AZURE_OPENAI_RESOURCE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_RESOURCE
            - name: AZURE_OPENAI_MODEL
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_MODEL
            - name: AZURE_OPENAI_MODEL_NAME
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_MODEL_NAME
            - name: AZURE_OPENAI_KEY
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_KEY
            - name: AZURE_OPENAI_TEMPERATURE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_TEMPERATURE
            - name: AZURE_OPENAI_TOP_P
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_TOP_P
            - name: AZURE_OPENAI_MAX_TOKENS
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_MAX_TOKENS
            - name: AZURE_OPENAI_STOP_SEQUENCE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_STOP_SEQUENCE
            - name: AZURE_OPENAI_SYSTEM_MESSAGE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_SYSTEM_MESSAGE
            - name: AZURE_OPENAI_PREVIEW_API_VERSION
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_PREVIEW_API_VERSION
            - name: AZURE_OPENAI_STREAM
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_OPENAI_STREAM
            - name: AZURE_SEARCH_SERVICE
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_SERVICE
            - name: AZURE_SEARCH_INDEX
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_INDEX
            - name: AZURE_SEARCH_KEY
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_KEY
            - name: AZURE_SEARCH_USE_SEMANTIC_SEARCH
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_USE_SEMANTIC_SEARCH
            - name: AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG
            - name: AZURE_SEARCH_TOP_K
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_TOP_K
            - name: AZURE_SEARCH_ENABLE_IN_DOMAIN
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_ENABLE_IN_DOMAIN
            - name: AZURE_SEARCH_CONTENT_COLUMNS
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_CONTENT_COLUMNS
            - name: AZURE_SEARCH_FILENAME_COLUMN
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_FILENAME_COLUMN
            - name: AZURE_SEARCH_TITLE_COLUMN
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_TITLE_COLUMN
            - name: AZURE_SEARCH_URL_COLUMN
              valueFrom:
                configMapKeyRef:
                  name: openai-webapp-configmap
                  key: AZURE_SEARCH_URL_COLUMN
      tolerations:
        - effect: NoSchedule
          key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
        - effect: NoSchedule
          key: os_type
          operator: Equal
          value: linux
---
apiVersion: v1
kind: Service
metadata:
  name: openai-webapp
  namespace: openai
  labels:
    app: openai-webapp
spec:
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 80
  selector:
    app: openai-webapp
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: openai-webapp-ingress
  namespace: openai
  # annotations:
  #   kubernetes.io/ingress.class: openai
spec:
  ingressClassName: openai
  tls:
    - hosts:
        - webapp.openai.myportal.com
      secretName: tls-openai
  rules:
    - host: webapp.openai.myportal.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: openai-webapp
                port:
                  number: 80
```
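As a side note, the 22 individual `configMapKeyRef` entries in the container spec above could be collapsed into a single `envFrom` reference, which injects every key in the ConfigMap as an environment variable. A minimal sketch, assuming the same ConfigMap name as above:

```yaml
# Equivalent to listing each key with configMapKeyRef, assuming all keys
# in openai-webapp-configmap should become environment variables.
envFrom:
  - configMapRef:
      name: openai-webapp-configmap
  # Sensitive values such as AZURE_OPENAI_KEY and AZURE_SEARCH_KEY are
  # better kept in a Secret and pulled in the same way:
  # - secretRef:
  #     name: openai-webapp-secrets
```

This also means new keys added to the ConfigMap are picked up on the next pod restart without editing the Deployment.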
For now I've disabled the authentication, just to test whether the rest works. Here, https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/frontend/src/pages/chat/Chat.tsx, I amended lines 35-44 like this:
```typescript
const getUserInfoList = async () => {
  const userInfoList = await getUserInfo();
  // if (userInfoList.length === 0 && window.location.hostname !== "127.0.0.1") {
  //   setShowAuthMessage(true);
  // }
  // else {
  //   setShowAuthMessage(false);
  // }
  setShowAuthMessage(false);
}
```
Now the authentication prompt doesn't appear, but the chat still isn't working, due to a name resolution error. It's the same error both in the Web App (which I deploy from the Azure portal by clicking "Deploy to WebApp") and when I run it in Kubernetes.
The error:
Error communicating with OpenAI: HTTPSConnectionPool(host='my-cognitive-account-chatgpt4.openai.azure.com', port=443): Max retries exceeded with url: //openai/deployments/gpt-4_0314/chat/completions?api-version=2023-03-15-preview (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f28c71e7950>: Failed to resolve 'my-cognitive-account-chatgpt4.openai.azure.com' ([Errno -2] Name does not resolve)"))
Can you suggest why I'm getting that? My OpenAI account is located in the France Central region (because this is where GPT-4 is available), and the endpoint is: https://francecentral.api.cognitive.microsoft.com
AZURE_OPENAI_RESOURCE: "my-cognitive-account-chatgpt4" -- the name of the OpenAI resource (Cognitive Services account of type OpenAI)
AZURE_OPENAI_MODEL: "gpt-4_0314" -- deployment name
AZURE_OPENAI_MODEL_NAME: "gpt-4" -- model name
AZURE_OPENAI_KEY: "8dd5xxxxxxxxxxxxxxxxxxxxxxxx" -- key copied from the Azure portal
But how does it know where this resource is deployed, which subscription and which resource group?
Ahh, the problem was that I was missing a custom domain, so my endpoint didn't look like customdomain.openai.azure.com.
Now I just need to set up the same authentication mechanism (Azure AD) for my application running in AKS, as was done in the Web App. @pamelafox, can you help with that perhaps?
OK, I've solved the authentication problem by following these steps (a great article, by the way): https://kristhecodingunicorn.com/post/k8s_nginx_oauth/#configure-nginx-ingress-controller
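For anyone following along: the approach in that article puts oauth2-proxy (configured against an Azure AD app registration) in front of the application and wires it into the NGINX Ingress Controller via external-auth annotations. A minimal sketch of the relevant annotations, with placeholder hostnames; the exact values depend on where oauth2-proxy is exposed in your cluster:

```yaml
# Hypothetical fragment of the app's Ingress, assuming oauth2-proxy is
# reachable at oauth.myportal.com. NGINX forwards each request to auth-url
# first and redirects unauthenticated users to auth-signin.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/auth-url: "https://oauth.myportal.com/oauth2/auth"
    nginx.ingress.kubernetes.io/auth-signin: "https://oauth.myportal.com/oauth2/start?rd=$escaped_request_uri"
```

oauth2-proxy itself then needs the Azure AD tenant ID, client ID, and client secret of the app registration, plus a redirect URI registered on the app pointing at its `/oauth2/callback` endpoint.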
and yeah, i really enjoy talking to myself here... :)
@ealasgarov Sorry, we're playing whack-a-mole on OpenAI repository issue trackers right now. Here's a write-up of how I enabled AAD for this repo, if it helps:
For sample-app-aoai-chatGPT, I automated the process of creating an app registration and protecting the app service with that app with a combination of hooks and Bicep.
The hooks are declared here:
https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/azure.yaml
For the pre provision hook, auth_init.sh calls auth_init.py:
https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/scripts/auth_init.py
That script makes REST API calls to https://graph.microsoft.com/v1.0/applications in order to create a new app registration. It then sets AUTH_APP_ID, AUTH_CLIENT_ID, and AUTH_CLIENT_SECRET.
For the provisioning step, AUTH_CLIENT_ID and AUTH_CLIENT_SECRET are passed in main.parameters.json:
https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/infra/main.parameters.json
Those parameters get passed into the appservice module here:
That appservice.bicep module adds the identity provider here:
For the post provision hook, auth_update.sh calls auth_update.py:
https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/scripts/auth_update.py
That code makes a REST API call to update the redirect URIs of the registered application to include the deployed URL endpoint.
This all works great locally! However, it doesn't work in CI/CD, as the pipeline principal doesn't have the permission needed to create an app registration.
Thanks for the reply Pamela, much appreciated!
I guess I've now sorted everything out and things seem to work, including Azure authentication, except for one problem: each time after the first successful answer, I get this error when asking a second question:
Error
Requests to the Creates a completion for the chat message Operation under Azure OpenAI API version 2023-03-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 7 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.
I cannot replicate this in the OpenAI Studio Playground with the same deployment/model; there I can ask 10 questions one after another and everything works fine. Not sure why that is... But I will open a separate issue for that one.
IMO it would be great to just be able to pass the following environment variables, if possible:
AUTH_TENANT_ID
AUTH_CLIENT_ID
AUTH_CLIENT_SECRET
what do you guys think?
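If the sample ever supports these variables directly, on Kubernetes they would belong in a Secret rather than a ConfigMap. A sketch, assuming hypothetical variable names as proposed above:

```yaml
# Hypothetical Secret holding the Azure AD app registration credentials.
# stringData lets you write plain values; Kubernetes base64-encodes them.
apiVersion: v1
kind: Secret
metadata:
  name: openai-webapp-auth
  namespace: openai
stringData:
  AUTH_TENANT_ID: "<tenant-guid>"
  AUTH_CLIENT_ID: "<app-registration-client-id>"
  AUTH_CLIENT_SECRET: "<client-secret>"
```

The container spec could then pull all three in at once with `envFrom: [{secretRef: {name: openai-webapp-auth}}]`.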
@ealasgarov I am still having issues with this. Can you share your ingress and deployment files, please, if possible?
I'm still having issues deploying this in Kubernetes, even after following the article you shared.
@sarah-widder How about creating the same deployment in AKS? Should a separate solution be created for that, or can this repo be extended to include an AKS deployment as well? This should not be difficult to implement, given there's already a Dockerfile ready with all the dependencies. I'm not completely sure about the authentication layer, though; I need to look closer.