Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.
https://anythingllm.com
MIT License

[DOCS]: Basic k8s kubernetes manifest. #1463

Closed atljoseph closed 2 months ago

atljoseph commented 2 months ago

Description

Thanks for the docs about deploying to various hosting providers. Can you add a little pixie dust outlining some k8s deployment specs? Docker docs help for sure, but would be cool to have a template.

timothycarambat commented 2 months ago

I'd have to delegate this to someone in the community who has more experience with doing this. We can add it to the community templates

atljoseph commented 2 months ago

I'll post something here later today... Figured it out

atljoseph commented 2 months ago

Expect a PR today actually. Lost some time to AWS EBS drivers.

atljoseph commented 2 months ago

We were able to make it work mostly, but something is driving me absolutely crazy. If there really is no example in the repo already, then that likely means this hasn't been done yet. In that case, I'll list the manifest that we have as well as the symptoms. Hoping for a quick resolution, as this is a value-add all-around.

---
apiVersion: v1                                                                                                                                           
kind: PersistentVolume                                                                                                                                   
metadata:                                                                                                                                                
  name: anything-llm-volume                                                                                                                              
  annotations:                                                                                                                                           
    pv.beta.kubernetes.io/uid: "1000"                                                                                                                    
    pv.beta.kubernetes.io/gid: "1000"                                                                                                                    
spec:                                                                                                                                                    
  storageClassName: gp2                                                                                                                                  
  capacity:                                                                                                                                              
    storage: 5Gi                                                                                                                                        
  accessModes:                                                                                                                                           
    - ReadWriteOnce                                                                                                                                      
  awsElasticBlockStore:                                                                                                                                  
    volumeID: ec2-ebs-volume-uuid-from-aws-console                                                                                                                           
    fsType: ext4
  nodeAffinity:                                                                                                                                          
    required:                                                                                                                                            
      nodeSelectorTerms:                                                                                                                                 
      - matchExpressions:                                                                                                                                
        - key: topology.kubernetes.io/zone                                                                                                               
          operator: In                                                                                                                                   
          values:                                                                                                                                        
          - us-direction-123  
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: anything-llm-volume-claim
  namespace: "{{ namespace }}"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: anything-llm
  namespace: "{{ namespace }}"
  labels:
    anything-llm: "true"
spec:
  selector:
    matchLabels:
      k8s-app: anything-llm
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0%
      maxUnavailable: 100%
  template:
    metadata:
      labels:
        anything-llm: "true"
        k8s-app: anything-llm
        app.kubernetes.io/name: anything-llm
        app.kubernetes.io/part-of: anything-llm
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "9090"
    spec:
      serviceAccountName: "default"
      terminationGracePeriodSeconds: 10
      securityContext:                                                                                                                                                              
        fsGroup: 1000
        runAsNonRoot: true                                                                                                                                                          
        runAsGroup: 1000
        runAsUser: 1000
      affinity:                                                                                                                                                                                                                                                                          
        nodeAffinity:                                                                                                                                                                                                                                                                    
          requiredDuringSchedulingIgnoredDuringExecution:                                                                                                                                                                                                                                
            nodeSelectorTerms:                                                                                                                                                                                                                                                           
            - matchExpressions:                                                                                                                                                                                                                                                          
              - key: topology.kubernetes.io/zone                                                                                                                                                                                                                                         
                operator: In                                                                                                                                                                                                                                                             
                values:                                                                                                                                                                                                                                                                  
                - us-direction-123  
      containers:
      - name: anything-llm
        resources:
          limits:
            memory: "1Gi"
            cpu: "500m"
          requests:
            memory: "512Mi"
            cpu: "250m"
        imagePullPolicy: IfNotPresent
        image: "mintplexlabs/anythingllm:master"
        securityContext:                     
          allowPrivilegeEscalation: true                                                                                                                                                                                                                                                 
          capabilities:                                                                                                                                                                                                                                                                  
            add:                                                                                                                                                                                                                                                                         
              - SYS_ADMIN                                                                                                                                                                                                                                                                
          runAsNonRoot: true                                                                                                                                                                                                                                                             
          runAsGroup: 1000                                                                                                                                                                                                                                                               
          runAsUser: 1000                                                                                                                                       
        command: # Specify a command to override the Dockerfile's ENTRYPOINT.
          - /bin/bash
          - -c
          - |
            # Some debug stuff...
            set -x -e
            sleep 3
            echo "AWS_REGION: $AWS_REGION"
            echo "SERVER_PORT: $SERVER_PORT"
            echo "NODE_ENV: $NODE_ENV"
            echo "STORAGE_DIR: $STORAGE_DIR"
            # The following was taken from the Dockerfile's ENTRYPOINT, since `command` overrides that.
            {
              cd /app/server/ &&
                npx prisma generate --schema=./prisma/schema.prisma &&
                npx prisma migrate deploy --schema=./prisma/schema.prisma &&
                node /app/server/index.js
              echo "Server process exited with status $?"
            } &
            { 
              node /app/collector/index.js
              echo "Collector process exited with status $?"
            } &
            wait -n
            exit $?
        readinessProbe:
          httpGet:
            path: /v1/api/health
            port: 8888
          initialDelaySeconds: 15
          periodSeconds: 5
          successThreshold: 2
        livenessProbe:
          httpGet:
            path: /v1/api/health
            port: 8888
          initialDelaySeconds: 15
          periodSeconds: 5
          failureThreshold: 3
        env:
          - name: AWS_REGION
            value: "{{ aws_region }}"
          - name: AWS_ACCESS_KEY_ID
            value: "{{ aws_access_id }}"
          - name: AWS_SECRET_ACCESS_KEY
            value: "{{ aws_access_secret }}"
          - name: SERVER_PORT
            value: "3001"
          - name: JWT_SECRET
            value: "my-random-string-for-seeding" # Please generate random string at least 12 chars long.
          - name: VECTOR_DB
            value: "lancedb"
          - name: STORAGE_DIR
            # value: "/data"
            value: "/app/server/storage"
          - name: NODE_ENV
            value: "development"
          - name: UID
            value: "1000"
          - name: GID
            value: "1000"
        volumeMounts: 
          - name: anything-llm-server-storage-volume-mount
            # mountPath: /data                                                                                                                                                  
            mountPath: /app/server/storage                                                                                                                                                  
      volumes:
        - name: anything-llm-server-storage-volume-mount
          persistentVolumeClaim:
            claimName: anything-llm-volume-claim

============ Symptoms: (Apologies if this is confusing. I am confused LOL)

Have dialed everything in, but getting conflicting outcomes when changing these settings:

In general, when the Persistent Volume is mounted to the same dir as STORAGE_DIR (/app/server/storage):

When STORAGE_DIR is not set:

When NODE_ENV=production, no matter the value of STORAGE_DIR:

When NODE_ENV=development & STORAGE_DIR=/data:

Lastly, I cannot get this thing to forget the admin setup was performed, even after clearing all data from the persistent volume and doing rollout restart.

This is super frustrating LOL, and that is a lot of info.

But, it seems like some of the build code and structural aspects of the app might not play 100% nicely with k8s.

Anyways, I've spent 2 days on this, and I'm out of energy. Would LOVE to be able to use it.

Help @timothycarambat

timothycarambat commented 2 months ago

@atljoseph How do you use these templates? I don't use Kubernetes so I'm unsure how to even debug. For what it is worth, it may make sense to modify

image: "mintplexlabs/anythingllm:master"

to image: "mintplexlabs/anythingllm:render". This image is special in that it pins all storage to a volume mounted at /storage. This is how we do persistent storage on Render and Railway, because of the STORAGE_DIR flexibility that needs to exist for the Docker image but cannot be so easily configured in ephemeral storage ecosystems (like K8) where containers spin up/down and get destroyed/created on the fly.

atljoseph commented 2 months ago

Thanks @timothycarambat I'll try that. Curious, how do I reset everything, including the Admin user? Doesn't seem to ever go back to the onboarding. It would revert back to onboarding early on, but I can't get it to do that anymore. So, if you could tell me more about where all that is stored, it might help track down the issue.

atljoseph commented 2 months ago

It would be nice to be able to set the Admin user through ENV Vars, or at least have the option to.

timothycarambat commented 2 months ago

Remove the LLM_PROVIDER, EMBEDDING_ENGINE and VECTOR_DB keys in the storage .env. You can also reset the user db by running yarn prisma:reset in the root or in /app/server when in the docker container.

You can also just rm -rf /host/storage/anythingllm.db && touch /host/storage/anythingllm.db to delete and recreate an empty db
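
In a Kubernetes deployment, a rough sketch of either reset from outside the pod (hypothetical kubectl invocations, assuming the Deployment and namespace names from the manifest above, and that the database file lives under the configured STORAGE_DIR):

# A sketch only; assumes deploy/anything-llm in the templated namespace.
kubectl -n "{{ namespace }}" exec deploy/anything-llm -- \
  sh -c 'cd /app/server && yarn prisma:reset'

# Or delete and recreate the SQLite file on the mounted storage volume:
kubectl -n "{{ namespace }}" exec deploy/anything-llm -- \
  sh -c 'rm -f "$STORAGE_DIR/anythingllm.db" && touch "$STORAGE_DIR/anythingllm.db"'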

atljoseph commented 2 months ago

OK, this worked grrrrreat!!!!

The render image tag worked as expected.

Final product:

---
apiVersion: v1                                                                                                                                           
kind: PersistentVolume                                                                                                                                   
metadata:                                                                                                                                                
  name: anything-llm-volume                                                                                                                              
  annotations:                                                                                                                                           
    pv.beta.kubernetes.io/uid: "1000"                                                                                                                    
    pv.beta.kubernetes.io/gid: "1000"                                                                                                                    
spec:                                                                                                                                                    
  storageClassName: gp2                                                                                                                                  
  capacity:                                                                                                                                              
    storage: 5Gi                                                                                                                                        
  accessModes:                                                                                                                                           
    - ReadWriteOnce                                                                                                                                      
  awsElasticBlockStore:    
    # This is the volume ID from the AWS EC2 EBS Volumes list.
    volumeID: "{{ anythingllm_awsElasticBlockStore_volumeID }}"                                                                                                                           
    fsType: ext4
  nodeAffinity:                                                                                                                                          
    required:                                                                                                                                            
      nodeSelectorTerms:                                                                                                                                 
      - matchExpressions:                                                                                                                                
        - key: topology.kubernetes.io/zone                                                                                                               
          operator: In                                                                                                                                   
          values:                                                                                                                                        
          - us-east-1c  
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: anything-llm-volume-claim
  namespace: "{{ namespace }}"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: anything-llm
  namespace: "{{ namespace }}"
  labels:
    anything-llm: "true"
spec:
  selector:
    matchLabels:
      k8s-app: anything-llm
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0%
      maxUnavailable: 100%
  template:
    metadata:
      labels:
        anything-llm: "true"
        k8s-app: anything-llm
        app.kubernetes.io/name: anything-llm
        app.kubernetes.io/part-of: anything-llm
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "9090"
    spec:
      serviceAccountName: "default"
      terminationGracePeriodSeconds: 10
      securityContext:                                                                                                                                                              
        fsGroup: 1000
        runAsNonRoot: true                                                                                                                                                          
        runAsGroup: 1000
        runAsUser: 1000
      affinity:                                                                                                                                                                                                                                                                          
        nodeAffinity:                                                                                                                                                                                                                                                                    
          requiredDuringSchedulingIgnoredDuringExecution:                                                                                                                                                                                                                                
            nodeSelectorTerms:                                                                                                                                                                                                                                                           
            - matchExpressions:                                                                                                                                                                                                                                                          
              - key: topology.kubernetes.io/zone                                                                                                                                                                                                                                         
                operator: In                                                                                                                                                                                                                                                             
                values:                                                                                                                                                                                                                                                                  
                - us-east-1c  
      containers:
      - name: anything-llm
        resources:
          limits:
            memory: "1Gi"
            cpu: "500m"
          requests:
            memory: "512Mi"
            cpu: "250m"
        imagePullPolicy: IfNotPresent
        image: "mintplexlabs/anythingllm:render"
        securityContext:                     
          allowPrivilegeEscalation: true                                                                                                                                                                                                                                                 
          capabilities:                                                                                                                                                                                                                                                                  
            add:                                                                                                                                                                                                                                                                         
              - SYS_ADMIN                                                                                                                                                                                                                                                                
          runAsNonRoot: true                                                                                                                                                                                                                                                             
          runAsGroup: 1000                                                                                                                                                                                                                                                               
          runAsUser: 1000                                                                                                                                       
        command: 
          # Specify a command to override the Dockerfile's ENTRYPOINT.
          - /bin/bash
          - -c
          - |
            set -x -e
            sleep 3
            echo "AWS_REGION: $AWS_REGION"
            echo "SERVER_PORT: $SERVER_PORT"
            echo "NODE_ENV: $NODE_ENV"
            echo "STORAGE_DIR: $STORAGE_DIR"
            {
              cd /app/server/ &&
                npx prisma generate --schema=./prisma/schema.prisma &&
                npx prisma migrate deploy --schema=./prisma/schema.prisma &&
                node /app/server/index.js
              echo "Server process exited with status $?"
            } &
            { 
              node /app/collector/index.js
              echo "Collector process exited with status $?"
            } &
            wait -n
            exit $?
        readinessProbe:
          httpGet:
            path: /v1/api/health
            port: 8888
          initialDelaySeconds: 15
          periodSeconds: 5
          successThreshold: 2
        livenessProbe:
          httpGet:
            path: /v1/api/health
            port: 8888
          initialDelaySeconds: 15
          periodSeconds: 5
          failureThreshold: 3
        env:
          - name: AWS_REGION
            value: "{{ aws_region }}"
          - name: AWS_ACCESS_KEY_ID
            value: "{{ aws_access_id }}"
          - name: AWS_SECRET_ACCESS_KEY
            value: "{{ aws_access_secret }}"
          - name: SERVER_PORT
            value: "3001"
          - name: JWT_SECRET
            value: "my-random-string-for-seeding" # Please generate random string at least 12 chars long.
          - name: VECTOR_DB
            value: "lancedb"
          - name: STORAGE_DIR
            value: "/storage"
          - name: NODE_ENV
            value: "production"
          - name: UID
            value: "1000"
          - name: GID
            value: "1000"
        volumeMounts: 
          - name: anything-llm-server-storage-volume-mount
            mountPath: /storage                                                                                                                                                  
      volumes:
        - name: anything-llm-server-storage-volume-mount
          persistentVolumeClaim:
            claimName: anything-llm-volume-claim
---
# This serves the UI and the backend.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: anything-llm-ingress
  namespace: "{{ namespace }}"
  annotations:
    external-dns.alpha.kubernetes.io/hostname: "{{ namespace }}-chat.{{ base_domain }}"
    kubernetes.io/ingress.class: "internal-ingress"
    nginx.ingress.kubernetes.io/rewrite-target: /
    ingress.kubernetes.io/ssl-redirect: "false"
spec:
  rules:
  - host: "{{ namespace }}-chat.{{ base_domain }}"
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: anything-llm-svc
            port: 
              number: 3001
  tls: # < placing a host in the TLS config will indicate a cert should be created
    - hosts:
        - "{{ namespace }}-chat.{{ base_domain }}"
      secretName: letsencrypt-prod
---
apiVersion: v1
kind: Service
metadata:
  labels:
    kubernetes.io/name: anything-llm
  name: anything-llm-svc
  namespace: "{{ namespace }}"
spec:
  ports:
  # "port" is external port, and "targetPort" is internal.
  - port: 3301
    targetPort: 3001
    name: traffic
  - port: 9090
    targetPort: 9090
    name: metrics
  selector:
    k8s-app: anything-llm

@timothycarambat This is what is known as a Kubernetes "manifest". We don't use Kustomize, so this would have to be adapted for anyone who uses that (I don't understand it). With Helm, we use the "render" function to apply variables that are loaded per environment. Our Helm repo is too large to just extract a full working filesystem example here, sorry. But, essentially, variables are placed in handlebars and then replaced at render time:

someK8sConfigValue: "{{ willBeReplacedAtRender }}"

So, anyone can fill in their values and apply this directly to their cluster. I did not include an ingress, because different folks use different strokes, depending on their needs.
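
As a rough stand-in for that render step (not the author's actual tooling), the placeholders can be filled with something as simple as sed before applying, e.g.:

# Hypothetical stand-in for the internal "render" step: substitute each
# {{ placeholder }} with a concrete value, then apply the result.
sed -e 's/{{ namespace }}/anything-llm/g' \
    -e 's/{{ aws_region }}/us-east-1/g' \
    -e 's/{{ base_domain }}/example.com/g' \
    anything-llm.yaml | kubectl apply -f -
# ...and so on for the remaining placeholders (access keys, EBS volume ID, etc.).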


atljoseph commented 2 months ago

Hoping this helps someone else !

atljoseph commented 2 months ago

Getting an error and an empty screen when visiting LLM Preferences from Settings.

atljoseph commented 2 months ago

All other screens function. Not sure why this one wouldn't. Maybe because it is the one I need LOL.

atljoseph commented 2 months ago

I think this needs a way for Admin to reset everything from the frontend.

atljoseph commented 2 months ago

You can also just rm -rf /host/storage/anythingllm.db && touch /host/storage/anythingllm.db to delete and recreate an empty db

Did that... All the chat workspaces were cleared out...

Does that mean the users are not stored in anythingllm.db?

Remove the LLM_PROVIDER, EMBEDDING_ENGINE and VECTOR_DB keys in the storage .env.

None of these are set currently. At deploy/config time, we load ENV Vars from a key-value store, and then apply the changes to kubernetes. We don't have an ENV file. Kubernetes supplies these, and if they are modified, they do not persist across restarts.

You can also reset the user db by yarn prisma:reset in the root or in /app/server when in docker container.

Can you tell me more about that, please? Where is this prisma DB stored? Looking to have a solution that does not involve exec-ing into an ephemeral container.

timothycarambat commented 2 months ago

@atljoseph that ENV error is because in

env:
  - name: AWS_REGION
    value: "{{ aws_region }}"
  - name: AWS_ACCESS_KEY_ID
    value: "{{ aws_access_id }}"
  - name: AWS_SECRET_ACCESS_KEY
    value: "{{ aws_access_secret }}"
  - name: SERVER_PORT
    value: "3001"
  - name: JWT_SECRET
    value: "my-random-string-for-seeding" # Please generate random string at least 12 chars long.
  - name: VECTOR_DB
    value: "lancedb"
  - name: STORAGE_DIR
    value: "/storage"
  - name: NODE_ENV
    value: "production"
  - name: UID
    value: "1000"
  - name: GID
    value: "1000"

You have

- name: VECTOR_DB
  value: "lancedb"

Remove this key and you can onboard, and that LLM preference bug won't appear. Onboarding sets these vars.

atljoseph commented 2 months ago

Thank you sir. Will try that tomorrow!


atljoseph commented 2 months ago

Worked. Everything above in my last message with the k8s manifest worked as expected after removing that VECTOR_DB env var.

reefland commented 1 month ago

@atljoseph - In your templates the readinessProbe and livenessProbe use port 8888, and while the logs report Document processor app listening on port 8888, I'm unable to get curl to respond on that port; I have to CTRL-C to stop:

[ root@curl:/ ]$ curl -v anything-llm.anything-llm:8888/v1/api/health
^C

I don't see how your probes could be successful. I instead use probes against port 3001:

[ root@curl:/ ]$ curl anything-llm.anything-llm:3001
<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8" />
  <link rel="icon" type="image/svg+xml" href="/favicon.png" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>AnythingLLM | Your personal LLM trained on anything</title>
...

And path doesn't matter. Seems to get same output no matter what you ask for.

[ root@curl:/ ]$ curl anything-llm.anything-llm:3001/blah/blah/blah
<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8" />
  <link rel="icon" type="image/svg+xml" href="/favicon.png" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>AnythingLLM | Your personal LLM trained on anything</title>
...

In regard to Prometheus metrics, you have annotations for scrapes on port 9090 and set up that port for the service, but there is nothing within the Anything-LLM container listening on that port that I could find. If it doesn't exist, that could be cleaned up.

I'd be curious to know why you needed to set allowPrivilegeEscalation: true and grant the SYS_ADMIN capability to the container:

       securityContext:
          allowPrivilegeEscalation: true
          capabilities:
            add:
              - SYS_ADMIN
          runAsNonRoot: true
          runAsGroup: 1000
          runAsUser: 1000

I find Anything-LLM runs OK, locked down a bit more at the container level:

      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: false

Then at the Pod level you can do:

  securityContext:
    fsGroup: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    runAsUser: 1000

For others trying this: the .env file concept does not work well with Kubernetes. In Kubernetes this would be stored in a ConfigMap and either mounted into the container's filesystem or have its values mapped to ENV variables. Since some of the values are sensitive, those would be stored in a Secret and mapped to ENV variables.
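
A minimal sketch of that mapping, with hypothetical resource names and only a couple of the variables shown:

apiVersion: v1
kind: ConfigMap
metadata:
  name: anything-llm-config
  namespace: "{{ namespace }}"
data:
  SERVER_PORT: "3001"
  STORAGE_DIR: "/storage"
---
apiVersion: v1
kind: Secret
metadata:
  name: anything-llm-secrets
  namespace: "{{ namespace }}"
stringData:
  JWT_SECRET: "generate-a-random-string-at-least-12-chars"

# Then, in the Deployment's container spec, the inline env list becomes:
        envFrom:
          - configMapRef:
              name: anything-llm-config
          - secretRef:
              name: anything-llm-secrets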

Trying to mount the .env file from a ConfigMap works when NODE_ENV='production' is used; unfortunately, Anything-LLM tries to update its .env file in development mode, which breaks and crashes since the ConfigMap mount is read-only. Otherwise you have to do some silly trickery with an initContainer, where you mount the .env file to something like /tmp and then manually copy it into the /app/server/ directory to make it writable (but changes are not persisted) before the anything-llm container starts.
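
A rough sketch of that initContainer workaround, with hypothetical names (the ConfigMap holds the .env content, an initContainer copies it into a writable emptyDir, and the main container mounts just that one file at /app/server/.env):

apiVersion: v1
kind: ConfigMap
metadata:
  name: anything-llm-dotenv
  namespace: "{{ namespace }}"
data:
  .env: |
    SERVER_PORT=3001
    STORAGE_DIR=/storage

# Relevant fragments of the Deployment's pod spec:
      volumes:
        - name: dotenv-source
          configMap:
            name: anything-llm-dotenv
        - name: dotenv-writable
          emptyDir: {}
      initContainers:
        - name: copy-dotenv
          image: busybox:1.36
          command: ["sh", "-c", "cp /dotenv-source/.env /dotenv-writable/.env"]
          volumeMounts:
            - name: dotenv-source
              mountPath: /dotenv-source
            - name: dotenv-writable
              mountPath: /dotenv-writable
      containers:
        - name: anything-llm
          volumeMounts:
            - name: dotenv-writable
              mountPath: /app/server/.env
              subPath: .env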

timothycarambat commented 1 month ago

@reefland some commentary

In your templates the readinessProbe and livenessProbe use port 8888, and while the logs report Document processor app listening on port 8888, I'm unable to get curl to respond on that port; I have to CTRL-C to stop:

This is normal in the docker container. The 8888 port is not supposed to be exposed; however, it is your docker container, so you can expose it if you like. The default is not to, because there is really no reason to have that be the case for the collector currently.

And path doesn't matter. Seems to get same output no matter what you ask for.

In the production build/docker the root path on 3001 serves the frontend and 3001/api serves all backend endpoints for the app. Getting a response on any path is just the HTML content being returned for an unmatched path, while /api/ping indicates the backend server endpoints are running. In either case, if the backend were not online, not even the frontend would return a response.
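
Probes pointed at the backend instead of port 8888 might then look roughly like this (a sketch, not an official template):

        readinessProbe:
          httpGet:
            path: /api/ping
            port: 3001
          initialDelaySeconds: 15
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /api/ping
            port: 3001
          initialDelaySeconds: 15
          periodSeconds: 5
          failureThreshold: 3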

I'd be curious to know why you needed to set allowPrivilegeEscalation: true and grant the SYS_ADMIN capability to the container:

This is in the official image for our Docker container. You cannot run a sandboxed Chromium browser without this capability on the base image we have, unless you want to run the browser without sandboxing. Obviously, we want sandboxing, so that capability must exist. This is used for scraping websites so that SPA/JS apps can render first and we can scrape them.

As for the other comments, those seem to be use-case-specific settings or tooling, so I have nothing to add around that.

atljoseph commented 1 month ago

Yo. It’s working fine for me.


stguitar commented 1 month ago

Would it make more sense for folks to have the committed/provided sample k8s manifests work with a standard production-build docker image?

I get that it's somewhat a personal preference, but it seems that it would help to have the default k8s behavior align with the default docker setup.

Further, while these env settings might work, it would be more straightforward to just have an additional example resource of a configmap to configure those. Then that would be directly applicable as a starting point, and folks could swap to secrets instead if they so choose.

I don't say all this without the offer of a PR, but before I do that, I wanted to make sure it might be something that would be merged.

timothycarambat commented 1 month ago

@stguitar if you can produce a manifest/template for K8s that:

then I will merge it in for the cloud-deployments folder, no question!

If there are supplemental "gotchas" or key pieces a reasonable person may overlook that cannot be commented into the template, I am happy to make a page and document those details on our documentation page as well.

atljoseph commented 1 month ago

Sorry, I didn’t really intend to provide a perfect file others could exactly copy, but one which held the required ideas, and actually functions for a given cluster. Of course some things could be improved. Every cluster is different. And some use this or that tool to configure things. The key points def include container permissions, env vars, volume persistence, and … use of the “render” image. Many hours were spent here learning that lesson the hard way.

timothycarambat commented 4 weeks ago

@atljoseph Understandable, and I was on the same page as you that this was your specific endeavor and use case, and it may not be portable to others.

The PR offer still stands, though, if someone does want to contribute a general-use K8 template.

stguitar commented 4 weeks ago

Good to know guys... thank you for the replies and the effort on all this previously. I definitely didn't mean to not be thankful for what you have provided in this PR. I was curious: can either of you expand on the "render" image and what its configuration is? Maybe it opens ports that the normal image doesn't.

I guess how it felt to me, and I'm presuming the same for @reefland, was that if these files don't actually work at all (which they won't unless the "render" container tag exposes ports that the standard image does not), it seems plausible to just have text docs that list out the suggested resources one might need to deploy and point folks to standard (or even random) examples of those resources from the web. You could further call out the "special sauce" (like the shell scripting and the pod's security context stuff).

I totally agree that people have different deployment needs. Heck, I won't be using AWS. However, these manifests seem to suggest that ports 9090 and 8888 are exposed, which I don't think they are. They aren't in the standard image anyway. Maybe they are in the render image? It just felt like the interaction (best word I can think of) of the pod and the container could be a bit tighter. Not so much the rest of the "environmental selection of the day" stuff, like env vars, secrets, config maps, ingresses (which I also don't happen to need). I did mention the latter stuff, but that wasn't as big of a gap for me personally as much as the pod/container cohesion.

That being said, I could be totally wrong and the "render" container actually exposes those ports and all is well!

Depending on your thoughts, and at this point from my point of view, I am still testing out my version of the k8s manifests, and should that go well, I would be happy to share what I have in a PR for consideration.

timothycarambat commented 4 weeks ago

The only public port that should be exposed in the docker container running AnythingLLM in any template is 3001. Def don't expose 8888, and looking at the current K8 community template, 9090 seems to be use-case specific, so we can likely omit that as well.

I don't specifically use K8s, which is why I lean on the community to publish or amend that template. I truthfully have no experience in that area.

atljoseph commented 4 weeks ago

Don’t get confused by those healthchecks on port 8888. Those can simply be removed if you don’t utilize them. 9090 is metrics. Again, if you don’t have a need, then just remove them. These are all somewhat standard k8s concepts.

I haven’t been able to put my finger on why the render image works for k8s and the other doesn’t. Feels like it should just be possible to utilize the functional result that either image provides, but with just one image instead, and relying on env vars to affect configuration.

We also ended up going with a non-standard port in the end, like 3002, since 3001 is a standard node port and conflicted with another service. Works fine though.
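
For anyone doing the same, roughly these fragments change (a sketch with the alternate port; values are just examples):

# Deployment container env:
          - name: SERVER_PORT
            value: "3002"
# Service port mapping (external port unchanged, targetPort follows SERVER_PORT):
  - port: 3301
    targetPort: 3002
    name: traffic
# Any httpGet probes pointed at the server would also move to port 3002.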


reefland commented 4 weeks ago

The only public port that should be exposed in the docker container running AnythingLLM in any template is 3001. Def don't expose 8888, and looking at the current K8 community template, 9090 seems to be use-case specific, so we can likely omit that as well.

Unless you include a Prometheus exporter that listens on port 9090 as part of your image, this port and the respective annotations should be removed. It's not use-case specific if it simply does not exist.