BorisPolonsky / dify-helm

Deploy langgenius/dify, an LLM-based app, on Kubernetes with a Helm chart
MIT License
142 stars 35 forks

[QUESTION] S3 storage and serviceAccount for api/worker #83

Closed tienhim closed 2 days ago

tienhim commented 2 weeks ago

Hello BorisPolonsky,

I am deploying Dify to AWS EKS. I will use AWS ElastiCache for Redis and RDS PostgreSQL for the database.

However, EBS volumes do not support ReadWriteMany, so I am looking for another storage type for api/worker.

I could not find docs that answer my concerns, so I would like to ask two questions:

  1. Does the api/worker support S3 as storage?
  2. Does the Dify Helm chart support configuring a serviceAccount for api/worker so that they can use S3 as storage?

The only serviceAccount config I can find in the Helm chart is for Weaviate and Redis.

Hope to have your responses soon

Thank you

BorisPolonsky commented 2 weeks ago

It's defined here.

tienhim commented 1 week ago

@BorisPolonsky, thanks so much for your response! I configured external Redis here. Since I use AWS ElastiCache as the external Redis, there is no authentication configured for it, so I left the username/password blank:

externalRedis:
  enabled: true
  host: "master.*****.*****.use1.cache.amazonaws.com"
  port: 6379
  username: ""
  password: ""
  useSSL: True

Then I configured extraEnvs for the api/worker as follows:

- name: REDIS_HOST
  value: "master.*****.*****.use1.cache.amazonaws.com"
- name: REDIS_PORT
  value: "6379"
- name: REDIS_USERNAME
  value: ""
- name: REDIS_PASSWORD
  value: ""
- name: REDIS_USE_SSL
  value: "True"

However, the worker pod throws the error below. It seems the worker did not pick up the REDIS_HOST that I defined in the variable:

[2024-07-15 04:03:09,876: ERROR/MainProcess] consumer: Cannot connect to redis://:**@redis:6379/1: Error -2 connecting to redis:6379. Name or service not known..
Trying again in 24.00 seconds... (12/100)

Could you please advise?

BorisPolonsky commented 1 week ago

There's no need to apply environment variables for redis through extraEnvs if you have specified them in externalRedis. If you insist, environment variables defined in api.extraEnvs would override those configurations defined in externalRedis.
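
As a rough illustration of the point above, the values could look something like the sketch below: Redis is configured only under externalRedis, and no REDIS_* entries are repeated in api.extraEnvs. The externalRedis keys and the masked host are copied from the snippet earlier in this thread. Writing api.extraEnvs as an empty list is an assumption about how the chart's values are laid out, so check it against the chart's values.yaml.

```yaml
# Sketch only: rely on externalRedis and do not duplicate REDIS_* in extraEnvs.
externalRedis:
  enabled: true
  host: "master.*****.*****.use1.cache.amazonaws.com"  # masked, as in the thread
  port: 6379
  username: ""   # left blank: no auth configured on this ElastiCache instance
  password: ""
  useSSL: true

api:
  # Assumption: extraEnvs is a list of name/value pairs.
  # Any REDIS_* entry placed here would override the externalRedis settings.
  extraEnvs: []
```

Keeping the Redis settings in one place avoids the two sources drifting apart, since anything in api.extraEnvs takes precedence over externalRedis.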

The message "Name or service not known" suggests that you have configured a domain that your cluster cannot resolve. If the domain is not resolvable from other environments either, you may need to investigate the cause on the AWS side, e.g. https://devops.stackexchange.com/questions/13591/aws-elasticache-redis-dns-error-name-or-service-not-known

tienhim commented 4 days ago

@BorisPolonsky thanks for that. The problem was that I had configured the wrong URL for Celery.

One last question (hopefully): does the Dify Helm chart support configuring a serviceAccount for api/worker? (I asked this in the first comment; you might have missed it.)

I found no serviceAccount config in the templates. Without a serviceAccount, I have to use a secret key/access key for S3 access, which is not best practice.

BorisPolonsky commented 2 days ago

No. This feature is available on AWS EKS only, and we do not implement features that are limited to specific distributions of Kubernetes.