Open dcmcand opened 8 months ago
We have two options to achieve this:

Option 1: the built-in Horizontal Pod Autoscaler driven by an external metric.
Ref: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
The sequence of events: the queue depth gets published as an external metric, and the HPA scales the worker Deployment on it. The metric block would look something like:
- type: External
  external:
    metric:
      name: queue_messages_ready
      selector:
        matchLabels:
          queue: "worker_tasks"
    target:
      type: AverageValue ## This needs to change accordingly.
      averageValue: 0
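For reference, here is a minimal sketch of how that metric block could sit inside a full HorizontalPodAutoscaler manifest. It assumes an external metrics adapter (e.g. prometheus-adapter) already exposes queue_messages_ready, and that the worker Deployment is nebari-conda-store-worker as used later in this thread; the HPA name, namespace, and replica bounds are placeholders:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: conda-store-worker-hpa   # placeholder name
  namespace: dev
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nebari-conda-store-worker
  minReplicas: 1
  maxReplicas: 5                  # placeholder upper bound
  metrics:
    - type: External
      external:
        metric:
          name: queue_messages_ready
          selector:
            matchLabels:
              queue: "worker_tasks"
        target:
          type: AverageValue
          averageValue: "1"       # e.g. one ready message per worker; needs tuning
```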
Option 2: KEDA (Kubernetes Event-driven Autoscaling).
Refs:
https://blogs.halodoc.io/autoscaling-k8s-deployments-with-external-metrics/
https://keda.sh/docs/2.13/scalers/
https://keda.sh/docs/2.13/concepts/external-scalers/
https://keda.sh/docs/2.13/scalers/rabbitmq-queue/
https://keda.sh/docs/2.13/scalers/redis-cluster-lists/
https://keda.sh/docs/2.13/scalers/redis-lists/
https://keda.sh/docs/2.13/scalers/postgresql/
The PostgreSQL scaler lets us run an arbitrary query against a database, which means we can simply point it at the existing conda-store database to get the depth of the pending-build queue.
Regardless of the option we take, this can be moved upstream to conda-store.
We should agree on these before we start. Please suggest. Thanks.
@pt247 conda-store already has a queue; it uses Redis and Celery. I expect we can pull the queue depth from that, so we shouldn't need to deploy extra infrastructure there. The nebari-conda-store-redis-master StatefulSet is what you are looking for.
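As a quick sanity check, the queue depth is visible directly in that Redis instance. A sketch, assuming the master pod follows the usual StatefulSet naming and that conda-store's Celery broker uses the default queue name celery:

```sh
# Inspect the length of the Celery task queue in the conda-store Redis broker.
# The pod name, the REDIS_PASSWORD variable, and the queue name "celery" are assumptions.
kubectl exec -n dev -it nebari-conda-store-redis-master-0 -- \
  sh -c 'redis-cli -a "$REDIS_PASSWORD" LLEN celery'
```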
I am unfamiliar with KEDA, but it does look promising and has a Redis scaler too. In general I prefer built-in solutions by default, so the Horizontal Pod Autoscaler was my first thought, but if KEDA gives better results with less complexity then I can see going with that. KEDA is a CNCF project that seems to be actively maintained, which is good.
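For comparison, a KEDA redis-lists trigger against that same queue would look roughly like this; the address, list name, and threshold are assumptions, and authentication is omitted for brevity:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker-redis   # hypothetical name
  namespace: dev
spec:
  scaleTargetRef:
    name: nebari-conda-store-worker
  triggers:
    - type: redis
      metadata:
        address: nebari-conda-store-redis-master.dev.svc.cluster.local:6379
        listName: celery        # Celery's default queue name; may differ in conda-store's config
        listLength: "1"         # target one queued task per worker replica
```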
As to whether this solution belongs in conda-store, I will simply say: it does not. conda-store allows for horizontal scaling by having a queue with a worker pool, and that is where conda-store's responsibility ends. Building Nebari-specific scaling details into conda-store would cross software boundaries and greatly increase coupling between the projects, which would be moving in the wrong direction. We want to decrease coupling between conda-store and Nebari. conda-store has a method for scaling horizontally; it is on Nebari to implement autoscaling that fits its particular environment.
I bet the conda-store devs would have comments on this, and it would be implemented in conda-store. It seems like this issue should be transferred to the conda-store repo to improve visibility with the conda-store devs.
> We want to decrease coupling between conda-store and Nebari. conda-store has a method for scaling horizontally, it is on Nebari to implement autoscaling that fits its particular environment.
I also agree that conda-store already has a sound scaling system; however, we are not using it in our own deployment. Running multiple Celery workers is already supported (both Redis and Celery handle task load balancing by themselves); what we need to discuss is how to handle worker scaling on our Kubernetes infrastructure.
Today that is a manual process of creating more workers, and we need a way to automate it. I initially suggested using the queue depth in Redis to drive this, which would trigger a CRD to change the number of replicas the worker deployment should have.
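For concreteness, the manual process today boils down to something like the following (assuming the worker Deployment name and namespace used in the manifests later in this thread):

```sh
# Manually bump the number of conda-store workers; this is the step we want to automate.
kubectl scale deployment nebari-conda-store-worker --replicas=3 -n dev
```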
Either KEDA or the Horizontal Pod Autoscaler would work here, and both can scale automatically on queue depth. KEDA seems a bit more elegant in its implementation, so I would suggest starting with it to see if it works and, if for some reason it doesn't, falling back to the Horizontal Pod Autoscaler.
helm repo add kedacore https://kedacore.github.io/charts
helm install keda kedacore/keda --namespace dev
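A quick way to confirm the operator and its CRDs are in place before applying any ScaledObject:

```sh
# KEDA should register its CRDs and run an operator plus a metrics API server.
kubectl get crd scaledobjects.keda.sh triggerauthentications.keda.sh
kubectl get pods -n dev | grep keda
```

The first attempt at a ScaledObject for the conda-store worker: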
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind: Deployment # Optional. Default: Deployment
    name: nebari-conda-store-worker # Mandatory. Must be in the same namespace as the ScaledObject
  triggers:
    - type: postgresql
      metadata:
        query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
        targetQueryValue: "0"
        activationTargetQueryValue: "1"
        host: "nebari-conda-store-postgresql"
        userName: "postgres"
        password: "{nebari-conda-store-postgresql}"
        port: "5432"
        dbName: "conda-store"
        sslmode: disable
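The query itself can be tested independently of KEDA by running it against the conda-store database. A sketch, where the PostgreSQL pod name and the secret lookup assume the chart defaults (the secret name and key match the ones used in the working manifest further down):

```sh
# Count builds that are queued or in progress, i.e. the queue depth the scaler acts on.
PGPASSWORD=$(kubectl get secret nebari-conda-store-postgresql -n dev \
  -o jsonpath='{.data.postgresql-password}' | base64 -d)
kubectl exec -n dev -it nebari-conda-store-postgresql-0 -- \
  env PGPASSWORD="$PGPASSWORD" psql -U postgres -d conda-store \
  -c "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
```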
I have also tried this:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind: Deployment # Optional. Default: Deployment
    name: nebari-conda-store-worker # Mandatory. Must be in the same namespace as the ScaledObject
  triggers:
    - type: postgresql
      metadata:
        query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
        targetQueryValue: "0"
        activationTargetQueryValue: "1"
        host: "nebari-conda-store-postgresql.dev.svc.cluster.local"
        passwordFromEnv: PG_PASSWORD
        userName: "postgres"
        port: "5432"
        dbName: "conda-store"
        sslmode: disable
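One thing to keep in mind with this variant: KEDA resolves passwordFromEnv against the environment of the scale target's container, so PG_PASSWORD has to be defined on the nebari-conda-store-worker Deployment itself. A sketch of the env entry that would be needed, where the secret name and key are taken from the working manifest further down:

```yaml
# Excerpt from the worker container spec, not a complete Deployment.
env:
  - name: PG_PASSWORD
    valueFrom:
      secretKeyRef:
        name: nebari-conda-store-postgresql
        key: postgresql-password
```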
I am getting the following error:
2024-04-05T18:44:42Z ERROR Reconciler error {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"scaled-conda-worker","namespace":"dev"}, "namespace": "dev", "name": "scaled-conda-worker", "reconcileID": "17f8e76e-7f9d-4e9e-90e4-77dde8a455d4", "error": "error establishing postgreSQL connection: failed to connect to `host=nebari-conda-store-postgresql.dev.svc.cluster.local user=postgres database=conda-store`: server error (FATAL: password authentication failed for user \"postgres\" (SQLSTATE 28P01))"}
Uhm, this is strange behavior; I think something might be missing... I will try to reproduce this on my side as well.
I have also tried TriggerAuthentication:
apiVersion: v1
kind: Secret
metadata:
  name: conda-pg-credentials
  namespace: dev
type: Opaque
data:
  PG_PASSWORD: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-conda-secret
  namespace: dev
spec:
  secretTargetRef:
    - parameter: password
      name: conda-pg-credentials
      key: PG_PASSWORD
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind: Deployment # Optional. Default: Deployment
    name: nebari-conda-store-worker # Mandatory. Must be in the same namespace as the ScaledObject
  triggers:
    - type: postgresql
      metadata:
        query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
        targetQueryValue: "0"
        activationTargetQueryValue: "1"
        host: "nebari-conda-store-postgresql"
        userName: "postgres"
        port: "5432"
        dbName: "conda-store"
        sslmode: disable
      authenticationRef:
        name: keda-trigger-auth-conda-secret
This worked. It turns out that the Secret values under data: need to be base64-encoded.
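For reference, kubectl create secret base64-encodes literal values automatically, so an equivalent to the Secret in the previous attempt would be:

```sh
# Creates the same secret as above, with the base64 encoding handled by kubectl.
kubectl create secret generic conda-pg-credentials -n dev \
  --from-literal=PG_PASSWORD='<postgres password>'
```

The manifest that ended up working skips the extra Secret entirely and points the TriggerAuthentication at the existing nebari-conda-store-postgresql secret: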
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: trigger-auth-postgres
  namespace: dev
spec:
  secretTargetRef:
    - parameter: password
      name: nebari-conda-store-postgresql
      key: postgresql-password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind: Deployment
    name: nebari-conda-store-worker
  triggers:
    - type: postgresql
      metadata:
        query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
        targetQueryValue: "1"
        host: "nebari-conda-store-postgresql"
        userName: "postgres"
        port: "5432"
        dbName: "conda-store"
        sslmode: disable
      authenticationRef:
        name: trigger-auth-postgres
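Once applied, KEDA creates and manages an HPA behind the scenes (named keda-hpa-<scaledobject-name>), which makes it easy to watch the scaling decisions:

```sh
# The ScaledObject should report READY=True; the generated HPA shows current/desired replicas.
kubectl get scaledobject scaled-conda-worker -n dev
kubectl get hpa keda-hpa-scaled-conda-worker -n dev
kubectl get deployment nebari-conda-store-worker -n dev --watch
```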
We try creating 5 conda environments; to the fifth environment we add scikit-learn.
- Time: 5 minutes 11 seconds; number of conda-store workers: 1
- Time: 4 minutes 29 seconds; number of conda-store workers scaled to: 2
- Time: 2 minutes 35 seconds; number of conda-store workers scaled to: 2
- Time: 4 minutes 14 seconds
With the following ScaledObject settings:
  minReplicaCount: 1 # Default: 0
  pollingInterval: 5 # Default: 30 seconds
  cooldownPeriod: 60
Time taken: 3 minutes 40 seconds
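For context, those knobs sit at the top level of the ScaledObject spec, next to scaleTargetRef and triggers. A sketch of the tuned manifest, where maxReplicaCount is an assumption and the trigger block is unchanged from the working manifest above:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    name: nebari-conda-store-worker
  minReplicaCount: 1   # keep one worker alive; default is 0
  maxReplicaCount: 5   # assumption; not part of the test above
  pollingInterval: 5   # run the query every 5 seconds; default is 30
  cooldownPeriod: 60   # scale back down after 60 seconds idle; default is 300
  triggers:
    - type: postgresql
      metadata:
        query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
        targetQueryValue: "1"
        host: "nebari-conda-store-postgresql"
        userName: "postgres"
        port: "5432"
        dbName: "conda-store"
        sslmode: disable
      authenticationRef:
        name: trigger-auth-postgres
```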
We need at least one conda-store worker to be alive to start a Jupyter notebook, since the notebook depends on the NFS share, so we cannot scale the conda-store workers to zero. That means there is little cost benefit to making this change.
Additionally, as observed in the PR comment, we cannot scale the conda-store workers beyond the general node. I have closed the PR, and this ticket can stay in the backlog until we figure out a better way of scaling conda-store-worker beyond the general node.
Feature description
Currently conda-store is set to allow 4 simultaneous builds. This becomes a bottleneck once multiple environments start getting built at the same time and presents a scaling challenge. If we set simultaneous builds to 1 and autoscale the workers based on queue depth, we should be able to handle scaling far more gracefully.
Value and/or benefit
Having the conda-store workers autoscale based on queue depth will allow larger orgs to take advantage of Nebari without hitting scale bottlenecks.
Anything else?
https://learnk8s.io/scaling-celery-rabbitmq-kubernetes