-
- Set up EKS on AWS. Later we may set up GKE on GCP and AKS on Azure.
- Document setup steps for each cloud provider in the README.md.
- Add necessary Kubernetes credentials (e.g., KUBECONFIG_DATA f…
-
### What Happened?
Good example of Pods running on different nodes. However, I think the deployment .yaml file needs more explanation about affinity and the fact that the Target port = http sha…
-
Hi,
Let's say I have a Slurm cluster of 100 nodes, each with 100 cores, and I have 10000 tasks to run.
This is my current code:
```
dist_executor = SlurmPipelineExecutor(
…
-
rdzv fails to work with a SkyPilot multi-node cluster (probably on k8s).
https://github.com/stas00/ml-engineering/blob/master/network/benchmarks/all_reduce_bench.py
_Version & Commit info:_…
-
### Related problem
_No response_
### Suggested solution
Strimzi supports multi-version, single-step upgrades for ZooKeeper-based clusters; however, downgrades must be handled step by step, ensuring th…
-
As an MKS administrator
I want to spawn a Kubernetes cluster distributed across multiple low-latency availability zones
So that I can spread worker nodes across regions and benefit from an even better HA of…
-
I am currently experiencing an issue with SCTP connectivity between two Kubernetes clusters while utilizing Submariner for multi-cluster communication.
**Environment Details:**
**Clusters:** Two s…
-
Clusters with a single tenant use the `system tenant` for both system metadata and user data. TenantOne is a special tenant that acts as a `system tenant` and uses no tenant prefix for the keys. New c…
-
**What happened**:
I set up MultiKueue as per the [documentation](https://kueue.sigs.k8s.io/docs/tasks/manage/setup_multikueue/).
I removed `"jobset.x-k8s.io/jobset"` from the list of integra…
-
### Describe the bug
We have clusters set up in multiple regions, but the dev and prod clusters are in different regions.
Application deployments should be the same in region 1 dev cluster…