spot-instances Search Results

1000+ results
for spot-instances


aws/karpenter-provider-aws #7428

karpenter cluster_state.synced metrics drops to 0 and stop …

### Description **Observed Behavior**: The karpenter.cluster_state.synced metric drops to 0 for ~30 minutes and no new nodes can come up, leaving many pods pending. Worse, we are undergoing sp…

Yufeireal updated 2 days ago
2
mozilla/telemetry-analysis-service #229

Add other types of spot instances

@vitillo: Analysis jobs have different requirements in terms of hardware resources; some might benefit from more cores while others might benefit from more memory. Our users would like to select the i…

jezdez updated 7 years ago
2
orcasound/aifororcas-livesystem #204

Optimize inference system AKS cluster

Inference system currently runs on AKS cluster with 3 Standard B4ms (4 vcpus, 16 GiB memory) VMs. Optimize the usage: 1. Adjust the pod resource requests if not used 2. Adjust the VM SKU to match …

micya updated 1 month ago
1
coiled/benchmarks #1241

Restart benchmarks if spot instances get replaced

Currently, we start benchmark clusters with the `spot_with_fallback` policy. While this makes sense from a cost perspective, spot replacement will mess up the results. When running benchmarks, we shou…

hendrikmakait updated 11 months ago
6
doitintl/gcpinstances.info #12

Add support for new "Spot" instances

GCP recently announced an evolution of preemptible VMs. They are now called Spot VMs. More info here: https://cloud.google.com/compute/docs/instances/spot

bo-qeye updated 1 year ago
7
clusterinthecloud/support #31

Add support for AWS Spot instances

We should support running spot instances on AWS. Things to consider: - should it be per-instance or global per-cluster? - how do we get notified of an impending termination? - how should we hand…

milliams updated 3 years ago
1
ethersphere/swarm #1348

spot instances are getting very unstable

We should probably review our k8s setup with respect to spot instances, or instance types. Over the last few weeks, the cluster has been very unstable, with machines starting/stopping as soon as I start cre…

nonsense updated 5 years ago
3
bank-vaults/vault-secrets-webhook #254

Intermittent Issue with Pod Mutation Using Vault Secrets Web…

Hello Vault Secrets Webhook Team, I am currently using the Vault Secrets Webhook Helm chart version 1.19.0 for secret injection into pods. My setup, including the values.yaml, works well most of th…

aleksandrovpa updated 1 month ago
14
pytorch/torchtitan #561

Fail-safe and partial redundancy for HSDP on unreliable comp…

I'd like to propose a feature for implementing fail-safe mechanisms and partial redundancy in FSDP2 (possibly not FSDP already, more like HSDP) to allow for more robust training on unreliable compute …

evkogs updated 3 days ago
7
kubernetes-sigs/karpenter #1762

Karpenter will occasionally provision nodes that are way to…

### Description *Note that I am cross-posting this from https://github.com/aws/karpenter-provider-aws/issues/7254 as the more I look into the issue, the more it seems to be related to core Karpenter …

fullykubed updated 1 month ago
3
