-
The current approach to multi-slice training is not robust or reliable on spot instances.
There has been some discussion inside CRFM and with the GCP team on this topic. I am creating this issue to capture the …
-
For Amazon EC2 instances, I like to record whether the instance lifecycle is `normal` or `spot` so that metrics can be differentiated between the two. Spot instances tend to be shorter-lived.
I'm cu…
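
A minimal sketch of the `normal` versus `spot` distinction described above, assuming the instance is available as a dict shaped like one entry of an EC2 `DescribeInstances` response. There, the `InstanceLifecycle` field is omitted for On-Demand instances. The function name `instance_lifecycle` is hypothetical, and the `normal` label is taken from the issue text rather than from the EC2 API.

```python
from typing import Any, Mapping


def instance_lifecycle(instance: Mapping[str, Any]) -> str:
    """Return "spot" or "normal" for one EC2 instance description.

    Assumption: `instance` follows the DescribeInstances response shape,
    where InstanceLifecycle is absent for On-Demand instances.
    """
    lifecycle = instance.get("InstanceLifecycle")
    # Any value other than "spot" (including a missing field) is
    # reported as "normal" in this sketch.
    return "spot" if lifecycle == "spot" else "normal"
```

The returned label could then be attached as a tag or dimension on each metric, so the two populations can be compared separately.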
-
### What’s the bug you are facing?
The problem we are facing is described in this issue: #2589
Right now it's not clear that doing something like this:
```js
new Editor({
extensions: [Chara…
```
-
### Preflight Checklist
- [X] I agree to follow the [Code of Conduct](https://github.com/deckhouse/deckhouse/blob/main/CODE_OF_CONDUCT.md) that this project adheres to.
- [X] I have searched the [iss…
-
**Is your feature request related to a problem? Please describe.**
We would like to have CloudWatch metrics about our Spot instances, such as `FulfilledCapacity` and `TargetCapacity`, so we can automatica…
-
### What happened + What you expected to happen
With config `region: us-east-1`, the last AZ (`us-east-1f`) is always selected for the AWS spot request.
When I list all AZs manually (`availability_…
-
Can the core array API ops of cubed be implemented in jax, so that everything easily compiles to accelerators? Could this solve the common pain point of running out of GPU memory? How would other constra…
-
```
gcloud compute instances create arm --image-project debian-cloud --image-family debian-11-arm64 --machine-type t2a-standard-1 --zone europe-west4-a --provisioning-model spot --accelerator type=n…
```
-
### Community Note
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the…
-
Currently we're leaving the API error raising to Faraday. That is not ideal since it's not possible to spot which Rubygpt call causes the issue.
Rubygpt must have its own error handling.
### Im…