-
Hello,
I am using DeepSpeed 0.14.4 to convert checkpoints for a model. Changing the data-parallel and pipeline-parallel sizes works: the checkpoint converts, loads, and resumes training correctly. However, when I …
-
# Bug reports
We are using go-prompt, and once in a while when we enter the CLI prompt we see a "divide by zero" panic:
panic: runtime error: integer divide by zero
goroutine 1 [running]:
…
-
**Describe the bug**
`auto device = sycl::make_device((_ze_device_handle_t*)hDevice);`
fails at runtime.
```
terminate called after throwing an instance of 'cl::sycl::runtime_error'
what(): Native…
-
`linalg::Svd()` outputs NaN when the input tensor contains only zeros. This issue only happens on GPU and doesn't happen when the data type is float. This bug is the culprit of the broken `Svd.gpu_U1_…
-
First of all, thank you for providing this extension. Assuming it is still maintained, I'd like to propose integrating a way to set configuration options for phpcs.
As discussed in https://github.com/sq…
-
### Steps to reproduce
[Link to live example](https://github.com/o-alexandrov/material-ui-pigment-css-vite-ts)
Steps:
1. Clone the repo (it's exactly the same as the vite [example from this rep…
-
## 🐛 Bug
In the ONNXRuntime implementation, during the processing of the timestamp outputs, an array-index-out-of-bounds exception will crash the program with a segmentation fault.
### To Reproduc…
-
Hello everyone, while studying the DeepSpeed source code I want to understand the stage 3 part of ZeRO-Offload, but I can't find the code that schedules gradient transfers between CPU and GPU. Please help…
-
### Bug description
Hello,
After upgrading to Lightning 2.2, and only with DeepSpeed stage 3, we experience a crash: `backward pass is invalid for module in evaluation mode`. It is most likely caused by the r…
-
### Describe the issue
I have a sample ONNX file with a QLinearConv block (attached). When running it with a specific input using onnxruntime, the inference output is different from what …