-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is this question already answered in the FAQ?
-
- TensorFlow version (you are using): 2.0.0b0
- Are you willing to contribute it (Yes/No): No
**Describe the feature and the current behavior/state.**
Currently, sparse tensors don't seem to be s…
-
**Describe the bug**
Unable to use/test fp6 quantization in DeepSpeed 0.14 in inference mode on a GPT2 model. There is little documentation on usage right now, so I am not sure whether I have the wrong init metho…
-
It makes things difficult when we want to handle tensor parallelism.
-
See https://docs.xarray.dev/en/stable/
If I understand correctly, an `xarray` object is made up of the actual `data` array (np.ndarray), and ~~1-D~~ `coordinates` arrays (dictionaries?) that map `d…
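The layout described above can be mimicked with a toy stand-in, which is only a hedged sketch of the issue's description (an n-D `data` array plus 1-D coordinate arrays keyed by dimension name), not xarray's actual internal representation; the variable names and sample values are illustrative assumptions:

``` python
import numpy as np

# The actual values, analogous to an xarray object's `data`.
data = np.arange(6).reshape(2, 3)

# Dict mapping each dimension name to a 1-D array of coordinate labels,
# analogous to xarray's `coords`.
coords = {
    "x": np.array([10, 20]),
    "y": np.array(["a", "b", "c"]),
}

# Label-based lookup: find the value at x == 20, y == "b",
# loosely analogous to da.sel(x=20, y="b") in xarray.
i = int(np.where(coords["x"] == 20)[0][0])
j = int(np.where(coords["y"] == "b")[0][0])
value = data[i, j]
```

The point of the sketch is only that positional indexing on `data` is recovered by first resolving labels through the per-dimension coordinate arrays.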
-
I am referring to this [Issue](https://github.com/NVIDIA/TensorRT-LLM/issues/394) and want to use my own dataset to obtain the SmoothQuant (SQ) scale values.
The scenario is a 70B model with tp=2; the length of input_ids is not …
-
### 🐛 Describe the bug
I have a setup where I am manually sharding weights for 2D parallelism and then constructing this as a DTensor using DTensor.from_local. Everything seems to work fine, except i…
-
### 🐛 Describe the bug
Code to reproduce
``` python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
path = "gpt2"  # any LM would produce the same result
tokenizer = AutoTok…
-
Thanks for sharing the awesome repo.
I've been utilizing Accelerate for training LLMs. My current setup involves using Deepspeed Zero-3 for training a 70B parameter LLaMA-2 model, with a sequence l…
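A ZeRO-3 setup like the one described is typically driven by a DeepSpeed JSON config. A minimal hedged sketch follows — the key names come from DeepSpeed's documented `zero_optimization` schema, but the specific values are illustrative assumptions, not the poster's actual configuration:

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 8,
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "offload_param": { "device": "cpu", "pin_memory": true },
    "offload_optimizer": { "device": "cpu", "pin_memory": true }
  }
}
```

With Accelerate, a config of this shape is usually passed via `accelerate config` (DeepSpeed plugin) or a `--deepspeed_config_file` path; CPU offload trades step time for the memory headroom a 70B model at long sequence lengths tends to require.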
-
### System Info
```Shell
python 3.8
pytorch 1.12
openmpi 4.1.0
cuda 11.3
cudnn8
ubuntu 20.04
accelerate==0.14.0
transformers==4.24.0
bitsandbytes==0.35.4
1 node with 4xT4 GPUs
```
…