-
Does the codebase support 8-bit training, similar to the peft library?
I was trying to fine-tune llama2-7b on 24 GB 4090 cards. Below is the error I got:
File "/home/nlp/JORA/examples/train.py", li…
yctam updated
6 months ago
-
**Describe the bug**
ttnn.argmax fails with the error "Buffer size and page size should be larger than 0 bytes!" for a test case of shape [1, 50265].
**To Reproduce**
Steps to reproduce the behavio…
-
### Proposed new feature or change:
Request support for integrating user-defined DTypes with `numpy.finfo`; it currently raises a ValueError:
```
In [2]: np.finfo(QuadPrecDType)
-------------------…
-
I'm a beginner and ran into this problem when running the code, but I don't know where the issue might be.
Traceback (most recent call last):
File "run_gaussian_shading.py", line 148, in <module>
main(args)
File "run_gaussian_shading.py", line 21, in main
pipe = Inversabl…
-
### System Info
```
bitsandbytes==0.43.1
sentencepiece==0.1.97
huggingface_hub==0.23.2
accelerate==0.30.1
tokenizers==0.19.1
transformers==4.41.1
trl==0.8.6
peft==0.11.1
datasets==2.14.6
…
-
From [doc](https://docs.vespa.ai/en/reference/document-json-format.html#tensor)
> Short form for indexed tensors representing binary data (with int8 cell value type): May use a string with a hex d…
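
To make the quoted short form concrete, here is a minimal sketch of turning int8 cell values into a hex string and back. The quote above is truncated, so the encoding used here (two hex digits per cell, two's complement for negative values) is an assumption rather than something the excerpt confirms, and the helper names `int8_cells_to_hex` and `hex_to_int8_cells` are hypothetical.

```python
def int8_cells_to_hex(cells):
    """Encode int8 cell values as a hex string (assumed: two hex digits per cell)."""
    for v in cells:
        if not -128 <= v <= 127:
            raise ValueError(f"{v} is outside the int8 range")
    # Mask to 0..255 so negative values become their two's-complement byte.
    return bytes(v & 0xFF for v in cells).hex()


def hex_to_int8_cells(text):
    """Decode a hex string back into signed int8 cell values."""
    raw = bytes.fromhex(text)
    # Bytes above 127 represent negative int8 values.
    return [b - 256 if b > 127 else b for b in raw]
```

A round trip such as `hex_to_int8_cells(int8_cells_to_hex([1, -1, 127, -128]))` should give back the original list; check the linked reference for the exact form the document API accepts.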
-
Save Completed!
Save Completed!
Save Completed!
Start parsing settings...
Start parsing settings...
-
**Describe the bug**
ttnn.max_pool2d supports only height_sharded input tensors. Support is needed for block_sharded and width_sharded inputs.
**To Reproduce**
Steps to reproduce the behavior:
1. Chec…
-
**Please describe the bug**
When creating a toy model using ShardParallel/Zero2/PipeshardParallel and bfloat16, the first step works, but subsequent steps crash citing an error in the arguments to nc…
-
Traceback (most recent call last):
File "/ossfs/workspace/sft/sft_all.py", line 161, in <module>
train()
File "/ossfs/workspace/sft/sft_all.py", line 125, in train
Traceback (most recent call last…