-
## ❓ Questions and Help
#### What is your question?
Hi, I am getting "OOM during optimization, irrecoverable" when trying to fine-tune the 3.3B parameter NLLB model.
##### Stack trace:
```
…
```
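"OOM during optimization, irrecoverable" appears to be the message fairseq's trainer logs when it cannot recover from a CUDA OOM during the optimizer step. Whatever the training stack, the usual levers are a smaller per-GPU batch with gradient accumulation, mixed precision, and activation checkpointing. A minimal sketch with 🤗 Transformers (the model id is the public NLLB checkpoint; every hyperparameter below is an assumption, not the reporter's configuration):

```python
# Minimal memory-saving sketch for fine-tuning NLLB-200 3.3B.
# All numbers are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-3.3B")
model.gradient_checkpointing_enable()   # trade recompute for activation memory

args = Seq2SeqTrainingArguments(
    output_dir="nllb-ft",
    per_device_train_batch_size=1,      # keep the per-GPU batch tiny ...
    gradient_accumulation_steps=16,     # ... and accumulate to an effective batch of 16
    fp16=True,                          # mixed precision halves activation memory
    optim="adafactor",                  # Adafactor keeps far less state than Adam
)
```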
-
In this issue, [comment with `@njzjz-bot `](https://github.com/njzjz/wenxian/issues/23#issuecomment-2121998880):
```
@njzjz-bot 2312.15492
```
[The GitHub Actions workflow will reply](https://github.co…
-
### Question
Great work!
1. According to the paper, the batch size is set to 1152. How many GPUs were used during training?
2. Is the training full fine-tuning or efficient para…
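For context on question 1: the effective batch size factors as GPUs × per-GPU batch × gradient-accumulation steps, so 1152 alone does not pin down a GPU count. The decompositions below are purely illustrative, not claims from the paper:

```python
# Illustrative factorizations of an effective batch of 1152; none of
# these splits are stated in the paper.
effective = 1152
for n_gpus, per_gpu, accum in [(8, 18, 8), (16, 9, 8), (32, 36, 1)]:
    assert n_gpus * per_gpu * accum == effective
    print(f"{n_gpus} GPUs x {per_gpu} per GPU x {accum} accumulation steps")
```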
-
Getting the following error when running the mosaic-bert recipe. It only occurs with bf16; it works with fp32.
```
Traceback (most recent call last):
  File "<string>", line 21, in _bwd_kernel
KeyError: ('2-.-0-.-0-d82…
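If bf16 autocast is what trips the Triton kernel cache, one workaround to try is keeping most of the model in bf16 but forcing fp32 around the failing region. This is a sketch under assumptions: `encoder`, `attn_block`, and the split point are hypothetical stand-ins, not the recipe's actual modules.

```python
import torch

# Stand-ins for the recipe's modules; names are hypothetical.
encoder = torch.nn.Linear(16, 16).cuda()
attn_block = torch.nn.Linear(16, 16).cuda()
batch = torch.randn(4, 16, device="cuda")

# Run most of the forward pass under bf16 autocast, but disable autocast
# around the block whose Triton backward kernel raises the KeyError, so
# that region computes in fp32 (the configuration reported to work).
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    hidden = encoder(batch)
    with torch.autocast(device_type="cuda", enabled=False):
        out = attn_block(hidden.float())
out.sum().backward()
```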
-
Hi @keyvank,
Let's say I want to add a new decoder layer (the one that gets constructed in the 0..num_layers loop) at run time, after the gpt::new() call. How do I go about it? As I understand y…
-
https://virtual2023.aclweb.org/paper_P5680.html
-
kvstore_device allocates a buffer in memory for each GPU; see https://github.com/dmlc/mxnet/blob/master/src/kvstore/kvstore_device.h#L87
It is problematic for big networks such as VGG/AlexNet, and there ar…
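If those per-GPU buffers are the bottleneck, one mitigation (a sketch only, trading aggregation speed for GPU memory) is to aggregate gradients on the CPU by choosing the `local` kvstore instead:

```python
import mxnet as mx

# 'device' aggregates gradients in extra per-GPU buffers (fast, memory-hungry);
# 'local' aggregates on the CPU and avoids allocating those buffers, which can
# matter for large nets such as VGG/AlexNet.
kv = mx.kvstore.create("local")   # instead of mx.kvstore.create("device")
```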
-
This is a long post, and I apologize in advance. I also want to make clear that none of this should be read as a criticism of BERTopic or of the choices about where the product has focused its attention. …
-
## Resources
- The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch. [[github]](https://github.com/ritchieng/the-incredible-pytorch)
-
I found that with 2d5-7b, the checkpoint saved from LoRA tuning with finetune.py on one GPU is correct, while with multiple GPUs the saved model is incorrect.
Has anyone met a similar problem?
For example…
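A common culprit in this situation (an assumption here, since the report is truncated) is that under data parallelism every rank writes the checkpoint at once, or the DDP-wrapped module is saved instead of the underlying model. A minimal rank-0-only saving sketch with 🤗 Accelerate; the model below is a stand-in, not the 2d5-7b setup:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()
model = torch.nn.Linear(8, 8)        # stand-in for the LoRA-wrapped model
model = accelerator.prepare(model)

# ... training loop ...

accelerator.wait_for_everyone()               # let every rank finish its last step
unwrapped = accelerator.unwrap_model(model)   # strip the DDP wrapper before saving
if accelerator.is_main_process:               # only rank 0 touches the filesystem
    torch.save(unwrapped.state_dict(), "lora-checkpoint.pt")
    # a PEFT model would call unwrapped.save_pretrained(...) here instead
```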