-
I'm sure this has been asked before, but is a desktop or web app coming for Tasks.org, and if so, when?
-
Summary:
@pcodes: Given how many components there are to this paper, it would have been nice to see a more comprehensive example that demonstrated all functionality.
@samfrey: Even though the paper …
-
Fix problems:
- [x] a non-UTF-8 message terminates the api-server app
- [x] api-server returns a package with an empty auth-token and a zero id when invalid JSON is sent to it
- [ ] request to executable …
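The two fixed items above amount to input validation at the request boundary. A minimal sketch of that validation in Python (the function name, return shape, and status codes are illustrative assumptions, not the project's actual API):

```python
import json


def parse_request(body: bytes):
    """Reject non-UTF-8 bodies and malformed JSON with an explicit error,
    instead of crashing or returning a default package (empty auth-token,
    zero id). Returns (payload_or_error, status_code)."""
    # Guard against non-UTF-8 input, which previously terminated the app.
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        return {"error": "body is not valid UTF-8"}, 400
    # Guard against invalid JSON, which previously produced an empty
    # auth-token and a zero id.
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return {"error": "body is not valid JSON"}, 400
    return payload, 200
```

Validating before any handler runs means the handler only ever sees well-formed payloads, so no downstream code needs a "partially parsed" fallback.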
-
I trained a Llama2-3B model using OpenRLHF and it trained fine. But when I shifted to the 7B version of the model, I had to shift to multiple nodes and encountered this error. After contacting the sup…
-
# Resource List:
## Cloud Developer:
### Training:
BALLARD:
GABRIEL:
JOSH:
[Google Partner Skills boost](https://partner.cloudskillsboost.google/)
[Awesome list](https://github.com/…
-
### System Info
```Shell
accelerate 0.20.3
python 3.10
numpy 1.24.3
torch 2.0.1
accelerate config:
compute_environment: LOCAL_MACHINE
deepspeed_config:
deepspeed_multinode_launcher: stand…
-
## 🐛 Bug
Undesired interaction between DeepSpeed and XLA
## To Reproduce
Steps to reproduce the behavior:
1. `pip install torch lightning`
1. `pip install https://storage.googleapis.com/t…
-
Hello everyone, I'm encountering a memory issue while fine-tuning a 7b model (such as Mistral) using a repository I found. Despite having 6 H100 GPUs at my disposal, I run into out-of-memory errors wh…
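A common first mitigation for out-of-memory errors when fine-tuning a 7B model across multiple GPUs is DeepSpeed ZeRO stage 3 with CPU offload of optimizer state and parameters. A minimal sketch of such a config (the numeric values are illustrative assumptions, not taken from the repository in question):

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 8,
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu" },
    "offload_param": { "device": "cpu" },
    "overlap_comm": true
  }
}
```

Stage 3 partitions parameters, gradients, and optimizer state across the 6 GPUs rather than replicating them, which is usually what makes a 7B full fine-tune fit; gradient checkpointing on the model side helps further.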
-
Hello, while fine-tuning the 52B model I hit the following error, specifically when converting the LoRA-layer parameters during model saving. The code hangs in TeleChat-52B/deepspeed-finetune/utils/module/lora.py -> convert_lora_to_linear_layer -> with deepspeed.zero.GatheredParameters(), using ZeRO-3 + LoRA. The error message is:
epoc…
-
Assume we are trying to learn a sequence-to-sequence map. For this we can use Recurrent and TimeDistributedDense layers. Now assume that the sequences have different lengths. We should pad both input …
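The padding step itself is framework-independent: pad every sequence to the batch maximum and keep a mask so the loss can ignore the padded positions. A minimal sketch in plain Python (the function name and the right-padding convention are assumptions for illustration):

```python
def pad_sequences(seqs, pad_value=0):
    """Right-pad variable-length sequences to the batch's maximum length.

    Returns (padded, mask), where mask is 1 for real timesteps and 0 for
    padding, so a masked loss can skip the padded positions."""
    max_len = max(len(s) for s in seqs)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in seqs]
    return padded, mask
```

Applying the same padding to inputs and targets keeps the timestep alignment that the TimeDistributedDense output layer relies on; the mask is what prevents the padded steps from contributing to the training signal.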