-
**Describe the bug**
When I load CLIP via `CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14-336")` within DeepSpeed, none of the model weights are loaded (i.e. they are a tensor of size z…
-
**Describe the bug**
The issue I am facing is the assertion on [L316 of partitioned_param_coordinator.py](https://github.com/microsoft/Dee…
-
**Describe the bug**
I am trying to run the non-persistent example given for mistralai/Mistral-7B-Instruct-v0.3 on an RTX A6000 GPU (on a server), so the compute capability requirement is met; Ubuntu is 22.04, CUDA to…
-
**Describe the bug**
I'm currently using the HF Trainer for training, with the HF learning rate scheduler and DeepSpeed optimizer. I've encountered an issue with loading universal checkpoints. The HF…
-
**Describe the bug**
I got the error `RuntimeError: The expanded size of the tensor (2048) must match the existing size (1179648) at non-singleton dimension 1. Target sizes: [2048, 2048]. Tensor …
-
I've been trying to get this installed, but it consistently fails at installing OpenFold. Following the README (I had to manually install the packages listed in environment.yml), it all works OK until I…
-
Dear ESM Team,
I have been trying to install ESMFold on a local server. I used the conda environment file and made sure the g++ version requirement is satisfied (tried with 10, 9 and 8). However, when it co…
-
Related issue: https://github.com/microsoft/DeepSpeed/issues/5724#issuecomment-2330819411
I tried the solution suggested there, but it did not work in my setup.
**Describe the bug**
[rank1]: Traceback …
-
I am running AlphaFold v2.3.2 to predict a multimer on a machine with 16 CPUs, 241G of memory, and 4 V100 GPUs.
```
>3JA9_1|Chains A|Proliferating cell nuclear antigen|Homo sapiens (9606)
MFEARLVQGSILKKVLEALKDLINEACWDISSSGVNLQSMDSSHV…
```
-
After installing DeepSpeed 0.15.0 via pip3, I ran `ds_report` to check the compatibility of various features.
I get the following messages when looking for GDS compatibility:
```
[2024-08-29 15:16:37,…