-
I noticed a new network (fe3f6...) in http://zero.sjeng.org/networks/
Can you provide some information about win rate over the previous best network?
Number of games it was trained on? How long it…
-
## Describe the bug
`SACLoss` has flawed checks for determining the nature of `vmap_randomness`. Therefore, stochastic modules cannot be used in constituent networks.
## To Reproduce
Steps to…
-
It seems to me that a bidirectional connection becomes unbounded with some common settings.
**To Reproduce**
For connection between node_a and node_b: Set `connection__to_node` and `connection__…
-
When I load the pretrained weights vmamba/vssm1_tiny_0230s_ckpt_epoch_264.pth in mmdetection, it shows:
```
Successfully load ckpt /checkpoints/vmamba/vssm1_tiny_0230s_ckpt_epoch_264.pth
Failed load…
```
-
Hello
I'm running Windows 10 and I would like to install DeepSpeed to speed up inference of GPT-J. My system is the following:
```
Windows 10
cuda 11.6
torch 1.13.0
Python 3.9.12
```
Whe…
-
## Environment info
- `transformers` version: 4.8.1
- Platform: Linux-4.15.0-140-generic-x86_64-with-debian-buster-sid
- Python version: 3.7.10
- PyTorch version (GPU?): 1.9.0 (True)
- Tensor…
-
Provide as much information as possible. At least, this should include a description of your issue and steps to reproduce the problem. If possible also provide a summary of what steps or workarounds y…
-
**Describe the bug**
I am trying to use DeepSpeed ZeRO-3 with the Hugging Face Trainer to fine-tune a Galactica 30B model (GPT-2-like) on 4 nodes with 4 A100 GPUs each. I get an OOM error even though the model should fit …
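
A rough way to sanity-check the "should fit" expectation is a back-of-the-envelope estimate of the model states per GPU. The sketch below assumes mixed-precision Adam at 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states) and an even ZeRO-3 partition across all 16 GPUs. The function name and the 16-byte figure are assumptions for illustration, not values taken from the report, and activations, communication buffers, and fragmentation are ignored.

```python
def zero3_model_state_gb(n_params: float, n_gpus: int, bytes_per_param: int = 16) -> float:
    """Per-GPU model-state memory in GB, assuming states are split evenly (ZeRO stage 3).

    bytes_per_param=16 is an assumption: fp16 weights (2) + fp16 grads (2)
    + fp32 master weights, momentum, and variance for Adam (12).
    """
    return n_params * bytes_per_param / n_gpus / 1e9


# 30B parameters on 4 nodes x 4 GPUs = 16 GPUs (numbers from the report above).
per_gpu = zero3_model_state_gb(30e9, n_gpus=4 * 4)
print(f"model states: about {per_gpu:.1f} GB per GPU")
```

Whatever remains of each GPU's memory after the model states has to cover activations and temporary buffers, so an estimate like this only gives a lower bound on actual usage.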
-
I get this error:
```bash
C:\Users\techn>pip install intel-extension-for-pytorch
ERROR: Could not find a version that satisfies the requirement intel-extension-for-pytorch (from versions: none)
E…
```
-
Related to #227
Currently, when the parameters of distributions are given to models, they are listed as unstructured fields whose meaning depends on their name, e.g. `aNrmInitMean`, `aNrmInitStd`…
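
To illustrate the difference between name-encoded fields and a structured description, here is a minimal sketch of one possible shape for the latter. `DistributionSpec`, `group_flat_params`, the `"lognormal"` family, and the numeric values are all hypothetical and are not taken from the issue or from the library's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistributionSpec:
    """Hypothetical container: one distribution with explicitly named parameters."""
    family: str
    mean: float
    std: float


def group_flat_params(flat: dict, prefix: str, family: str) -> DistributionSpec:
    """Collect `<prefix>Mean` and `<prefix>Std` from a flat dict into one object."""
    return DistributionSpec(
        family=family,
        mean=flat[f"{prefix}Mean"],
        std=flat[f"{prefix}Std"],
    )


# Flat, name-encoded fields as described above (values are illustrative only).
flat_params = {"aNrmInitMean": 0.0, "aNrmInitStd": 1.0}

# The distribution family is an assumption made for this example.
a_nrm_init = group_flat_params(flat_params, "aNrmInit", family="lognormal")
print(a_nrm_init)
```

With a structure like this, the meaning of each parameter is carried by the field it sits in and no longer by a naming convention.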