-
## 🚀 Feature
Now that we can use Weights and Biases on the Facebook cluster, it would be really neat if there was support for it within VISSL.
## Motivation & Examples
WandB is like TensorBoar…
-
## 🐛 Bug
Hi,
I am running an mBART model to summarize Turkish news on Google Colab. I have mostly followed the instructions at [the official example](https://github.com/pytorch/fairseq/tree/master…
-
Hi,
I ran the test code, and the results are shown below.
![image](https://user-images.githubusercontent.com/21001460/132433338-27430902-fae8-4812-b23e-b9f09c7233d5.png)
I wonder whether the…
-
## ❓ Questions and Help
### Before asking:
1. search the issues.
1. search the docs.
#### What is your question?
Hi, I'm training from scratch using DeepSpeed, PyTorch Lightning…
-
I'm trying to set up this node, but I keep getting the following error:
This is the system I'm running on:
Total VRAM 6144 MB, total RAM 16337 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 N…
-
Is multi-GPU training supported?
-
### 🐛 Describe the bug
I got an error when training BERT-large with GeminiDDP:
Error location: `self.optimizer.backward(loss)`
Error message: RuntimeError: ("ZERO DDP error: the synchronizatio…
-
Hi,
I'm not asking `main-wds.py` to support every flag in the original pytorch/examples' imagenet, but I want to double-check that my understanding is correct (i.e., this is a questio…
-
```python
5: [rank5]: File "/workspace/megatron/core/transformer/transformer_block.py", line 493, in forward
5: [rank5]: hidden_states, context = layer(
5: [rank5]: File "/workspace/megatro…
-
In PyTorch distributed training, I get:
```
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/returnn/returnn/torch/engine.py", line 198, in Engine.init_train_from_config
…