-
Hi, thanks for the incredible library! We've been using pytorch metric learning for a task which requires around 300,000 images belonging to a lot of classes. We're quite new to metric learning and DD…
-
I wrote a toy training loop to get something going with fp8 and ran into this padding-related issue. I managed to solve it by replacing a single line in my code with `texts = ["Example text input 1…
-
Hi, I want to report an issue that I found while running mlm.sh for deberta-base.
## Description
- Using the mlm.sh script for distributed training with more than one node causes a hang.
- I have tracked…
-
### System Info
- `transformers` version: 4.39.2
- Platform: Linux-4.18.0-425.19.2.el8_7.x86_64-x86_64-with-glibc2.28
- Python version: 3.10.13
- Huggingface_hub version: 0.22.2
- Safetensors ver…
-
### Description
Here is my use case:
I have 4 GPU nodes on AWS for training (including computing tensors).
I want to save pre-computed tensors to deeplake (Dataset/database/vectorstore), aiming to …
-
#24 adds a multi-GPU PyTorch example that demonstrates how to use Distributed Data Parallel training. However, training with multiple GPUs does not speed up training in the example. See https://gith…
-
### System Info
![image](https://github.com/huggingface/transformers/assets/15103470/2a840cb5-7e2b-4ce4-9a6a-6287508d0970)
Using GPU in script: A100 80 GB; Driver Version: 550.54.15; CUDA-Version: 1…
-
Hello royorel!
First, thanks for your previous suggestion about the volume rendering part; it works for me now.
But then I ran into a problem with the full pipeline part: when I use 1 GPU everything work…
-
2024-06-19 15:08:43 INFO Loading settings from ./outputs/config_lora-20240619-150835.toml... train_util.py:3744
…
-
**Describe the issue**
The following error occurred while running "tutorials 09_dpr_training" in Google Colab.
**To Reproduce**
https://haystack.deepset.ai/tutorials/09_dpr_training
https://cola…