-
### 🚀 The feature, motivation and pitch
The official FlashAttention implementation is written in CUDA, so users on AMD GPUs cannot easily use flash attention in transformers to train LLMs. With the …
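For comparison, a hardware-portable route that already exists: PyTorch's `torch.nn.functional.scaled_dot_product_attention` dispatches to a fused attention kernel where one is available and falls back to the plain math path otherwise. A minimal sketch, with arbitrary shapes:

```python
import torch
import torch.nn.functional as F

# Arbitrary shapes: (batch, heads, seq_len, head_dim).
device = "cuda" if torch.cuda.is_available() else "cpu"
q = torch.randn(2, 8, 1024, 64, device=device)
k = torch.randn(2, 8, 1024, 64, device=device)
v = torch.randn(2, 8, 1024, 64, device=device)

# PyTorch >= 2.0 picks the best available backend (flash, memory-efficient,
# or plain math) per hardware, so the same call runs on CUDA, ROCm, and CPU.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 1024, 64])
```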
-
**Followed this** [Tutorial](https://youtu.be/vH-AotLFkj8?si=eyQdvkKJjK1hXrau)
**Getting this error:**
"thammegowda-nllb-serve" requires a web server running locally!
**Trying to translate Japanese to E…
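For anyone hitting the same message: the error means the nllb-serve backend is not running locally. A minimal sketch of starting it and calling its REST API, based on my reading of the project README; the port and parameter names below are assumptions to verify against your install:

```python
import requests

# Assumes the server was installed and started locally first, e.g.:
#   pip install nllb-serve
#   nllb-serve
# Host, port, and parameter names are taken from the project README as I
# understand it and may differ in your version; verify against the repo.
resp = requests.post(
    "http://127.0.0.1:6060/translate",
    json={
        "source": ["こんにちは、世界"],  # "Hello, world" in Japanese
        "src_lang": "jpn_Jpan",          # FLORES-200 code for Japanese
        "tgt_lang": "eng_Latn",          # FLORES-200 code for English
    },
)
print(resp.json())
```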
-
I tried to follow the tutorial here: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForTokenClassification_on_FUNSD.ipynb, but I got very bad res…
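For context, a minimal sketch of the model setup that notebook fine-tunes; the checkpoint name is, as far as I recall, the one the tutorial uses, and the label count is an illustrative assumption that must match your own label list:

```python
from transformers import LayoutLMForTokenClassification, LayoutLMTokenizer

# Illustrative values: the notebook fine-tunes the base checkpoint on FUNSD;
# num_labels=13 is assumed here and must match your dataset's label set.
tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForTokenClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=13
)
```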
-
### Add Link
https://pytorch.org/tutorials/beginner/translation_transformer.html
### Describe the bug
Running the tutorial on language translation with transformers leads to NaNs when trainin…
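Not part of the report, but a standard first step for localizing NaNs like this is PyTorch's anomaly detection plus an explicit loss check. A self-contained sketch with a stand-in model; the tutorial's Seq2SeqTransformer would slot in the same way:

```python
import torch
import torch.nn as nn

# Tiny stand-in model, loss, and optimizer.
model = nn.Linear(16, 4)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

torch.autograd.set_detect_anomaly(True)  # raise at the op that first yields NaN/Inf

x = torch.randn(8, 16)
target = torch.randint(0, 4, (8,))

loss = criterion(model(x), target)
if torch.isnan(loss):
    raise RuntimeError("loss is NaN; inspect the offending batch")
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # common mitigation
optimizer.step()
```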
-
Is this out of scope? I hope not; it would be nice to have a one-stop shop for interpretability tooling.
### Proposal
It should be easy to get the most bare-bones interpretability research off the…
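As one hedged sketch of what "bare-bones" could look like here (my illustration, not the proposal's): capturing intermediate activations with a plain PyTorch forward hook is about the minimum viable interpretability workflow:

```python
import torch
import torch.nn as nn

# Toy model; any nn.Module works the same way.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Register a hook on the hidden layer and run a forward pass.
model[1].register_forward_hook(save_activation("relu"))
model(torch.randn(4, 8))
print(activations["relu"].shape)  # torch.Size([4, 16])
```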
-
Port the CLIP tokenizer, which leverages byte-level BPE. This tokenizer enables scenarios like Stable Diffusion.
May be dependent on https://github.com/dotnet/machinelearning/issues/6992.
Reference:
h…
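For reference behavior, a hedged sketch of what the existing Hugging Face CLIP tokenizer produces; a .NET port would need to reproduce these byte-level BPE ids. The checkpoint name is the standard OpenAI release, and the exact ids shown are illustrative:

```python
from transformers import CLIPTokenizer

# The standard OpenAI CLIP checkpoint; byte-level BPE with start/end specials.
tok = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

ids = tok("a photo of an astronaut riding a horse").input_ids
print(ids)              # e.g. [49406, ..., 49407] (start/end token ids)
print(tok.decode(ids))  # round-trips to the text plus the special tokens
```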
-
# 🐛 bug report
When encountering a .scss file included in Pug, Parcel seems to be using the CSS transformer instead of the SCSS transformer:
```
🚨 Build failed.
@parcel/transformer-css: Unex…
```
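A possible workaround to test (my suggestion, not confirmed against this report): pin `*.scss` to the Sass transformer explicitly in `.parcelrc`, with `@parcel/transformer-sass` installed; whether this takes effect for files reached through Pug is an assumption to verify:

```json
{
  "extends": "@parcel/config-default",
  "transformers": {
    "*.scss": ["@parcel/transformer-sass"]
  }
}
```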
-
### Feature Checklist
- [X] Searched the [issues page](https://github.com/e2nIEE/pandapower/issues) for similar reports
- [X] Read the relevant sections of the [documentation](https://pandapower…
-
**Describe the bug**
I am following the [transformer tutorial](https://docs.e3nn.org/en/latest/guide/transformer.html), but I get an incorrect plot for the smoothness check.
**To Reproduce**…
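For orientation, the smoothness check in that tutorial verifies that edge weights decay continuously to zero at the cutoff radius. A generic stand-in using my own illustrative cutoff function, not the e3nn code itself:

```python
import torch
import matplotlib.pyplot as plt

def smooth_cutoff(r, r_max=1.0):
    """Illustrative smooth cutoff: cosine taper reaching exactly 0 at r_max."""
    u = (r / r_max).clamp(0.0, 1.0)
    return 0.5 * (torch.cos(torch.pi * u) + 1.0)

# A correct smoothness check shows the weight decaying to zero with no jump
# at r = r_max; a discontinuity there is what an "incorrect plot" would show.
r = torch.linspace(0.0, 1.5, 300)
plt.plot(r, smooth_cutoff(r))
plt.xlabel("distance r")
plt.ylabel("cutoff weight")
plt.show()
```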
-
### Add Link
- [Source code permalink](https://github.com/pytorch/tutorials/blob/653719940f7c4d908811da415f190465d8c3189d/advanced_source/ddp_pipeline.py#L175)
- [Online docs link](https://pytorch.o…