-
I am curious: why hasn't RAdam been officially included in PyTorch?
https://github.com/pytorch/pytorch/issues/24892
-
It would be great to have a nice notebook explaining TransformerLM and maybe even full Transformer in models/ -- both to explain the code and if possible with illustrations clarifying the concepts.
-
Greetings!
Firstly, I'd like to say I'm very happy to get to learn about this library. Hopefully I'll try to add some contributions in the future.
---
I'm trying to solve an electromagnetism …
-
I am trying to **fine-tune Tapas** following the instructions here: https://huggingface.co/transformers/v4.3.0/model_doc/tapas.html#usage-fine-tuning , Weak supervision for aggregation (WTQ) using the…
-
### Your current environment
The output of `python collect_env.py`
```text
python collect_env.py
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
C…
```
-
**Describe the bug**
I am training a LibriSpeech Transformer ASR model using the recipe in the /espnet/egs2/librispeech/asr1/ directory
cat conf/train_asr_transformer.yaml
batch_type: numel
batch_bins:…
-
## ❓ Question
I am within the `nvcr.io/nvidia/pytorch:23.09-py3` container. Trying out some snippets from:
https://youtu.be/eGDMJ3MY4zk?si=MhkbgwAPVQSFZEha.
Both JIT and AoT examples failed. F…
-
- Currently ALUs only supports WGS84 projection by default. At least UTM projection (by zone) is needed. This concerns SAR processors only.
- Currently resampling supports it, need to refactor and cre…
-
I noticed a change that was introduced in the `MaskedLMHead` layer, and it broke my entire workflow. Earlier we had the signature for `MaskedLMHead` like this:
```python
out = keras_nlp.layers.Mas…
```
-
### When did you clone our code?
I cloned the code base after 5/1/23
### Describe the issue
Issue: When I use DeepSpeed ZeRO-3 to pretrain LLaVA-13B on 4 * A100 (40G), I get the error shown below. …