-
**Describe the bug**
Model I am using: UniLM
I use the following code to load the model.
```
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("micro…
-
* **I'm submitting a ...**
[x] bug report
[ ] feature request
[ ] question about the decisions made in the repository
[x] question about how to use this project
* **Summary**
I'm trying to m…
-
Code: `model = AutoModelForSequenceClassification.from_pretrained(checkpoint)`
Running it fails with an error about a missing configuration file:
ValueError: Unrecognized model in XXX. **Should have a `model_type` key in its config.json**, or contain one of t…
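
The error above points at a `config.json` without a `model_type` key. A minimal sketch for checking a local checkpoint directory before calling `from_pretrained`, using only the standard library. The helper name `has_model_type` and the example config contents are hypothetical and are not part of transformers:

```python
import json
import os
import tempfile


def has_model_type(checkpoint_dir):
    """Return True if checkpoint_dir/config.json exists and has a model_type key."""
    config_path = os.path.join(checkpoint_dir, "config.json")
    if not os.path.isfile(config_path):
        return False  # no config.json at all
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return "model_type" in config


# Demonstration on a throwaway directory (made-up config contents).
with tempfile.TemporaryDirectory() as tmp:
    missing = has_model_type(tmp)  # directory has no config.json yet

    with open(os.path.join(tmp, "config.json"), "w", encoding="utf-8") as f:
        json.dump({"hidden_size": 768}, f)
    without_key = has_model_type(tmp)  # config.json present, key absent

    with open(os.path.join(tmp, "config.json"), "w", encoding="utf-8") as f:
        json.dump({"model_type": "bert", "hidden_size": 768}, f)
    with_key = has_model_type(tmp)  # config.json present, key present
```

If the check returns `False` for the directory passed as `checkpoint`, the config file is either absent or incomplete, which would be consistent with the message in the traceback.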
-
Hi, I followed your architecture and ran the code based on https://github.com/Toshihiro-Ota/decision-mamba, but the training time is unacceptable: one epoch takes 8 hours. Do you have any suggestio…
-
**This seems to be related to 7z**; the errors are gone with a new 7-Zip installation.
_The portable standalone build for Windows package downloaded zipped with 7-zip returned error on my Korean Windo…
-
Hello, I recently started studying language modeling, and GPT(-2) in particular. While I am starting to understand the way it is trained/fine-tuned, I do have some questions about its architecture.
In Op…
-
I think it'd be nice to have some transformers that work on dask and numpy arrays, and on dask and pandas DataFrames. This would be good since
1. We can depend on dask and pandas; scikit-learn can't
2.…
-
When I did the work to add tagging in #126, I started with an in-document solution, but ultimately decided to keep it simple (KISS) and use just a text box in the header, then do the rest of the integration. To some ex…
-
**Describe the bug**
The AdamW implementation (see [here](https://github.com/NVIDIA/apex/blob/a7de60e57f0534266841e1733262601ad76aaa74/csrc/multi_tensor_adam.cu#L333)) does not truly decouple the weight…
-
## 🚀 Feature
Lightning profilers generate summaries, which are important for analysing code execution and finding bottlenecks.
However, it might be useful for users to make metrics available, s…