-
I've noticed that the mT5 tokenizer already includes the extra 100 ids needed for the sentinel tokens.
`'gs://t5-data/vocabs/mc4.250000.100extra/sentencepiece.model'`
how…
-
**Describe the bug**
When I train a T5 model on Japanese, I get garbled results at prediction time.
**To Reproduce**
This is the source code.
```
import logging
import pandas as p…
-
I see that the mT5 tokenizer comes by default with the extra 100 ids needed by T5 models.
I've trained my own tokenizer following the official SentencePiece documentation.
…
-
Git branch: https://github.com/Oneflow-Inc/oneflow/pull/9245
Installation:
python3 -m pip install --pre oneflow -f https://staging.oneflow.info/branch/release/compile_cost_cnt/cu112
The following is with GLOG_v=1 enabled…
-
If you haven’t already, check out our [contributing guidelines](https://github.com/Expensify/ReactNativeChat/blob/main/contributingGuides/CONTRIBUTING.md) for onboarding and email contributors@expensi…
-
@craffel, hey there
On page 3 of the paper, in Section 3.2 (mT5), you mention a data sampling technique that helps maintain a good balance between low- and high-resource languages (that pr…
-
## Problem
We recently started returning an `errors` object in `onyxData` from the API that contains the latest error related to a form submission. The current `Form` component does not take into acco…
-
This is a very interesting dataset and the baseline and evaluation code are very helpful.
I have a clarification question as the input path from the run scripts does not match exactly the structur…
-
### System Info
```shell
- `transformers` version: 4.18.0
- Platform: Linux-4.14.252-131.483.amzn1.x86_64-x86_64-with-glibc2.9
- Python version: 3.6.13
- Huggingface_hub version: 0.4.0
- PyTorch …
-
Hi.
Hello @AminHP, how are you?
I've been studying during this time and reading more and more about the world of trading, and I'm moving toward day trading (intraday).
In this case, for this pr…