-
Getting entities / concepts that are frequent in one topic but not in others.
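One simple way to surface topic-distinctive terms is to score each term by its relative frequency inside the topic divided by its (smoothed) relative frequency everywhere else. A minimal stdlib sketch, assuming documents are plain whitespace-tokenizable strings (the function name and scoring are illustrative, not from any particular library):

```python
from collections import Counter

def distinctive_terms(topic_docs, other_docs, top_n=5):
    """Rank terms that are frequent in one topic but rare elsewhere.

    Score = (relative frequency in the topic) /
            (add-one smoothed relative frequency in the other documents).
    """
    topic_counts = Counter(w for doc in topic_docs for w in doc.lower().split())
    other_counts = Counter(w for doc in other_docs for w in doc.lower().split())
    topic_total = sum(topic_counts.values()) or 1
    other_total = sum(other_counts.values()) or 1
    scores = {
        term: (count / topic_total) / ((other_counts[term] + 1) / (other_total + 1))
        for term, count in topic_counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy usage: "attention"/"encoder" dominate the topic, "the" is frequent everywhere.
topic = ["the transformer encoder uses attention", "attention layers in the encoder"]
others = ["the cat sat on the mat", "the dog chased the cat"]
print(distinctive_terms(topic, others, top_n=3))
```

Common function words score low because they are frequent in both collections; a log-odds-ratio variant with proper priors would be more robust on real corpora.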
-
Hi,
Let's start a discussion here about the roadmap towards 0.10 and 1.0. We are looking for:
- New features that are useful to your research
- Improvements and patches to existing features
If…
-
In the original paper, the BERT model is fine-tuned on downstream NLP tasks, where the number of instances for each task is on the order of thousands to hundreds of thousands. In my case, I have about 5 m…
-
## Environment info
- `transformers` version: 4.3.2 and the latest version forked from GitHub
- Platform: Linux (Colab env)
- Python version: 3.6
- PyTorch version (GPU?): XLA 1.7
- Tensorflow ve…
-
## Abstract (Summary) 🕵🏻♂️
In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is **…
-
Hi,
I modified the quickstart_sst_demo.py example file so it can run already-fine-tuned models from Hugging Face without needing to train them first. I loaded this model: https://huggingface.co…
-
# Mini-Report on Construction of a Parallel Corpus of the Federal Gazette Archive
This report describes the construction of the parallel corpus of the Federal Gazette Archive after the article PDF…
-
## Environment info
- `transformers` version: 3.2.0
- Platform: Linux-4.15.0-1091-oem-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.6.9
- PyTorch version (GPU?): not installed (NA)
…
-
Dear team,
I am trying to predict punctuation following this tutorial: https://nvidia.github.io/NeMo/nlp/punctuation.html.
I can't define a tokenizer using the pretrained "bert-base-multilingual-uncased" model.…
-
Several functions of our text complexity assessor rely on external databases (such as givenness with the `MCR` (Multilingual Central Repository), stopwords, a lemmatizer, etc.). We would like to provide an…
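When features depend on external resources like stopword lists, one common pattern is to try the external file first and fall back to a small bundled default so the assessor degrades gracefully rather than failing. A minimal sketch under that assumption (the helper name and the default list are hypothetical, not part of the project described above):

```python
from pathlib import Path

# Hypothetical bundled fallback: a tiny stopword set used only when the
# external database file is unavailable.
DEFAULT_STOPWORDS = {"the", "a", "an", "and", "or", "of", "in", "to"}

def load_stopwords(path=None):
    """Load stopwords from an external file if it exists; otherwise
    return a copy of the bundled defaults."""
    if path is not None and Path(path).exists():
        return {
            line.strip().lower()
            for line in Path(path).read_text(encoding="utf-8").splitlines()
            if line.strip()
        }
    return set(DEFAULT_STOPWORDS)

stopwords = load_stopwords()  # no external file configured -> defaults
```

The same try-then-fall-back shape extends to the lemmatizer and the `MCR` givenness lookups, keeping the external databases optional rather than hard dependencies.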