-
**Describe the bug**
Even for 100 documents, training is taking far longer than expected (20+ hours with `xlm-roberta-large` as the `model_type`). Is there any workaround to speed this up?
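A hedged sketch of the usual levers for cutting training time: shorter sequences, larger batches, mixed precision, and fewer epochs. The dict below follows the `args` conventions used by libraries like simpletransformers (an assumption, since the snippet mentions `model_type`); the specific values are illustrative, and 20+ hours for 100 documents usually also points to training on CPU rather than a CUDA GPU.

```python
# Assumed simpletransformers-style training args; values are illustrative.
train_args = {
    "max_seq_length": 128,    # shorter sequences cut attention cost sharply
    "train_batch_size": 32,   # larger batches per step, if memory allows
    "fp16": True,             # mixed precision, effective only on a CUDA GPU
    "num_train_epochs": 1,    # fewer epochs for a quick first pass
}

# The dict would then be passed in, e.g. (hypothetical usage):
#   model = ClassificationModel("xlmroberta", "xlm-roberta-large", args=train_args)
```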
-
## Details
I am attempting to fine-tune an XLMRoberta sequence classification model. I have an array of text snippets from physicians labelled 1-8 with various diagnostic indications. I've created …
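One common pitfall with labels given as 1-8: most sequence-classification heads expect class ids in the range `0..num_labels-1`. A minimal remapping sketch (names hypothetical):

```python
# Hypothetical sketch: shift 1-based physician labels to the 0-based ids
# that classification heads conventionally expect.
raw_labels = [1, 5, 8, 3]               # example diagnostic labels as given
label_ids = [y - 1 for y in raw_labels]  # 1-8 -> 0-7
num_labels = 8                           # would be passed to the model config
assert all(0 <= y < num_labels for y in label_ids)
```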
-
## ❓ Questions/Help/Support
In the cifar10 example when a [model](https://github.com/pytorch/ignite/blob/master/examples/contrib/cifar10/main.py#L51) is defined inside the process, is the model upd…
-
To Author:
Our company is currently trying to shorten inference time when running on CPU, and a team member had the idea of combining Intel OpenVINO with Haystack. But the question is, is th…
-
Hi guys,
I'm trying to further train XLMRoBERTa and I got the following error:
```
AssertionError: Vocab size of tokenizer 250002 doesn't match with model 250005. If you added a custom vo…
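A hedged sketch of the consistency check behind this error, using the sizes from the message above. With Hugging Face transformers, aligning the two sides after adding custom tokens is typically done with `model.resize_token_embeddings(len(tokenizer))`; the helper below only illustrates the arithmetic.

```python
# Sizes taken from the error message above.
tokenizer_vocab_size = 250002   # len(tokenizer)
model_vocab_size = 250005       # model.config.vocab_size

def vocab_mismatch(tokenizer_size: int, model_size: int) -> int:
    """Return how many embedding rows the model has beyond the tokenizer's vocab."""
    return model_size - tokenizer_size

extra = vocab_mismatch(tokenizer_vocab_size, model_vocab_size)
# extra == 3: the checkpoint carries 3 tokens the current tokenizer lacks.
# The usual remedy in transformers:
#   model.resize_token_embeddings(len(tokenizer))
```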
-
Hi,
I am training an XLMRoberta model from scratch on Hindi. I am using a sentencepiece tokenizer trained exclusively on monolingual data following the steps mentioned in the [tokenizers repository](…
-
Hi
Thanks a lot for your hard work. Can you add support for `XLM-RoBERTa` too? Thanks
-
Hi Nils,
I got an error when the input texts are longer than 512 tokens. The problem was solved after I modified L.65 in XLMRoBERTa.py to `pad_seq_length = min(pad_seq_length, self.max_seq_length) + 2`.
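The fix above can be sketched as a standalone function (with `self.max_seq_length` made a parameter here): the requested padding length is clamped to the model limit before adding 2 for the special boundary tokens.

```python
def effective_pad_length(pad_seq_length: int, max_seq_length: int) -> int:
    """Clamp the requested padding length to the model limit, then add 2
    for the special tokens, mirroring the fix described above."""
    return min(pad_seq_length, max_seq_length) + 2

# An input longer than the limit no longer overflows:
effective_pad_length(900, 510)   # -> 512
# A short input is unaffected apart from the two special tokens:
effective_pad_length(100, 510)   # -> 102
```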
-
I am studying machine reading comprehension with XLMRoberta.
My data is KorQuAD.
I need to tokenize every word into characters,
e.g., in English:
This is a dog
-> _T h i s _i s _a _d o g
please let …
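The transformation in the example above can be sketched as a plain Python function: each word is split into characters, with the word-boundary marker (written `_` here, as in the example) prefixed to the word's first character.

```python
def char_tokenize(sentence: str) -> str:
    """Split every word into characters, marking word starts with "_"."""
    pieces = []
    for word in sentence.split():
        pieces.append("_" + word[0])   # first character carries the boundary marker
        pieces.extend(word[1:])        # remaining characters stand alone
    return " ".join(pieces)

char_tokenize("This is a dog")   # -> "_T h i s _i s _a _d o g"
```

Note that SentencePiece itself uses "▁" (U+2581) as the boundary marker; matching the tokenizer's actual symbol matters if the output is fed back into it.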
-
xlmroberta is not available for multilabel classification. I'm not sure if this is a bug or a missing feature.
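For context, what multilabel support amounts to at prediction time: an independent sigmoid per class with a threshold, rather than a single softmax that picks one class. A minimal stdlib sketch (the threshold of 0.5 is a common default, not a library-mandated value):

```python
import math

def multilabel_predict(logits, threshold=0.5):
    """Apply an independent sigmoid per class and threshold each one,
    so any subset of labels can be active at once."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [1 if p >= threshold else 0 for p in probs]

multilabel_predict([2.0, -1.5, 0.3])   # -> [1, 0, 1]
```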