1. I want to do incremental pre-training on top of the existing RoBERTa. Which RoBERTa model should I use? Should I download it directly from Hugging Face? After downloading, do I need a script to convert it to the UER format? Is…
-
# ❓ Questions and Help
Hi all, I'm trying to load a pretrained XLM-RoBERTa model from HuggingFace using xformers to examine the potential speed-up. To the best of my ability, I've defined a config …
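For context, the operation that xformers accelerates is standard scaled dot-product attention. As a sanity reference, here is a pure-Python sketch of a single attention head over toy shapes (this is illustrative only, not the xformers API):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention.
    Q, K, V are lists of d-dimensional vectors (lists of floats)."""
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        # weighted average of the value vectors
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out
```

Libraries like xformers compute exactly this, but tiled and fused so the full score matrix never materializes in memory.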
-
Hi, my training data is on the order of ~100k examples, and I ran the following two experiments:
1. Fine-tuned the embedding model and the reranker on the same data. The fine-tuned embedding model outperforms the un-fine-tuned general-purpose model, but the fine-tuned reranker is clearly worse than before fine-tuning.
2. Used the fine-tuned embedding model to mine hard negatives and then fine-tuned the reranker; it is still worse than before fine-tuning.
In both experiments the reranker converged normally, and the evaluation…
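One common cause of a reranker degrading after fine-tuning on mined negatives is that the top-ranked "negatives" are actually unlabeled positives (false negatives). A minimal sketch of hard-negative mining that skips the very top ranks to reduce this risk (all names here, `mine_hard_negatives` and `skip_top`, are hypothetical, not from any particular library):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mine_hard_negatives(query_emb, doc_embs, positive_ids, k=2, skip_top=1):
    """Rank documents by embedding similarity to the query and return the
    top-k non-positives, skipping the highest-ranked candidates, which are
    often unlabeled positives (false negatives)."""
    ranked = sorted(range(len(doc_embs)),
                    key=lambda i: cosine(query_emb, doc_embs[i]),
                    reverse=True)
    candidates = [i for i in ranked if i not in positive_ids][skip_top:]
    return candidates[:k]
```

If skipping the top-ranked negatives (or denoising them against the labels) recovers the reranker's quality, false negatives were likely the problem.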
-
I installed the **nlptoolkit** package through pip, but the following line repeatedly gives me an error:
`from nlptoolkit.utils.config import Config`
I tried upgrading pandas and tqdm, as these a…
-
**Describe the bug**
I want to fine-tune the XLMRoberta model with masked language modeling, so I used the `LanguageModelingModel` class to create a model and trained it on my data (with _simple_ d…
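For reference, the corruption that BERT-style masked language modeling applies during training is the usual 80/10/10 scheme over ~15% of positions. A toy, self-contained sketch (the `mlm_mask` helper and the tiny vocabulary are illustrative, not the library's actual implementation):

```python
import random

MASK = "[MASK]"
VOCAB = ["a", "b", "c", "d"]  # toy vocabulary for the 10% random-replacement case

def mlm_mask(tokens, mask_prob=0.15, rng=None):
    """Select ~mask_prob of positions; of those, 80% become [MASK],
    10% a random token, 10% stay unchanged. Returns the corrupted
    sequence and per-position labels (None where no loss is computed)."""
    rng = rng or random.Random(0)
    out, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok  # the model must predict the original token here
            r = rng.random()
            if r < 0.8:
                out[i] = MASK
            elif r < 0.9:
                out[i] = rng.choice(VOCAB)
            # else: keep the original token (but still compute loss on it)
    return out, labels
```

The loss is then computed only at positions where the label is not `None`.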
-
Update:
* Please see #6801 for major items in performance sprint.
* Please see #8779 for major items in a new architecture aimed at simplicity and performance.
* We are in the feedback gathering pha…
-
First of all, thank you for providing such high-quality code!
As the title says, I would like to use your code to pre-train an English BART model, and I'd like to confirm the following with you:
1) Data preprocessing format
The sample provided by the project is as follows:
So English data should use the same format, i.e. one sentence per line, with a blank line between documents.
2) Choice of tokenizer for data preprocessing and model pre-training
When processing English text, is it fine to just use the default BertTokenizer, or should I use the XLM-RoBERTa tokenizer?
If I want to…
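Assuming the format described above (one sentence per line, a blank line between documents), a small illustrative serializer for building such a corpus file (the function name is hypothetical, not part of the project):

```python
def to_pretraining_corpus(documents):
    """Serialize documents (each a list of sentence strings) into the
    one-sentence-per-line format, with a blank line between documents."""
    # join sentences within a document by newline, documents by a blank line
    return "\n\n".join("\n".join(doc) for doc in documents) + "\n"
```

Example: two documents, one with two sentences and one with a single sentence, produce four lines of output with one blank separator line.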
-
Hey,
Congratulations on the impressive results and thank you for open-sourcing the work! 🤗
I have a question: do you also plan to implement Longformer for XLM-R, because cross-lingual NLP with lo…
-
Hello! I am interested in your paper, but I am wondering how you combine the explainer with MTQ or XMS to get a score. Could you share some details? Thank you.