-
Hi, thanks for the great example on training RoBERTa with long attention.
I followed this example: https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb
Was able to s…
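In case it helps others, here is a minimal sketch of the position-embedding extension step from that notebook as I understand it (my own condensation; `roberta-base` and the 4096-token target length are assumptions, and the swap to Longformer self-attention is not shown):
```python
# Minimal sketch, assuming roberta-base and a 4096-token target length.
# Only the position-embedding copy step is shown here.
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizerFast

model = RobertaForMaskedLM.from_pretrained("roberta-base")
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base", model_max_length=4096)

max_pos = 4096 + 2  # RoBERTa reserves the first two position ids
old_embed = model.roberta.embeddings.position_embeddings.weight.data
new_embed = old_embed.new_empty(max_pos, old_embed.size(1))

# keep the reserved positions, then tile the 512 learned embeddings to fill the rest
new_embed[:2] = old_embed[:2]
k, step = 2, old_embed.size(0) - 2
while k < max_pos:
    n = min(step, max_pos - k)
    new_embed[k:k + n] = old_embed[2:2 + n]
    k += n

model.roberta.embeddings.position_embeddings.weight.data = new_embed
model.roberta.embeddings.position_ids = torch.arange(max_pos).unsqueeze(0)
model.config.max_position_embeddings = max_pos
```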
-
Hello,
I added a bit of similar code (adding the xlmroberta tokenizer and encoder description) to use xlm-roberta-base from Hugging Face instead of the BERT encoder model. The problem is that when I try to train on xlm…
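For context, this is roughly the kind of change I mean (a minimal sketch using the Hugging Face auto classes, not the repo's actual wiring; the placeholder sentence is just for illustration):
```python
# Minimal sketch of swapping the BERT encoder/tokenizer for xlm-roberta-base.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")

batch = tokenizer(["a placeholder sentence"], padding=True, return_tensors="pt")
hidden = encoder(**batch).last_hidden_state  # (batch, seq_len, 768)
```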
-
Hi,
It seems from the source code that XLM-RoBERTa is fine-tuned with gradient updates based on the LSTM attention model. However, when I follow the README instructions and train the model on hi…
-
Would you please provide sample code showing, in particular, how to construct a LocalImageGenerator instance?
```
let imageGenerator = LocalImageGenerator(
    queue: queue, configurations: conf…
```
-
![image](https://user-images.githubusercontent.com/45490378/185097604-ad164b3c-8a60-4f49-94a6-dc4268f0a1fb.png)
Why does it need the "--model" parameter when I give a specific config? And what does "que…
-
When using the mengzi-t5-base model from Hugging Face after converting it with the script, I get the following error:
```
RuntimeError: Error(s) in loading state_dict for Model:
size mismatch for embedding.word_embedding.weight: copying a param with shape torch.S…
```
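A small diagnostic I ran on my side (not part of the repo; the checkpoint path is hypothetical) to see the embedding shape stored in the converted checkpoint, since this kind of size mismatch usually means the checkpoint and config disagree on the vocab size:
```python
# My own diagnostic: inspect the converted checkpoint's embedding shape so it
# can be compared with the vocab size the training config expects.
import torch

state = torch.load("mengzi-t5-base-converted.pt", map_location="cpu")  # hypothetical path
print(state["embedding.word_embedding.weight"].shape)  # rows = checkpoint vocab size
```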
-
@JRosenkranz This looks amazing; I learned about this lib at vllm yesterday. I am trying to run `bge-m3` using this custom modeling code for https://github.com/michaelfeil/infinity . I am aware that thi…
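For sanity-checking outputs, a plain dense-embedding baseline could look like this (a sketch on my side, using sentence-transformers rather than infinity's own API):
```python
# Hedged baseline sketch: dense bge-m3 embeddings via sentence-transformers,
# used only to compare against outputs from the custom modeling code.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")
emb = model.encode(["what is bge-m3?"], normalize_embeddings=True)
print(emb.shape)  # (1, 1024) dense vectors
```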
-
Hello, my training data is on the order of ~100k examples, and I ran the following two sets of experiments:
1. Fine-tuning the embedding model and the reranker on the same data: the fine-tuned embedding model performs better than the un-finetuned general model, but the fine-tuned reranker is clearly worse than before fine-tuning.
2. Mining hard negatives with the fine-tuned embedding model and then fine-tuning the reranker (see the sketch after this list): it is still worse than before fine-tuning.
In both experiments the reranker converges normally, and the evaluation…
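The hard-negative sampling in experiment 2 is roughly the following (a sketch with toy data; the model path, corpus, and top-k cutoff are placeholders on my side):
```python
# Sketch of hard-negative mining with the fine-tuned embedding model;
# queries/corpus/positives are toy placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("path/to/finetuned-embedding-model")  # hypothetical path
queries = ["example query"]
corpus = ["relevant passage", "distractor passage a", "distractor passage b"]
positives = {0: {"relevant passage"}}

q_emb = model.encode(queries, normalize_embeddings=True)
d_emb = model.encode(corpus, normalize_embeddings=True)
scores = q_emb @ d_emb.T  # cosine similarity, since embeddings are normalized

top_k = 15
for qi in range(len(queries)):
    ranked = np.argsort(-scores[qi])
    hard_negs = [corpus[di] for di in ranked if corpus[di] not in positives[qi]][:top_k]
    print(hard_negs)
```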
-
I installed the **nlptoolkit** package through pip, but the following line repeatedly gives me an error:
`from nlptoolkit.utils.config import Config`
I tried upgrading pandas and tqdm, as these a…
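In case it matters, this is the quick check I run (my own diagnostic, not from the package docs) to see which `nlptoolkit` distribution is actually installed, since several PyPI projects share similar names and their submodule layouts differ:
```python
# My own diagnostic: confirm which nlptoolkit is installed and where it lives.
import importlib
import importlib.metadata

mod = importlib.import_module("nlptoolkit")
print(mod.__file__)                              # install location
print(importlib.metadata.version("nlptoolkit"))  # installed release
```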
-
Hi,
I am blocked from achieving low-latency responses because of the tokenizer computation for the `stsb-xlm-r-multilingual` model.
Does anyone have an idea how to get a fast tokenizer for `stsb-xlm-r-multilingua…
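For what it's worth, the first thing I would try (assuming a recent `transformers`/`tokenizers` install) is loading the Rust-backed fast tokenizer directly, since the model is XLM-R based:
```python
# Sketch: request the Rust-backed fast tokenizer; prints True if it loaded.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "sentence-transformers/stsb-xlm-r-multilingual", use_fast=True
)
print(tok.is_fast)
```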