-
Hello!
I run into some errors after modifying the training model.
For example, with the XLNet model: AttributeError: 'XLNetModel' object has no attribute 'output_hidden_states'
And with the TinyBERT model: Val…
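
A likely cause of the first error: in recent versions of Hugging Face Transformers, `output_hidden_states` is a config or forward argument, not an attribute on the model object. A minimal sketch of the stock `transformers` usage, assuming `xlnet-base-cased` (not this repo's training code):

```python
from transformers import XLNetModel, XLNetTokenizer

# output_hidden_states is set on the config (or passed to forward()),
# not read as an attribute of the model object itself.
tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetModel.from_pretrained("xlnet-base-cased", output_hidden_states=True)

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)

# One tensor per layer, plus the embedding output.
print(len(outputs.hidden_states), outputs.hidden_states[-1].shape)
```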
-
Thanks for the awesome work on [AutoTinyBERT](https://aclanthology.org/2021.acl-long.400.pdf)!
We would like to use your final model checkpoints. However, the links provided in the [AutoTinyBERT …
-
Nice work! I have two questions: 1) Why are only GLUE dev set results reported? 2) Some strong baselines, such as NAS-BERT and BERT-EMD, are not compared.
-
**Is your feature request related to a problem? Please describe.**
With the new flexible Pipelines introduced in https://github.com/deepset-ai/haystack/pull/596, we can build way more flexible and c…
-
## 🚀 Feature Request
We should be able to retrieve the attention weights from any layer of the Transformer, not only the last one.
### Motivation
Currently the Transformer Decoder can return the wei…
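
As a workaround until this is supported, the idea can be expressed in plain PyTorch by driving `nn.MultiheadAttention` directly, so `need_weights=True` is kept at every layer. A minimal sketch with a hypothetical `AttnStack` wrapper (this repo's decoder internals may differ):

```python
import torch
import torch.nn as nn

class AttnStack(nn.Module):
    """Run each attention layer ourselves so every layer's weights survive."""

    def __init__(self, d_model=64, nhead=4, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.MultiheadAttention(d_model, nhead, batch_first=True)
            for _ in range(num_layers)
        )

    def forward(self, x):
        all_weights = []
        for attn in self.layers:
            # Self-attention; weights have shape (batch, tgt_len, src_len).
            x, weights = attn(x, x, x, need_weights=True)
            all_weights.append(weights)
        return x, all_weights

x = torch.randn(2, 10, 64)
out, weights = AttnStack()(x)
print(len(weights), weights[0].shape)  # 3 layers; torch.Size([2, 10, 10])
```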
-
Distilling RoBERTa using the approach described in the TinyBERT paper. The results of #2019 suggest that it makes more sense to proceed with RoBERTa as the base model. The Pile dataset can be used for t…
-
As a next step toward distilling better language models, we want to explore the difference between distilling from a base model and from a large model.
For this, we would need to decide on a dataset:
- E…
-
**Additional context**
Seeing as updating the embeddings in dense models is computationally expensive and time-consuming, I was thinking about the feasibility of this approach. If I index new documents I don't need …
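
For what it's worth, Haystack 1.x document stores expose a flag for exactly this pattern; import paths and parameter names have shifted across releases, so treat this as a sketch rather than version-exact code:

```python
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import EmbeddingRetriever

document_store = FAISSDocumentStore(embedding_dim=384)
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
)

# Index only the new documents, then embed just the ones without an
# embedding yet, leaving the existing vectors untouched.
document_store.write_documents([{"content": "New document text."}])
document_store.update_embeddings(retriever, update_existing_embeddings=False)
```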
-
Hi,
thanks for providing this training code and the pretrained model. But how do you load the model in PyTorch? In your test.py you only run tests on TinyBERT, RoBERTa, etc., but don't load EfficientBer…
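
In the meantime, the generic PyTorch pattern for loading a checkpoint looks like the sketch below; the class name, module path, and checkpoint filename here are hypothetical, since the repo's actual names are not given:

```python
import torch

# Hypothetical import: the actual module and class depend on this repo.
from model import EfficientBert

model = EfficientBert()  # construct with the same config used for training
state_dict = torch.load("efficientbert_checkpoint.pt", map_location="cpu")

# Checkpoints sometimes wrap the weights, e.g. {"model": state_dict, ...};
# unwrap before loading if that is the case here.
if "model" in state_dict:
    state_dict = state_dict["model"]

model.load_state_dict(state_dict)
model.eval()
```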