tinybert Search Results

246 results
for tinybert


huawei-noah/Pretrained-Language-Model #32

[TinyBert] ERROR, running the task_distill during task-specif…

The error occurred during task-specific distillation; the traceback is at the end. The fine-tuned BERT model was generated using the [transformer package](https://github.com/huggingface/transformers#quick-tour-of-the-f…

vigosser updated 3 years ago
1
huawei-noah/Pretrained-Language-Model #36

In step 1 of Task-specific Distillation, load_tf_weights_in_bert produces the following…

Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'key', 'bias'] Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'key', 'kernel'] Initialize…

wanghia updated 3 years ago
2
huawei-noah/Pretrained-Language-Model #38

Isn't TinyBert Equivalent to copying teacher Bert attention/…

Looking at Eq 7-9 in the paper (https://arxiv.org/pdf/1909.10351.pdf) and assuming that the student and teacher models have the same dimensionality (i.e. d=d') then how is TinyBert any different (bett…

osaleh updated 3 years ago
1
deepset-ai/haystack #908

Retriever: Why does "Creating Embeddings" take so long?

I'm running a search using DensePassageRetriever. It takes 10+ seconds to run each query. The message I'm shown is "Creating Embeddings". I'm confused because the embeddings for my documents are…

leoplusx updated 3 years ago
5
huawei-noah/Pretrained-Language-Model #24

Two questions about tinybert pre-training distillation

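1. Pre-training distillation only has the attention and encoder_layer losses; there seems to be no masked LM loss? 2. If there is no masked LM loss, how can the distilled small model be evaluated directly?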

qgzang updated 3 years ago
10
huawei-noah/Pretrained-Language-Model #52

Will you please share the hyperparameters for RTE distillation?

Hi, I used the default hyperparameters in the TinyBERT repo, and the result on RTE is 30.7 on dev and 28.6 on test, far from the results in the paper. So will you please share the hyper-parameter f…

1024er updated 3 years ago
2
huawei-noah/Pretrained-Language-Model #64

Roughly when will the Chinese TinyBert model be released?

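1. When will the Chinese TinyBert model be released? 2. Without the general distill stage, i.e. running task-specific distillation directly from randomly initialized parameters, how well would it work (and roughly what order of magnitude of data would that require)?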

DeligientSloth updated 3 years ago
3
huawei-noah/Pretrained-Language-Model #49

A question about distillation performance

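After fine-tuning, the original teacher network reaches about 93% accuracy. A large amount of unlabeled data is fed to the teacher network to obtain labeled data, which is then used to train a four-layer BERT (the student network), in two settings: (1) without intermediate-layer losses (attention, embedding, encoder, etc.), using only the student's hard labels as the loss, accuracy is 89%; (2) with intermediate-layer loss distillation, accuracy is 90%. This shows that the intermediate-layer loss…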

Zjq9409 updated 3 years ago
2
huawei-noah/Pretrained-Language-Model #35

Could a TensorFlow version of the General_TinyBERT model be provided?

I see the model files are all PyTorch, so I wanted to ask whether you could also train a TensorFlow version. Thanks!

sanshanxiashi updated 3 years ago
10
bojone/bert4keras #281

Problem training with the ernie model

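When asking a question, please provide the following information where possible: ### Basic information - Your **operating system**: linux - Your **Python** version: python3.7 - Your **Tensorflow** version: 1.14 - Your **Keras** version: 2.2.5 - Your **bert4keras** version: 0.4.3 - Whether you use pure **keras** or **tf.keras**:…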

yzq170320 updated 3 years ago
1

