-
Hello,
In your opinion, what is the best way to distill a large vision transformer (e.g. ViT-g) into a small one (e.g. ViT-B)?
There seem to be many alternatives: MIM as in EVA, distillation toke…
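For concreteness, the simplest alternative most of these methods are compared against is direct feature distillation. Below is a minimal sketch, assuming timm-style models with a `forward_features` method, matching patch grids, and a learned linear `proj` to bridge the width gap; none of this is taken from the EVA or DeiT code:

```python
import torch
import torch.nn.functional as F

def feature_distill_loss(teacher, student, proj, images):
    # Frozen large teacher (e.g. ViT-g); gradients flow only to the student.
    with torch.no_grad():
        t_tokens = teacher.forward_features(images)    # (B, N, D_teacher)
    # Small student (e.g. ViT-B) plus a linear layer mapping its width
    # D_student -> D_teacher so the two token sequences are comparable.
    s_tokens = proj(student.forward_features(images))  # (B, N, D_teacher)
    # Per-token cosine regression; MIM-style variants (as in EVA) would
    # additionally mask input patches and regress only the masked positions.
    return 1.0 - F.cosine_similarity(s_tokens, t_tokens, dim=-1).mean()
```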
-
Dear authors,
Thank you for your excellent work. However, I am having trouble reproducing your experimental results for the baseline KD [21]. The result I get for KD on Cora is 77.63% rather than the reported 83.2%. …
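For reference, the KD baseline in question is usually the standard Hinton-style objective; a minimal sketch follows, with illustrative hyperparameters rather than the paper's (the reproduced score can be quite sensitive to them):

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Cross-entropy on the labels plus temperature-softened KL to the
    # teacher, scaled by T^2 as usual. T and alpha are assumptions here.
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1 - alpha) * kl
```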
-
The performance of LSP in lab_knowledge_distillation is much lower than the results reported in the paper.
This decrease is actually caused by GATConv in DGL.
In the source code:
![image](https://us…
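To make the comparison concrete, here is a small, self-contained way to pin down DGL's GATConv behavior on a toy graph before attributing the drop to it; the graph and feature sizes are arbitrary:

```python
import dgl
import torch
from dgl.nn import GATConv

# Toy directed 3-cycle; GATConv raises on zero-in-degree nodes by
# default, so add self-loops first (a common source of discrepancies).
g = dgl.graph(([0, 1, 2], [1, 2, 0]))
g = dgl.add_self_loop(g)

feat = torch.randn(3, 8)                       # 3 nodes, 8 input features
conv = GATConv(in_feats=8, out_feats=4, num_heads=2)
out = conv(g, feat)                            # (nodes, heads, out_feats)
print(out.shape)                               # torch.Size([3, 2, 4])
```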
-
I have used BERT's NextSentencePredictor to find similar sentences or similar news items. However, it is super slow, even on a Tesla V100, which is currently among the fastest GPUs. It takes around 10 seconds for a query tit…
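A common workaround is to switch from a cross-encoder to a bi-encoder: instead of one NSP forward pass per (query, candidate) pair, embed every text once and score a query with a single matrix multiply. A sketch under assumptions not in the post (bert-base-uncased, mean pooling):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval().to(device)

@torch.no_grad()
def embed(texts):
    batch = tok(texts, padding=True, truncation=True,
                return_tensors="pt").to(device)
    hidden = model(**batch).last_hidden_state    # (B, L, H)
    mask = batch["attention_mask"].unsqueeze(-1) # (B, L, 1)
    emb = (hidden * mask).sum(1) / mask.sum(1)   # mean pooling over tokens
    return F.normalize(emb, dim=-1)

news_titles = ["title one", "title two"]         # placeholder corpus
corpus_emb = embed(news_titles)                  # precompute once
scores = embed(["query title"]) @ corpus_emb.T   # cosine scores per query
```

Dedicated sentence-embedding models (e.g. from sentence-transformers) typically rank better than raw BERT pooling, but the speedup argument is the same either way.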
-
### Route
```routes
/acs/journal/:id
```
### Full route
```fullroutes
acs/journal/accacs
acs/journal/jacsat
```
### Related documentation
https://docs.rsshub.app/zh/routes/journal#american-chemistry…
-
The original paper mentions: "Specifically, let T denote a set of teacher layers that we use to distill knowledge to the student model."
However, the code in the trainer only provides `[2, 5, 8, 11]`, …
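For what it's worth, the usual reading of that sentence is a fixed student-to-teacher layer map. A hedged sketch, assuming one student layer per entry in T and equal hidden widths (both assumptions, since the paper only defines T as a set of teacher layers):

```python
import torch.nn.functional as F

TEACHER_LAYERS = [2, 5, 8, 11]  # the hard-coded set T from the trainer

def layerwise_distill_loss(student_hiddens, teacher_hiddens,
                           layer_map=TEACHER_LAYERS):
    # Student layer i mimics teacher layer layer_map[i], i.e. a 4-layer
    # student against a 12-layer teacher. With differing hidden widths a
    # learned projection would be needed before the MSE.
    losses = [F.mse_loss(student_hiddens[i], teacher_hiddens[t])
              for i, t in enumerate(layer_map)]
    return sum(losses) / len(losses)
```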
-
Hi everyone, we are very happy to announce that the 7th Baidu PaddlePaddle Paper Reproduction Challenge has started. This edition offers 100+ classic and cutting-edge papers for you to reproduce. The PaddlePaddle Special Model Challenge is also ongoing; for details, see the [AI Studio link](https://aistudio.baidu.com/aistudio/competition/detail/406/0/introduction). Are you as eager to start as we are?
To help every…
-
Hi, compared to the newest results of yolor, yolov7 is still a little lower on mAP.
Tested at img-size=1280, yolor-d6 mAP is 58.2%, while yolov7-e6e mAP is 56.8%.
Are there some strong training tr…
-
https://arxiv.org/abs/1910.01108
-
In the follow-up research from PaLM, Flan-PaLM switched to the encoder-decoder T5 architecture. How would it be possible to also add an encoder to this implementation?
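Structurally, "adding an encoder" means giving each decoder block a cross-attention over encoder outputs, as in T5. A minimal PyTorch sketch with illustrative sizes, not tied to this repo's implementation:

```python
import torch
import torch.nn as nn

d_model, n_heads, n_layers = 512, 8, 6   # illustrative sizes
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
    num_layers=n_layers)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True),
    num_layers=n_layers)

src = torch.randn(2, 16, d_model)        # encoder-side token embeddings
tgt = torch.randn(2, 10, d_model)        # decoder-side token embeddings
memory = encoder(src)                    # bidirectional encoding of the source
# Causal mask so each target position attends only to earlier positions.
causal = torch.triu(torch.full((10, 10), float("-inf")), diagonal=1)
out = decoder(tgt, memory, tgt_mask=causal)   # (2, 10, d_model)
```

Note that retrofitting this onto a pretrained decoder-only checkpoint leaves the new encoder and cross-attention weights uninitialized, so some further training would be unavoidable.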