-
Commonly, the original KD loss normalizes the student and teacher logits into class probabilities before computing the KL divergence, e.g.
`ori_kd = F.kl_div(F.log_softmax(logit_s), F.softmax(logit_…
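For reference, a minimal runnable sketch of that loss with the usual temperature scaling (the temperature `T`, the `T * T` rescaling, and the helper name `kd_loss` are assumptions following the standard Hinton-style formulation, not taken from the snippet above):

```python
import torch
import torch.nn.functional as F

def kd_loss(logit_s: torch.Tensor, logit_t: torch.Tensor, T: float = 4.0) -> torch.Tensor:
    # Soften both distributions with temperature T; kl_div expects
    # log-probabilities as its first argument and probabilities as its second.
    log_p_s = F.log_softmax(logit_s / T, dim=1)
    p_t = F.softmax(logit_t / T, dim=1)
    # reduction='batchmean' matches the mathematical definition of KL divergence;
    # the T * T factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_s, p_t, reduction='batchmean') * T * T
```

Note that calling `F.log_softmax`/`F.softmax` without `dim`, as in the snippet above, emits a deprecation warning in recent PyTorch, and `F.kl_div`'s default `reduction='mean'` averages over all elements rather than over the batch.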
-
-
Hi,
Thanks so much for releasing your models and data. However, after running the following command,
I could only get a BLEU-4 score of 9.75 on WMT14 En-De.
python generate_cmlm.py data-bin/wmt14.en-d…
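One common cause of a gap this large is a tokenization mismatch between hypotheses and references at scoring time; a minimal sanity check with sacrebleu (the file names `hyp.detok.txt` and `ref.detok.txt` are placeholders) might look like:

```python
import sacrebleu

# Placeholder file names: detokenized system output and reference,
# one sentence per line, aligned by line number.
with open("hyp.detok.txt") as f:
    hyps = [line.rstrip("\n") for line in f]
with open("ref.detok.txt") as f:
    refs = [line.rstrip("\n") for line in f]

# corpus_bleu takes the hypothesis list and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hyps, [refs])
print(bleu.score)
```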
-
Not really an issue; I just want to share my training code, since some people still have difficulty writing it themselves. Just modify the code to suit your use case.
Feel free to ask or poi…
-
## Keyword: efficient
### End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs
- **Authors:** Javier Campos, Zhen Dong, Javier Duarte, Amir Gholami, Michael W. Mahoney,…
-
### News: Ah, last week was so tough.....
- [Content we couldn't cover last week (sorry)](https://github.com/jungwoo-ha/WeeklyArxivTalk/issues/75)
- Conferences
  - ICML 2023 and ACL 2023 reviews are out --> good luck, everyone!
- ICCV 2023 Supplementa…
-
It says that the 25,000 steps for pretraining and the 6,000 steps for finetuning are for warm-up only. Can I know the number of training epochs for pretraining and finetuning, including warm-up?
I have taken part of…
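A rough steps-to-epochs conversion depends on the effective batch size and the corpus size, neither of which is stated here; as a sketch with placeholder numbers:

```python
# Every value below is a hypothetical placeholder, not from the paper.
total_steps = 100_000            # total optimizer steps, warm-up included
effective_batch_size = 256       # batch size x gradient accumulation
num_train_examples = 1_000_000   # size of the training corpus

epochs = total_steps * effective_batch_size / num_train_examples
print(f"~{epochs:.1f} epochs")   # ~25.6 epochs with these placeholders
```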
-
_Written & Organized by 悦子yuezi
Issue Date: 2024/09/06_
#### 1/16 Post-Impressionism
**Impressionism**
developed in France in the 19th century and is based on the practice of pain…
-
- [ ] [I'm the author of the GPT-2 work. This is a nice post, thanks for making it more... | Hacker News](https://news.ycombinator.com/item?id=39436215)
# TITLE
I'm the author of the GPT-2 work. Thi…
-
# ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
2023 Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA)
“oxymoron” Despite being fun to interact …