-
I want to use it on Android or iOS, but a 96 MB model file is obviously too big for a mobile app. How can I make the model smaller than 60 MB?
-
## How good or bad are the predicted probabilities?
low probability --> high penalty
-log(1.0) = 0
-log(0.8) = 0.22314
-log(0.6) = 0.51082
y = -log(x)
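The penalty values above can be reproduced with a few lines of Python (a minimal sketch; the helper name `nll` is just for illustration):

```python
import math

def nll(p):
    """Negative log-likelihood penalty for the probability p
    assigned to the true class: low p -> high penalty."""
    return -math.log(p)

for p in (1.0, 0.8, 0.6):
    print(f"-log({p}) = {nll(p):.5f}")
```

Note the penalty grows without bound as p approaches 0, which is what makes confident wrong predictions so costly.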
![image](https://user-images.githubuserc…
-
# Thanks for your valuable comments.
We thank the AC and all reviewers for their valuable time and constructive feedback. We briefly summarize the suggestions and our answers as follows. You can …
-
Hi, I reproduced your paper in PyTorch, but during training the D loss is very large and the loss does not converge; I'm not sure what's going on.
The GAN uses WGAN with a GP loss added, and six D nets with the same structure as in your paper. The SSD is optimized with SGD, and the D nets with Adam.
Training is on a single GPU with batch size = 8.
-
Hi,
I had problems with the digit model on an electrical meter. So I looked deeper into the code and saw that the models are often retrained with the history and the stored model. This causes a lot of overfitt…
-
## Background
- A technique that transfers knowledge from a large model (teacher) to a small model (student), achieving model compression or faster inference while maintaining or even improving performance
- Because KD transfers the large model's performance to the small model, the lightweight student model provides faster inference while minimizing performance degradation
- Object Detecti…
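The idea above can be sketched as a Hinton-style distillation loss: KL divergence between temperature-softened teacher and student distributions, scaled by T². This is a minimal NumPy sketch; the temperature T=4.0 and the function names are illustrative assumptions, not from the original.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax: higher T gives a softer distribution,
    # exposing the teacher's "dark knowledge" about wrong classes.
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on softened probabilities, scaled by T^2
    so gradients keep a comparable magnitude across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)
    return (T ** 2) * kl.mean()
```

In practice this term is combined with the ordinary cross-entropy on hard labels, weighted by a mixing coefficient.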
-
Inspired by "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation", I expanded the original English-only model to support Chinese. I used the script make_multilingual.py, a…
-
Hi,
Your work is amazing! I'm wondering whether you are going to release the code for: 1. Data-free pruning, 2. Data-free Knowledge Transfer, 3. Data-free Continual Learning.
Thanks!
-
Hi wentianli,
I've been testing the knowledge distillation method for a while by playing with Caffe's available layers, and I was able to achieve fairly good results with some simple models. It's bee…