-
Hi,
Can I use "knowledge distillation" and "dimension reduction" for BERT-large?
And if it is possible, for knowledge distillation, how many layers should be retained in option 2?
And for dimension …
-
Could you help add the following paper to the list?
Paper (Oral): Boosting 3D Object Detection by Simulating Multimodality on Point Clouds
Paper Link: https://arxiv.org/abs/2206.14971
Thanks!
-
![image](https://github.com/user-attachments/assets/dcac863e-0062-4f2d-86f0-52415810dbcc)
## Summary
By making good use of a transformer architecture trained with the DINO method, multi-class anomaly detection can be performed very simply. 1) Noi…
-
I suggest that both training loss functions, with and without KD, should apply a softmax, because the models output raw logits without softmax. Just like this:
https://github.com/peterliht/knowledge-d…
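For clarity, here is a minimal sketch of what I mean, assuming a standard Hinton-style KD loss and raw logits coming out of both models (the function name and hyperparameter values are only illustrative):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soften both logit sets with a temperature, then apply log_softmax / softmax
    # before the KL term, since the models themselves emit raw logits.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # cross_entropy already applies softmax internally, so the hard-label term
    # needs no extra softmax on top of the logits.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# tiny usage example with random logits
student = torch.randn(8, 10)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(kd_loss(student, teacher, labels))
```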
-
Can you provide details on how the model is fine-tuned for 1000 epochs with DeiT-style knowledge distillation? Thanks!
-
Hello, @545999961.
I was fine-tuning bge-m3 and found a bug when not using the `knowledge_distilation` parameter.
This was my training script:
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --np…
-
Hi, I have a question: what value of T (i.e., `self.tau`) did you choose, and how should I set this value when training my own project?
```
T = self.tau
# taken from https://git…
```
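For reference, a small self-contained illustration of what the temperature does, assuming the usual temperature-scaled softmax. Values around 2–4 are common starting points, but the best T is usually tuned per task:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([2.0, 1.0, 0.1])
for T in (1.0, 2.0, 4.0):
    # A larger T flattens the distribution, so the smaller logits
    # contribute more to the soft targets the student learns from.
    print(T, F.softmax(logits / T, dim=0))
```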
-
If so, in the full-stage knowledge distillation, where the image encoder is randomly initialized, is the mask decoder finetuned at a smaller learning rate than the lightweight image encoder? Is this consis…
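To frame the question, this is how I would imagine the two learning rates being expressed as PyTorch parameter groups; the module names and layer sizes below are stand-ins, not the actual model structure or the authors' configuration:

```python
import torch
import torch.nn as nn

# Stand-in modules only; the real image encoder / mask decoder are larger networks.
class Distilled(nn.Module):
    def __init__(self):
        super().__init__()
        self.image_encoder = nn.Linear(16, 16)  # randomly initialized lightweight encoder
        self.mask_decoder = nn.Linear(16, 16)   # module kept at a smaller learning rate

model = Distilled()
optimizer = torch.optim.AdamW([
    {"params": model.image_encoder.parameters(), "lr": 1e-4},
    {"params": model.mask_decoder.parameters(), "lr": 1e-5},
])
```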
-
## Title: Towards Low-Latency Event-Based Visual Recognition with Hybrid Step-Wise Distillation Spiking Neural Networks
## Link: https://arxiv.org/abs/2409.12507
## Abstract:
Spiking neural networks (SNNs) have attracted considerable attention for their low power consumption and high biological interpretability. Thanks to their rich spatiotemporal information-processing capability and event-driven nature, neuro…
-
1. This is a very good project. Does the author have any plans to support the GTCRN echo cancellation model?