-
Hello, I'm Hady, an ECE student at Cairo University's School of Engineering. I've been working on a distilled version of a text summarization model called PEGASUS. I found your L3-AI talk on YouTube and …
-
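## Introduction
This is similar to mutual learning. The difference is that mutual learning is many-to-many: the models learn from each other directly. Here, an ensemble model is first built from the many models, and that ensemble is then used to teach the many. The teaching step uses distillation that adapts to whether the teacher is good enough, which is also a very common technique.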
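As a rough illustration of the idea above, here is a minimal sketch in plain Python. The note does not give the paper's actual weighting rule, so the gate used here (trusting the ensemble in proportion to the probability it assigns to the true label) is an assumption, and the names `ensemble_probs` and `adaptive_distill_loss` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def ensemble_probs(all_student_logits, temperature=1.0):
    """Build the ensemble teacher by averaging the students' probabilities."""
    probs = [softmax(logits, temperature) for logits in all_student_logits]
    n = len(probs)
    return [sum(column) / n for column in zip(*probs)]


def adaptive_distill_loss(student_logits, teacher_probs, label, temperature=1.0):
    """Cross-entropy on the label plus a gated distillation term.

    ASSUMPTION: the gate is the teacher's probability for the true label,
    so a poor teacher contributes less. The paper's rule may differ.
    """
    ce = -math.log(softmax(student_logits)[label] + 1e-12)
    kd = kl_divergence(teacher_probs, softmax(student_logits, temperature))
    trust = teacher_probs[label]
    return ce + trust * kd


# Three students form the ensemble, which then teaches each student.
students = [[2.0, 0.5, 0.1], [1.5, 1.0, 0.2], [0.3, 2.2, 0.4]]
teacher = ensemble_probs(students)
losses = [adaptive_distill_loss(s, teacher, label=0) for s in students]
```

Each student gets its own loss against the shared ensemble, which is the "many build one, one teaches many" structure, as opposed to the pairwise terms of mutual learning.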
## Paper information
* Author: Baidu
* [Paper](…
-
Hi,
I want to try FastSpeech on a different dataset. Could you share how to extract the alignments from Tacotron2?
I tried this code, but I get bad synthesis results when running inference on long sent…
-
### Paper
FedHe: Heterogeneous Models and Communication-Efficient Federated Learning
### Link
https://arxiv.org/abs/2110.09910
### Maybe give motivations about why the paper should be implemented …
-
Hi, I want to train a NAT model for zh-en (about 260k). I get about 30 BLEU with the teacher model, but the student model always overfits.
Here are the scripts I used:
zh-en preprocessing:
`fairse…
-
Thank you for your training code and dataset. I have been using both, and it took a few days to train the model for 30 epochs. However, the train loss and val …
-
https://virtual2023.aclweb.org/paper_P5706.html
-
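Hello, I rewrote the model in Keras and trained it, using insightface's resnet100 model as the teacher to extract features, with a softmax cross-entropy loss and an L2 loss on the embeddings. The cross-entropy loss is around 12 and the L2 loss is 0.0038, so I multiplied the L2 loss by 2000 and trained for 10 epochs, but the L2 loss decreases very slowly and only got down to 0.0028.
When you trained it, how many …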
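To make the loss balance in this question concrete, here is a small arithmetic sketch using only the numbers reported above. It assumes the total loss is the plain sum `ce + weight * l2`, and that the cross-entropy stays near 12 at both points. Neither is confirmed by the post, and `weighted_total` is a hypothetical helper.

```python
def weighted_total(ce_loss, l2_loss, l2_weight):
    """Return (total loss, share of the total from the weighted L2 term).

    ASSUMPTION: total = ce_loss + l2_weight * l2_loss.
    """
    weighted_l2 = l2_weight * l2_loss
    total = ce_loss + weighted_l2
    return total, weighted_l2 / total


# Values reported in the post; holding CE at 12 for both is an assumption.
start_total, start_share = weighted_total(12.0, 0.0038, 2000)
end_total, end_share = weighted_total(12.0, 0.0028, 2000)

print(f"start: total={start_total:.2f}, L2 share={start_share:.1%}")
print(f"after 10 epochs: total={end_total:.2f}, L2 share={end_share:.1%}")
```

Under these assumptions, the weighted L2 term (about 7.6, falling to about 5.6) stays smaller than the cross-entropy term throughout.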
-
Hi,
I am trying to follow your instructions to reproduce the paper's results on the NYU dataset, but the mIoU and RMSE still do not match. They stabilize after 300 iterations and stop at …
-
This may be a bit off-topic, I'm not sure. I am trying to do face recognition on an ESP-CAM with 4MB flash.
At the moment, the weights file for this model is **8MB**, so I am not able to pu…