Knowledge distillation - Githubissues

SISTMrL commented 4 years ago

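Why does the last.pt model take up so much less disk space than the other saved models? When using knowledge distillation, the teacher model is best.pt and the student model is last.pt. In my normal training run with 60 epochs, the mAP of last.pt and best.pt is almost the same, so why are the teacher and student models set up this way for knowledge distillation?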

SpursLipu commented 4 years ago

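Question 1: the last.pt model does not store training-state information such as the epoch or the optimizer. Question 2: knowledge distillation needs models with different network structures; you can try distilling between different structures.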
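To illustrate the answer to question 1, here is a minimal sketch of why a checkpoint without training state is much smaller. It is not the repository's saving code: it uses plain Python lists and `pickle` as stand-ins for tensors and `torch.save`, and the checkpoint keys (`epoch`, `model`, `optimizer`) and the helper `checkpoint_size` are assumptions made for illustration. The one fact it relies on is that Adam-style optimizers keep two extra buffers per parameter.

```python
import pickle

NUM_PARAMS = 10_000  # hypothetical model size, for illustration only


def checkpoint_size(ckpt):
    """Return the serialized size of a checkpoint dict in bytes."""
    return len(pickle.dumps(ckpt))


# Stand-in for the model weights.
weights = [0.01 * i for i in range(NUM_PARAMS)]

# Adam-style optimizers track two running statistics per parameter,
# so the optimizer state is roughly twice the size of the weights.
optimizer_state = {
    "exp_avg": [0.001 * i for i in range(NUM_PARAMS)],
    "exp_avg_sq": [0.0001 * i for i in range(NUM_PARAMS)],
}

# Checkpoint saved mid-training: weights plus everything needed to resume.
full_ckpt = {"epoch": 59, "model": weights, "optimizer": optimizer_state}

# Checkpoint with the training state dropped: weights only.
stripped_ckpt = {"epoch": -1, "model": weights, "optimizer": None}

full_size = checkpoint_size(full_ckpt)
stripped_size = checkpoint_size(stripped_ckpt)
print(f"with optimizer state: {full_size} bytes")
print(f"weights only:         {stripped_size} bytes")
```

A weights-only file like this is enough for inference or for use as a teacher or student starting point, but training cannot be resumed from it with the original optimizer state.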

SISTMrL commented 4 years ago

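Thank you for the patient answer. So, in other words: I first trained yolov3 and obtained best.pt, so when running distillation my t_cfg and t_weights would be yolov3.cfg and best.pt respectively. Then, for example, I would use yolov3-tiny as the student network, with cfg set to yolov3-tiny.cfg and weights set to yolov3-tiny.weights? And then run distillation to reach the goal. I would really appreciate a reply; I am a beginner, so my questions may be rather detailed. Thanks 🙏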

	刘志超


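On 2020-05-31 at 17:49, SpursLipu wrote:

Question 1: the last.pt model does not store training-state information such as the epoch or the optimizer. Question 2: knowledge distillation needs models with different network structures; you can try distilling between different structures.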

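For the teacher/student setup discussed in this thread (a larger network such as yolov3 as teacher, a smaller one such as yolov3-tiny as student), the core idea can be sketched with the generic soft-target distillation loss. This is only an illustration in plain Python on class logits; it is not the detection-specific loss this repository implements, and the function names `softmax` and `distillation_loss` and the temperature value are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=3.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


# Hypothetical logits for one prediction over three classes.
teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.0, 1.5, 1.0]

print(distillation_loss(student_logits, teacher_logits))
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge. In practice it would be added to the ordinary training loss of the student.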

SpursLipu / YOLOv3v4-ModelCompression-MultidatasetTraining-Multibackbone

Knowledge distillation #21