6 References
1. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
2. Ryan Kiros, Yukun Zhu, Russ R Salakhutdinov, Richard Zemel, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. Skip-thought vectors. In Advances in neural information processing systems, pages 3294–3302, 2015.
3. Geoffrey E Hinton, Simon Osindero, and Yee-Whye Teh. A fast learning algorithm for deep belief nets. Neural computation, 18(7):1527–1554, 2006.
4. Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D Cubuk, Alex Kurakin, Han Zhang, and Colin Raffel. FixMatch: Simplifying semi-supervised learning with consistency and confidence. arXiv preprint arXiv:2001.07685, 2020.
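Unsupervised pre-training on a large unlabeled dataset -> supervised fine-tuning on a small labeled set to obtain a Teacher model -> Self-Training/Distillation to guide a small model. This improves the SOTA result by roughly 22 points and clearly shows the potential of semi-supervised training.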
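The last step of this pipeline (the Teacher guiding a smaller model) can be sketched as a soft-label distillation loss. This is a minimal illustration, not the paper's code: it assumes the Teacher's temperature-scaled class probabilities are used as targets for the student on unlabeled examples. The function names and the `temperature` parameter are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Cross-entropy between Teacher soft labels and student predictions,
    averaged over the batch. No ground-truth labels are needed, so this
    can be computed on unlabeled examples (assumed setup)."""
    total = 0.0
    for t_logits, s_logits in zip(teacher_logits, student_logits):
        p_teacher = softmax(t_logits, temperature)
        p_student = softmax(s_logits, temperature)
        total -= sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
    return total / len(teacher_logits)


if __name__ == "__main__":
    teacher = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
    student_close = [[1.8, 0.4, -0.9], [0.0, 1.4, 0.2]]
    student_far = [[-1.0, 0.5, 2.0], [1.5, 0.1, 0.3]]
    # A student that agrees with the Teacher gets the lower loss.
    print(distillation_loss(teacher, student_close))
    print(distillation_loss(teacher, student_far))
```

The loss is smallest when the student's distribution matches the Teacher's, which is what lets a small model inherit the Teacher's behaviour without extra labels.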
1 New things learned:
(In CV) a deeper projection layer may work better and can learn better representations.
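To make "deeper projection layer" concrete, here is a minimal sketch of an MLP projection head whose depth is a parameter. Everything here is an assumption for illustration: the names `make_projection_head` and `project`, the ReLU between layers, the omission of biases and normalization, and the random initialization are not taken from the paper.

```python
import math
import random


def make_projection_head(in_dim, hidden_dim, out_dim, num_layers, seed=0):
    """Build a list of weight matrices for an MLP head with `num_layers`
    linear layers (hypothetical helper; biases omitted for brevity)."""
    rng = random.Random(seed)
    dims = [in_dim] + [hidden_dim] * (num_layers - 1) + [out_dim]
    head = []
    for d_in, d_out in zip(dims, dims[1:]):
        scale = 1.0 / math.sqrt(d_in)
        head.append([[rng.gauss(0.0, scale) for _ in range(d_in)]
                     for _ in range(d_out)])
    return head


def project(head, features):
    """Apply the head to one feature vector, with ReLU between layers
    and no activation after the last layer."""
    x = features
    for i, weights in enumerate(head):
        x = [sum(w * v for w, v in zip(row, x)) for row in weights]
        if i < len(head) - 1:
            x = [max(0.0, v) for v in x]
    return x


if __name__ == "__main__":
    encoder_output = [0.5] * 8  # stand-in for an encoder's feature vector
    shallow = make_projection_head(8, 16, 4, num_layers=1)
    deep = make_projection_head(8, 16, 4, num_layers=3)
    print(len(shallow), len(project(shallow, encoder_output)))
    print(len(deep), len(project(deep, encoder_output)))
```

Only `num_layers` changes between the shallow and deep heads; the input and output dimensions stay the same, so the two can be swapped in the same pre-training setup for comparison.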
2 What I learned from Related Work
Task-agnostic use of unlabeled data: the most typical examples are applications in NLP [1,2] and contrastive learning [3].
Task-specific use of unlabeled data: there are three main kinds: Self-Training, pseudo-labeling [4], and label consistency regularization [4].
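Pseudo-labeling and consistency regularization can be combined in one loss, in the spirit of [4]. The sketch below is an illustration under assumptions rather than a reference implementation: predictions on a weakly augmented view produce a pseudo-label only when the model is confident, and the prediction on a strongly augmented view of the same example is trained toward that label. The function names and the `threshold` default are illustrative choices.

```python
import math


def pseudo_label(probs, threshold=0.95):
    """Return the most likely class index if its probability reaches the
    threshold, otherwise None (the example is skipped)."""
    confidence = max(probs)
    if confidence < threshold:
        return None
    return probs.index(confidence)


def consistency_loss(weak_probs, strong_probs, threshold=0.95):
    """Cross-entropy of strong-view predictions against pseudo-labels taken
    from weak-view predictions. Unconfident examples contribute zero, and
    the sum is averaged over the whole unlabeled batch."""
    total = 0.0
    for weak, strong in zip(weak_probs, strong_probs):
        label = pseudo_label(weak, threshold)
        if label is not None:
            # Clamp to avoid log(0) if the strong view assigns zero mass.
            total -= math.log(max(strong[label], 1e-12))
    return total / len(weak_probs)


if __name__ == "__main__":
    weak = [[0.97, 0.02, 0.01], [0.50, 0.30, 0.20]]
    strong = [[0.60, 0.30, 0.10], [0.10, 0.80, 0.10]]
    # Only the first example is confident enough to yield a pseudo-label.
    print([pseudo_label(p) for p in weak])
    print(consistency_loss(weak, strong))
```

Raising `threshold` trades coverage for label quality: fewer unlabeled examples are used, but the pseudo-labels that remain are more likely to be correct.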
3 Experimental tasks (describe briefly if unfamiliar)
4 Other tasks that could be tried, as far as you know
5 Good sentences
Learning from just a few labeled examples while making best use of a large amount of unlabeled data is a long-standing problem in machine learning. Aside from the representation learning paradigm, there is a large and diverse set of approaches for semi-supervised learning, we refer readers to *** for surveys of classical approaches.