Closed GWwangshuo closed 4 years ago
Hello~!
Please look over the section 4.1. Image Classification, especially Dataset paragraph: "As studied, selecting K-most uncertain samples from such a large pool often does not work well, ....blah blah... We adopt this simple yet efficient scheme and set the subset size to M=10000"
That makes sense. It turns out selecting the K-most uncertain samples gives worse performance. Thanks for clarifying. Moreover, in my experiment, the phenomenon that sampling by the ground-truth loss performs worse than sampling by the learned loss also seems related to this.
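For reference, the subset scheme discussed above (shuffle the unlabeled pool, keep a random subset of size M, then pick the K most uncertain points from that subset only) can be sketched roughly as below. This is a minimal illustration, not the repo's actual code; `scores_fn`, the argument names, and the default sizes are all hypothetical:

```python
import numpy as np

def select_for_labeling(scores_fn, unlabeled_indices,
                        subset_size=10000, k=1000, rng=None):
    """Pick K points to label: first draw a random subset of size M
    from the unlabeled pool, then take the K most uncertain points
    from that subset (rather than ranking the entire pool)."""
    rng = rng or np.random.default_rng(0)
    # 1) shuffle the pool and keep a random subset of size M
    subset = rng.permutation(unlabeled_indices)[:subset_size]
    # 2) score only the subset (e.g. predicted loss / uncertainty);
    #    scores_fn is a hypothetical stand-in for the model's scoring pass
    scores = scores_fn(subset)
    # 3) return the K highest-scoring (most uncertain) points
    return subset[np.argsort(scores)[-k:]]
```

Ranking only a random subset both cuts the scoring cost and, per the paper's Section 4.1, tends to avoid the degenerate selections that K-most-uncertain over the whole pool can produce.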
Thanks for your implementation. I tried to run your code and noticed that in
`main.py`
you first shuffle the unlabeled set and select 10000 unlabeled data points rather than using the whole unlabeled pool. My understanding is that the sampling should happen over the
whole unlabeled pool
rather than only a part of it. Am I correct? Do you do this just for faster training, or is there something behind this operation? Have you tried selecting samples from the whole pool? Does that give similar performance? Thanks.