I noticed a difference between your implementation and the original MXNet implementation: yours uses a validation set, while the original does not.
Since the few-shot SDA problem provides only a few labeled samples in the target training set, a validation set would seem to be unavailable in a real scenario. I think the original implementation is more reasonable.