google-research / mixmatch

Apache License 2.0

What are the most important things to reproduce the result on my own dataset? #27

Closed: guotong1988 closed this issue 4 years ago

guotong1988 commented 4 years ago

UDA (https://github.com/google-research/uda) can achieve good accuracy on text classification with only 20 training examples, but I am finding it hard to reproduce that result on my own dataset.

So I would like to understand why UDA or MixMatch works, and what matters most for reproducing the result.

@david-berthelot Thank you very much.

carlini commented 4 years ago

We didn't replicate any of the UDA experiments on text classification. For help getting that method to work, it might be useful to reach out to the authors of that paper (https://github.com/google-research/uda).

Reproducing our results should be fairly straightforward. If you clone the repository and follow the README, you will find the code we used to generate the tables in the paper in mixmatch/runs/.

As for why it works: I don't know that I have anything to add on top of what's in the paper.

guotong1988 commented 4 years ago

Thank you. I think that to make unlabeled data work on my own text classification dataset, I will need to run more experiments myself, based on your paper.