Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
This is more of a question than an issue.
I just ran a training with gpl_steps set to 50000. Five checkpoint folders were created in the output directory (10000, 20000, 30000, 40000, 50000).
Does the pytorch_model.bin at the root of the output folder contain all the knowledge obtained during training, so I can use just that file?
And do we have to keep the intermediate checkpoint folders?
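For context, my understanding is that a pytorch_model.bin is just a serialized state dict holding every trained weight, so the root-level file should be self-contained. A minimal sketch of that assumption (the tiny model and file path here are hypothetical, just for illustration):

```python
import torch

# A pytorch_model.bin is a pickled state dict of all model weights.
model = torch.nn.Linear(4, 2)
torch.save(model.state_dict(), "pytorch_model.bin")  # what training writes out

# Loading it back restores every parameter -- no intermediate
# checkpoint folders are needed just to run inference.
fresh = torch.nn.Linear(4, 2)
fresh.load_state_dict(torch.load("pytorch_model.bin"))

assert all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), fresh.state_dict().values())
)
```

If that holds for the GPL output as well, the intermediate step folders would only be useful for resuming training or comparing checkpoints, not for serving the final model.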