UKPLab / gpl

Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Apache License 2.0

Model not saved #32

Closed iamyihwa closed 1 year ago

iamyihwa commented 1 year ago

After running the following script, the model does not seem to be saved. Script run: `!python -m gpl.train --path_to_generated_data "generated" --output_dir "output" --new_size -1 --queries_per_passage -1`

Final output:
Iteration: 100% 139995/140000 [2:39:11<00:00, 14.71it/s]
Iteration: 100% 139997/140000 [2:39:11<00:00, 14.73it/s]
Iteration: 100% 140000/140000 [2:39:12<00:00, 14.66it/s]
Epoch: 100% 1/1 [2:39:12<00:00, 9552.27s/it]

Whereas in the Colab example, the model does seem to get saved (link):

[2022-06-27 21:55:11] INFO [sentence_transformers.SentenceTransformer.save:352] Save model to output/fiqa
[2022-06-27 21:55:12] INFO [sentence_transformers.SentenceTransformer.save:352] Save model to output/fiqa/100
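A quick way to check whether a run actually wrote a model is to look for saved-model files under the output directory. The sketch below is only an illustration: it assumes sentence-transformers writes a `modules.json` file into each saved model directory, and `find_saved_models` and `MARKER` are hypothetical names that are not part of gpl.

```python
from pathlib import Path

# Assumption: each directory written by SentenceTransformer.save contains
# a "modules.json" file. Adjust MARKER if your version saves differently.
MARKER = "modules.json"


def find_saved_models(output_dir):
    """Return the directories at or below output_dir that contain MARKER."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    # rglob also matches MARKER directly inside root, not only in subfolders.
    return sorted(p.parent for p in root.rglob(MARKER))


if __name__ == "__main__":
    # "output" mirrors the --output_dir value used in the command above.
    found = find_saved_models("output")
    if found:
        for model_dir in found:
            print(f"saved model found in: {model_dir}")
    else:
        print("no saved model found under 'output'")
```

If nothing is found after training finishes, the save step most likely never ran or wrote somewhere else.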

iamyihwa commented 1 year ago

Running the exact same command fresh seems to work.

For some reason, it seems that the step that generates the files needed for training ('generated') and the step that writes the model output ('output') need to be run separately? Not sure.

Since the issue is solved, will close the issue.
