Powerful unsupervised domain adaptation method for dense retrieval. It requires only an unlabeled corpus and yields substantial improvements: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
I tried to run the toy example on Azure, and I believe I made it all the way through training on the generated data. My logs cut off abruptly, so I'm not sure of the full error, but I'm wondering if this is the culprit:
WARNING [root._load_auto_model:789] No sentence-transformers model found with name /root/.cache/torch/sentence_transformers/distilbert-base-uncased. Creating a new one with MEAN pooling.
Azure ML can only write to an Outputs folder, so I'm wondering if that's the issue. I'm guessing this warning comes from the BEIR data loader, though I couldn't find the actual code that emits it.
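For what it's worth, as I understand it that warning usually just means sentence-transformers found a plain Hugging Face checkpoint (no saved `modules.json`) at that cache path, so it wrapped the transformer with a default mean-pooling head rather than failing. The MEAN pooling it refers to is just a masked average of token embeddings over the sequence; a minimal NumPy sketch (illustrative only, not the library's actual implementation):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Masked mean over the sequence axis.

    token_embeddings: (batch, seq_len, dim) array of per-token vectors
    attention_mask:   (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(float)   # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)   # sum only real tokens
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # avoid divide-by-zero
    return summed / counts                           # (batch, dim)
```

So the warning itself is typically benign; the abrupt log cutoff may be a separate problem.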
Training code:
Logs: