UKPLab / gpl

Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Apache License 2.0

Support for Azure? #16

Open junefeld opened 2 years ago

junefeld commented 2 years ago

I tried to run the toy example on Azure, and I believe I made it all the way through to training on the generated data. My logs cut off abruptly, so I'm not sure what the full error is, but I'm wondering if this is the culprit:

WARNING [root._load_auto_model:789] No sentence-transformers model found with name /root/.cache/torch/sentence_transformers/distilbert-base-uncased. Creating a new one with MEAN pooling.

Azure ML can only write to an outputs folder, so I'm wondering if that's the issue. I'm guessing this happens in the BEIR data loader, though I couldn't find the actual code behind this warning.
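If the restricted filesystem is the cause, one thing to try is pointing the model caches at the writable outputs folder before importing gpl. This is a minimal sketch, not a confirmed fix: it assumes the installed sentence-transformers and Hugging Face libraries honor the `SENTENCE_TRANSFORMERS_HOME` and `HF_HOME` environment variables, and the `outputs/cache` location and the `redirect_model_caches` helper are hypothetical names chosen for illustration.

```python
import os


def redirect_model_caches(base_dir="outputs/cache"):
    """Point model caches at a writable folder (hypothetical helper).

    Assumption: the libraries read these variables when they are imported,
    so this has to run before `import gpl`.
    """
    cache_dirs = {
        "HF_HOME": os.path.abspath(os.path.join(base_dir, "huggingface")),
        "SENTENCE_TRANSFORMERS_HOME": os.path.abspath(
            os.path.join(base_dir, "sentence_transformers")
        ),
    }
    for var, path in cache_dirs.items():
        os.makedirs(path, exist_ok=True)  # create the folder if it is missing
        os.environ[var] = path            # visible to libraries imported later
    return cache_dirs


redirect_model_caches()
# import gpl  # import only after the environment variables are set
```

Whether the warning above is related to write permissions at all is not established here; the sketch only rules out the default cache location as a factor.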

Training code:

import sys, os, joblib
import gpl

# Save the result to the outputs folder
os.makedirs("outputs", exist_ok=True)

dataset = 'fiqa'

gpl.train(
    path_to_generated_data = "generated/" + dataset,
    base_ckpt = "distilbert-base-uncased",
    gpl_score_function = "dot", 
    batch_size_gpl = 4, 
    gpl_steps = 100, 
    new_size = 10, 
    queries_per_passage = 1,
    output_dir = "outputs/" + dataset,
    evaluation_data = "./" + dataset, 
    evaluation_output = "evaluation/" + dataset,
    generator = "BeIR/query-gen-msmarco-t5-base-v1",
    retrievers = ["msmarco-distilbert-base-v3", "msmarco-MiniLM-L-6-v3"], 
    retriever_score_functions = ["cos_sim", "cos_sim"], 
    cross_encoder = "cross-encoder/ms-marco-MiniLM-L-6-v2", 
    mnrl_output_dir = None,
    mnrl_evaluation_output = None,
    qgen_prefix = "qgen",
)
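Since the streamed logs cut off before any error appears, it may help to write the traceback to a file in the outputs folder. The following is a sketch under stated assumptions: `run_and_log` and the `outputs/train_error.log` path are hypothetical, and the wrapper only records Python exceptions.

```python
import os
import traceback


def run_and_log(fn, log_path="outputs/train_error.log", **kwargs):
    """Call fn(**kwargs); on failure, append the traceback to log_path and re-raise."""
    os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
    try:
        return fn(**kwargs)
    except Exception:
        # Persist the full traceback so it survives a truncated log stream.
        with open(log_path, "a") as f:
            f.write(traceback.format_exc())
        raise


# Hypothetical usage: pass gpl.train and the same keyword arguments as above.
# run_and_log(gpl.train, path_to_generated_data="generated/" + dataset)
```

If the process is killed from outside (for example by the system running out of memory), no Python exception is raised and nothing is written, so an empty log file would itself be a hint.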

Logs:


/azureml-envs/azureml_ec637423e82cc698715575ac22b521b8/lib/python3.6/site-packages/paramiko/transport.py:33: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
  from cryptography.hazmat.backends import default_backend
2022-06-29 19:52:08.489294: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /azureml-envs/azureml_ec637423e82cc698715575ac22b521b8/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
2022-06-29 19:52:08.489374: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-06-29 19:52:11 - Loading faiss with AVX2 support.
2022-06-29 19:52:11 - Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'",)
2022-06-29 19:52:11 - Loading faiss.
2022-06-29 19:52:11 - Successfully loaded faiss.
[2022-06-29 19:52:12] INFO [gpl.train.train:125] No generated queries found. Now generating it
[2022-06-29 19:52:12] INFO [beir.datasets.data_loader.load_corpus:89] Loading Corpus...

  0%|          | 0/10 [00:00<?, ?it/s]
100%|██████████| 10/10 [00:00<00:00, 41486.69it/s]
[2022-06-29 19:52:12] INFO [beir.datasets.data_loader.load_corpus:91] Loaded 10 Documents.
[2022-06-29 19:52:12] INFO [beir.datasets.data_loader.load_corpus:92] Doc Example: {'text': "I'm not saying I don't like the idea of on-the-job training too, but you can't expect the company to do that. Training workers is not their job - they're building software. Perhaps educational systems in the U.S. (or their students) should worry a little about getting marketable skills in exchange for their massive investment in education, rather than getting out with thousands in student debt and then complaining that they aren't qualified to do anything.", 'title': ''}

Downloading:   0%|          | 0.00/1.81k [00:00<?, ?B/s]
Downloading: 100%|██████████| 1.81k/1.81k [00:00<00:00, 1.58MB/s]

Downloading:   0%|          | 0.00/1.35k [00:00<?, ?B/s]
Downloading: 100%|██████████| 1.35k/1.35k [00:00<00:00, 1.12MB/s]

Downloading:   0%|          | 0.00/773k [00:00<?, ?B/s]
Downloading: 100%|██████████| 773k/773k [00:00<00:00, 11.1MB/s]

Downloading:   0%|          | 0.00/1.74k [00:00<?, ?B/s]
Downloading: 100%|██████████| 1.74k/1.74k [00:00<00:00, 1.59MB/s]

Downloading:   0%|          | 0.00/850M [00:00<?, ?B/s]
Downloading: 100%|██████████| 850M/850M [00:21<00:00, 41.5MB/s]
[2022-06-29 19:52:46] INFO [beir.generation.models.auto_model.__init__:16] Use pytorch device: cpu
[2022-06-29 19:52:46] INFO [beir.generation.generate.generate:40] Starting to Generate 1 Questions Per Passage using top-p (nucleus) sampling...
[2022-06-29 19:52:46] INFO [beir.generation.generate.generate:41] Params: top_p = 0.95
[2022-06-29 19:52:46] INFO [beir.generation.generate.generate:42] Params: top_k = 25
[2022-06-29 19:52:46] INFO [beir.generation.generate.generate:43] Params: max_length = 64
[2022-06-29 19:52:46] INFO [beir.generation.generate.generate:44] Params: ques_per_passage = 1
[2022-06-29 19:52:46] INFO [beir.generation.generate.generate:45] Params: batch size = 32

pas:   0%|          | 0/1 [00:00<?, ?it/s]
pas: 100%|██████████| 1/1 [00:18<00:00, 18.03s/it]
[2022-06-29 19:53:04] INFO [beir.generation.generate.generate:82] Saving 10 Generated Queries...
[2022-06-29 19:53:04] INFO [beir.generation.generate.save:23] Saving Generated Queries to generated/fiqa/qgen-queries.jsonl
[2022-06-29 19:53:04] INFO [beir.generation.generate.save:26] Saving Generated Qrels to generated/fiqa/qgen-qrels/train.tsv
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:67] Loading Corpus...

  0%|          | 0/10 [00:00<?, ?it/s]
100%|██████████| 10/10 [00:00<00:00, 11963.22it/s]
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:69] Loaded 10 TRAIN Documents.
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:70] Doc Example: {'text': "I'm not saying I don't like the idea of on-the-job training too, but you can't expect the company to do that. Training workers is not their job - they're building software. Perhaps educational systems in the U.S. (or their students) should worry a little about getting marketable skills in exchange for their massive investment in education, rather than getting out with thousands in student debt and then complaining that they aren't qualified to do anything.", 'title': ''}
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:73] Loading Queries...
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:79] Loaded 10 TRAIN Queries.
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:80] Query Example: can you train yourself
[2022-06-29 19:53:04] INFO [gpl.train.train:136] No hard-negative data found. Now mining it
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:67] Loading Corpus...

  0%|          | 0/10 [00:00<?, ?it/s]
100%|██████████| 10/10 [00:00<00:00, 76959.71it/s]
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:69] Loaded 10 TRAIN Documents.
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:70] Doc Example: {'text': "I'm not saying I don't like the idea of on-the-job training too, but you can't expect the company to do that. Training workers is not their job - they're building software. Perhaps educational systems in the U.S. (or their students) should worry a little about getting marketable skills in exchange for their massive investment in education, rather than getting out with thousands in student debt and then complaining that they aren't qualified to do anything.", 'title': ''}
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:73] Loading Queries...
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:79] Loaded 10 TRAIN Queries.
[2022-06-29 19:53:04] INFO [beir.datasets.data_loader.load:80] Query Example: can you train yourself
[2022-06-29 19:53:04] WARNING [gpl.toolkit.mine.__init__:42] `negatives_per_query` > corpus size. Please use a smaller `negatives_per_query`
[2022-06-29 19:53:04] INFO [gpl.toolkit.mine._mine_sbert:49] Mining with msmarco-distilbert-base-v3
[2022-06-29 19:53:04] INFO [sentence_transformers.SentenceTransformer.__init__:60] Load pretrained SentenceTransformer: msmarco-distilbert-base-v3

Downloading:   0%|          | 0.00/690 [00:00<?, ?B/s]
Downloading: 100%|██████████| 690/690 [00:00<00:00, 305kB/s]

Downloading:   0%|          | 0.00/190 [00:00<?, ?B/s]
Downloading: 100%|██████████| 190/190 [00:00<00:00, 176kB/s]

Downloading:   0%|          | 0.00/3.71k [00:00<?, ?B/s]
Downloading: 100%|██████████| 3.71k/3.71k [00:00<00:00, 2.23MB/s]

Downloading:   0%|          | 0.00/545 [00:00<?, ?B/s]
Downloading: 100%|██████████| 545/545 [00:00<00:00, 352kB/s]

Downloading:   0%|          | 0.00/122 [00:00<?, ?B/s]
Downloading: 100%|██████████| 122/122 [00:00<00:00, 96.9kB/s]

Downloading:   0%|          | 0.00/229 [00:00<?, ?B/s]
Downloading: 100%|██████████| 229/229 [00:00<00:00, 136kB/s]

Downloading:   0%|          | 0.00/265M [00:00<?, ?B/s]
Downloading: 100%|██████████| 265M/265M [00:06<00:00, 44.1MB/s]

Downloading:   0%|          | 0.00/53.0 [00:00<?, ?B/s]
Downloading: 100%|██████████| 53.0/53.0 [00:00<00:00, 42.0kB/s]

Downloading:   0%|          | 0.00/112 [00:00<?, ?B/s]
Downloading: 100%|██████████| 112/112 [00:00<00:00, 83.8kB/s]

Downloading:   0%|          | 0.00/466k [00:00<?, ?B/s]
Downloading:  18%|█▊        | 86.0k/466k [00:00<00:00, 751kB/s]
Downloading: 100%|██████████| 466k/466k [00:00<00:00, 2.36MB/s]

Downloading:   0%|          | 0.00/499 [00:00<?, ?B/s]
Downloading: 100%|██████████| 499/499 [00:00<00:00, 442kB/s]

Downloading:   0%|          | 0.00/232k [00:00<?, ?B/s]
Downloading:  38%|███▊      | 87.0k/232k [00:00<00:00, 715kB/s]
Downloading: 100%|██████████| 232k/232k [00:00<00:00, 1.41MB/s]
[2022-06-29 19:53:16] INFO [sentence_transformers.SentenceTransformer.__init__:97] Use pytorch device: cpu

Batches:   0%|          | 0/1 [00:00<?, ?it/s]
Batches: 100%|██████████| 1/1 [00:04<00:00,  4.63s/it]

  0%|          | 0/1 [00:00<?, ?it/s]
100%|██████████| 1/1 [00:00<00:00,  4.05it/s]
100%|██████████| 1/1 [00:00<00:00,  4.04it/s]
[2022-06-29 19:53:21] INFO [gpl.toolkit.mine._mine_sbert:49] Mining with msmarco-MiniLM-L-6-v3
[2022-06-29 19:53:21] INFO [sentence_transformers.SentenceTransformer.__init__:60] Load pretrained SentenceTransformer: msmarco-MiniLM-L-6-v3

Downloading:   0%|          | 0.00/736 [00:00<?, ?B/s]
Downloading: 100%|██████████| 736/736 [00:00<00:00, 457kB/s]

Downloading:   0%|          | 0.00/190 [00:00<?, ?B/s]
Downloading: 100%|██████████| 190/190 [00:00<00:00, 164kB/s]

Downloading:   0%|          | 0.00/3.68k [00:00<?, ?B/s]
Downloading: 100%|██████████| 3.68k/3.68k [00:00<00:00, 2.15MB/s]

Downloading:   0%|          | 0.00/627 [00:00<?, ?B/s]
Downloading: 100%|██████████| 627/627 [00:00<00:00, 430kB/s]

Downloading:   0%|          | 0.00/122 [00:00<?, ?B/s]
Downloading: 100%|██████████| 122/122 [00:00<00:00, 80.3kB/s]

Downloading:   0%|          | 0.00/229 [00:00<?, ?B/s]
Downloading: 100%|██████████| 229/229 [00:00<00:00, 182kB/s]

Downloading:   0%|          | 0.00/90.9M [00:00<?, ?B/s]
Downloading: 100%|██████████| 90.9M/90.9M [00:02<00:00, 43.2MB/s]

Downloading:   0%|          | 0.00/53.0 [00:00<?, ?B/s]
Downloading: 100%|██████████| 53.0/53.0 [00:00<00:00, 46.7kB/s]

Downloading:   0%|          | 0.00/112 [00:00<?, ?B/s]
Downloading: 100%|██████████| 112/112 [00:00<00:00, 85.4kB/s]

Downloading:   0%|          | 0.00/466k [00:00<?, ?B/s]
Downloading:  18%|█▊        | 81.9k/466k [00:00<00:00, 701kB/s]
Downloading: 100%|██████████| 466k/466k [00:00<00:00, 2.34MB/s]

Downloading:   0%|          | 0.00/430 [00:00<?, ?B/s]
Downloading: 100%|██████████| 430/430 [00:00<00:00, 305kB/s]

Downloading:   0%|          | 0.00/232k [00:00<?, ?B/s]
Downloading:  32%|███▏      | 74.8k/232k [00:00<00:00, 637kB/s]
Downloading: 100%|██████████| 232k/232k [00:00<00:00, 1.45MB/s]
[2022-06-29 19:53:28] INFO [sentence_transformers.SentenceTransformer.__init__:97] Use pytorch device: cpu

Batches:   0%|          | 0/1 [00:00<?, ?it/s]
Batches: 100%|██████████| 1/1 [00:01<00:00,  1.56s/it]

  0%|          | 0/1 [00:00<?, ?it/s]
100%|██████████| 1/1 [00:00<00:00, 12.29it/s]
[2022-06-29 19:53:30] INFO [gpl.toolkit.mine.run:114] Combining all the data

  0%|          | 0/10 [00:00<?, ?it/s]
100%|██████████| 10/10 [00:00<00:00, 177724.75it/s]
[2022-06-29 19:53:30] INFO [gpl.toolkit.mine.run:126] Saving data to generated/fiqa/hard-negatives.jsonl
[2022-06-29 19:53:30] INFO [gpl.toolkit.mine.run:130] Done
[2022-06-29 19:53:30] INFO [gpl.train.train:147] No GPL-training data found. Now generating it via pseudo labeling

Downloading:   0%|          | 0.00/794 [00:00<?, ?B/s]
Downloading: 100%|██████████| 794/794 [00:00<00:00, 500kB/s]

Downloading:   0%|          | 0.00/86.7M [00:00<?, ?B/s]
Downloading: 100%|██████████| 86.7M/86.7M [00:02<00:00, 41.1MB/s]

Downloading:   0%|          | 0.00/316 [00:00<?, ?B/s]
Downloading: 100%|██████████| 316/316 [00:00<00:00, 270kB/s]

Downloading:   0%|          | 0.00/226k [00:00<?, ?B/s]
Downloading:  77%|███████▋  | 173k/226k [00:00<00:00, 1.48MB/s]
Downloading: 100%|██████████| 226k/226k [00:00<00:00, 1.89MB/s]

Downloading:   0%|          | 0.00/112 [00:00<?, ?B/s]
Downloading: 100%|██████████| 112/112 [00:00<00:00, 71.6kB/s]
[2022-06-29 19:53:37] INFO [sentence_transformers.cross_encoder.CrossEncoder.__init__:55] Use pytorch device: cpu
[2022-06-29 19:53:39] INFO [gpl.toolkit.pl.run:60] Begin pseudo labeling

100%|██████████| 100/100 [02:01<00:00,  1.22s/it]
[2022-06-29 19:55:41] INFO [gpl.toolkit.pl.run:80] Done pseudo labeling and saving data
[2022-06-29 19:55:41] INFO [gpl.toolkit.pl.run:84] Saved GPL-training data to generated/fiqa/gpl-training-data.tsv
[2022-06-29 19:55:41] INFO [gpl.train.train:168] Now doing training on the generated data with the MarginMSE loss
[2022-06-29 19:55:41] INFO [sentence_transformers.SentenceTransformer.__init__:60] Load pretrained SentenceTransformer: distilbert-base-uncased

Downloading:   0%|          | 0.00/391 [00:00<?, ?B/s]
Downloading: 100%|██████████| 391/391 [00:00<00:00, 327kB/s]

Downloading:   0%|          | 0.00/11.4k [00:00<?, ?B/s]
Downloading: 100%|██████████| 11.4k/11.4k [00:00<00:00, 8.79MB/s]

Downloading:   0%|          | 0.00/8.56k [00:00<?, ?B/s]
Downloading: 100%|██████████| 8.56k/8.56k [00:00<00:00, 6.58MB/s]

Downloading:   0%|          | 0.00/483 [00:00<?, ?B/s]
Downloading: 100%|██████████| 483/483 [00:00<00:00, 433kB/s]

Downloading: 100%|██████████| 268M/268M [00:06<00:00, 44.2MB/s]

Downloading:   0%|          | 0.00/466k [00:00<?, ?B/s]
Downloading:  38%|███▊      | 177k/466k [00:00<00:00, 1.64MB/s]
Downloading: 100%|██████████| 466k/466k [00:00<00:00, 3.14MB/s]

Downloading:   0%|          | 0.00/28.0 [00:00<?, ?B/s]
Downloading: 100%|██████████| 28.0/28.0 [00:00<00:00, 22.8kB/s]

Downloading:   0%|          | 0.00/232k [00:00<?, ?B/s]
Downloading:  33%|███▎      | 76.8k/232k [00:00<00:00, 629kB/s]
Downloading: 100%|██████████| 232k/232k [00:00<00:00, 1.41MB/s]
[2022-06-29 19:55:50] WARNING [root._load_auto_model:789] No sentence-transformers model found with name /root/.cache/torch/sentence_transformers/distilbert-base-uncased. Creating a new one with MEAN pooling.
Some weights of the model checkpoint at /root/.cache/torch/sentence_transformers/distilbert-base-uncased were not used when initializing DistilBertModel: ['vocab_projector.weight', 'vocab_layer_norm.weight', 'vocab_transform.weight', 'vocab_transform.bias', 'vocab_layer_norm.bias', 'vocab_projector.bias']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2022-06-29 19:55:51] INFO [sentence_transformers.SentenceTransformer.__init__:97] Use pytorch device: cpu

Batches:   0%|          | 0/1 [00:00<?, ?it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 21.75it/s]
[2022-06-29 19:55:51] INFO [gpl.toolkit.sbert.load_sbert:44] Set max_seq_length=350
[2022-06-29 19:55:51] INFO [gpl.train.train:173] Load GPL training data from generated/fiqa/gpl-training-data.tsv
[2022-06-29 19:55:51] INFO [gpl.toolkit.loss.__init__:22] Set GPL score function to dot

Epoch:   0%|          | 0/1 [00:00<?, ?it/s]

Iteration:   0%|          | 0/100 [00:00<?, ?it/s]