Open SergeyPetrakov opened 2 years ago
Hi Sergey! Thanks for raising the issue! Which model checkpoint and embedding path are you using? Are these ones you trained or one of the four listed on the README?
Hi! I followed the instructions in the README.
My steps were the following (as stated in your README):
1) Created a new conda environment with python=3.7
2) Cloned the repo
3) pip install -r requirements.txt
4) pip install -e .
5) pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/torch_stable.html
6) Downloaded best_model.pth, embs.npy, entity.jsonl, entity.pkl
7) Tried to launch TAbi interactively using python scripts/demo.py --model_checkpoint best_model.pth --entity_emb_path embs.npy --entity_file entity.pkl
After that, I received the error shown in the traceback.
Hi Sergey, apologies for the slow reply. Can you check the md5 of the embedding file? These are the md5s for each of the embedding files (embs.npy):
If the md5 doesn't match one of the above, you may need to re-download the file. Please comment if that still doesn't work, thanks!
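For anyone who wants to run this check from Python rather than a command-line tool, here is a minimal sketch using only the standard library. The helper name `file_md5` and the chunk size are illustrative choices, not part of the repo; the file is read in chunks because the embedding file is several gigabytes.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    Hypothetical helper: reads the file in 1 MiB chunks so a
    multi-gigabyte file never has to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example usage (path taken from the command in this thread):
# print(file_md5("embs.npy"))
```

Compare the printed digest with the expected md5 for the file you downloaded; a mismatch suggests an incomplete or corrupted download.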
Hi Megan! No problem, thank you for the md5s you sent. I checked, and now it works as described in your repo. By the way, do I understand correctly that your model only handles input written in English? Can it be fine-tuned on multilingual datasets to retrieve entities from other languages?
(tabi) petrakov@nlp2:~/tabi$ python3 scripts/demo.py --model_checkpoint best_model.pth --entity_emb_path embs.npy --entity_file entity.pkl
2022-09-20 12:13:09,194 [INFO] Loading model...
2022-09-20 12:13:09,194 [INFO] Using encoder model: bert-base-uncased
Traceback (most recent call last):
  File "scripts/demo.py", line 67, in
    model = Biencoder(
  File "/home/petrakov/tabi/tabi/models/biencoder.py", line 47, in __init__
    entity_embs = np.memmap(entity_emb_path, dtype="float32", mode="r").reshape(
ValueError: cannot reshape array of size 1950552064 into shape (768)
Maybe you can help me with it?
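One way to read the `ValueError` in the traceback: the file was mapped as a flat array of `float32` values, and 1950552064 values cannot be split evenly into rows of 768. The sketch below only does that arithmetic. The function names are hypothetical, and the row width of 768 and the 4-byte `float32` item size are taken from the traceback rather than from the repo's code.

```python
import os


def split_rows(n_values, dim=768):
    """Return (full_rows, leftover_values) for a flat array of n_values."""
    return divmod(n_values, dim)


def check_embedding_file(path, dim=768, itemsize=4):
    """Hypothetical helper: apply the same check to a file on disk.

    Assumes the file is a raw array of 4-byte floats, which is how the
    traceback's np.memmap(..., dtype="float32") call reads it.
    """
    n_values = os.path.getsize(path) // itemsize
    return split_rows(n_values, dim)


# The size reported in the error message:
rows, leftover = split_rows(1950552064)
print(rows, leftover)  # 2539781 256
```

A non-zero leftover (256 values here) means the file does not contain a whole number of 768-dimensional rows, which is consistent with an incomplete or corrupted download.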