I tried quantizing the biencoder model (used in fast mode) as follows:

```python
import json

import torch
from blink.biencoder.biencoder import load_biencoder

with open(args.biencoder_config) as json_file:
    biencoder_params = json.load(json_file)
biencoder_params["path_to_model"] = args.biencoder_model
biencoder = load_biencoder(biencoder_params)

# Quantize the model: dynamic quantization of all Linear layers to int8
biencoder = torch.quantization.quantize_dynamic(
    biencoder, {torch.nn.Linear}, dtype=torch.qint8
)

# Save the quantized model for later use
quantized_biencoder_model = args.biencoder_model.replace(".bin", "_quantized.bin")
torch.save(biencoder.state_dict(), quantized_biencoder_model)
```
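One caveat worth noting when reusing the saved file: `torch.save(biencoder.state_dict(), ...)` on a dynamically quantized model writes packed int8 parameters, so the checkpoint can only be loaded into a model that has already been wrapped with `quantize_dynamic`. A minimal round-trip sketch, with a toy module (`TinyEncoder` is hypothetical) standing in for the real biencoder:

```python
import io

import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Toy stand-in for the BLINK biencoder, for illustration only."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 4)

    def forward(self, x):
        return self.linear(x)


# Quantize and save, mirroring the snippet above.
qmodel = torch.quantization.quantize_dynamic(
    TinyEncoder(), {nn.Linear}, dtype=torch.qint8
)
buf = io.BytesIO()
torch.save(qmodel.state_dict(), buf)
buf.seek(0)

# Reload: wrap a *fresh* float model with quantize_dynamic first,
# then load the packed int8 state_dict into it.
fresh = torch.quantization.quantize_dynamic(
    TinyEncoder(), {nn.Linear}, dtype=torch.qint8
)
fresh.load_state_dict(torch.load(buf, weights_only=False))
fresh.eval()
```

Loading the quantized state_dict into an unwrapped float model fails, because the quantized module's keys (`_packed_params`) don't exist there.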
The resulting model is significantly smaller on disk:

```
$ du -h biencoder_wiki_large*.bin
2.5G    biencoder_wiki_large.bin
824M    biencoder_wiki_large_quantized.bin
```
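The roughly 3x shrink is about what I'd expect: dynamic quantization stores the Linear weights as int8 instead of fp32 (an ideal 4x for those layers), while embeddings and other parameters stay in fp32. A quick sanity check on a toy Linear-dominated model (hypothetical sizes, not the real biencoder):

```python
import io

import torch
import torch.nn as nn


def serialized_size(module):
    """Bytes needed to torch.save the module's state_dict."""
    buf = io.BytesIO()
    torch.save(module.state_dict(), buf)
    return buf.tell()


# Toy stand-in: Linear layers dominate, as in a BERT-style biencoder.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))
# quantize_dynamic returns a quantized copy; `model` stays fp32.
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(f"fp32: {serialized_size(model)} bytes, int8: {serialized_size(qmodel)} bytes")
```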
I fed the text of the following entity to both the regular and the quantized model, and the detected mentions come back with noticeably different scores (and, in several cases, different predictions):

```json
{
    "text": " Aristotle (; \"Aristoteles\", ; 384–322 BC) was a Greek philosopher during the Classical period in Ancient Greece, the founder of the Lyceum and the Peripatetic school of philosophy and Aristotelian tradition. Along with his teacher Plato, he has been called the \"Father of Western Philosophy\". His writings cover many subjects – including physics, biology, zoology, metaphysics, logic, ethics, aesthetics, poetry, theatre, music, rhetoric, psychology, linguistics, economics, politics and government. Aristotle provided a complex synthesis of the various philosophies existing prior to him, and it was above all from his teachings that the West inherited its intellectual lexicon, as well as problems and methods of inquiry. As a result, his philosophy has exerted a unique influence on almost every form of knowledge in the West and it continues to be a subject of contemporary philosophical discussion. Little is known about his life. Aristotle was born in the city of Stagira in Northern Greece. His father, Nicomachus, died when Aristotle was a child, and he was brought up by a guardian. At seventeen or eighteen years of age, he joined Plato's Academy in Athens and remained there until the age of thirty-seven (c. 347 BC). Shortly after Plato died, Aristotle left Athens and, at the request of Philip II of Macedon, tutored Alexander the Great beginning in 343 BC. He established a library in the Lyceum which helped him to produce many of his hundreds of books on papyrus scrolls. Though Aristotle wrote many elegant treatises and dialogues for publication, only around a third of his original",
    "idx": "https://en.wikipedia.org/wiki?curid=308",
    "title": "Aristotle",
    "entity": "Aristotle"
}
```
Output from the regular model:

| Mention | Start | End | Predictions | Scores |
|---|---|---|---|---|
| aristotle | 1 | 10 | ['Aristotle', 'Aristotle of Cyrene', 'Plato'] | [81.83269 75.88555 75.54469] |
| aristoteles | 15 | 26 | ['Aristotle', 'Aristotle of Argos', 'Aristides'] | [78.62669 76.48178 76.190445] |
| plato | 232 | 237 | ['Plato', 'Socrates', 'Aristotle'] | [83.59171 77.50141 75.64921] |
| aristotle | 501 | 510 | ['Aristotle', 'Aristotle of Cyrene', 'Plato'] | [82.56731 76.41449 76.34714] |
| little | 906 | 912 | ['Alexander the Great in legend', 'Sophistic works of Antiphon', 'Nicias of Nicaea'] | [75.37565 74.96681 74.91618] |
| aristotle | 938 | 947 | ['Aristotle', 'Aristotle of Cyrene', 'Aristotle the Dialectician'] | [82.74979 78.06471 77.26161] |
| aristotle | 1034 | 1043 | ['Aristotle', 'Aristotle of Cyrene', 'Aristotle of Argos'] | [79.39718 75.55984 75.48356] |
| plato | 1143 | 1148 | ['Plato', 'Socrates', 'Plato (comic poet)'] | [83.38141 77.72538 75.96175] |
| plato | 1245 | 1250 | ['Plato', 'Socrates', 'Plato (comic poet)'] | [82.8144 77.8433 76.76737] |
| aristotle | 1257 | 1266 | ['Aristotle', 'Aristotle of Argos', 'Aristotle the Dialectician'] | [80.94 76.36772 76.23486] |
| alexander the great | 1332 | 1351 | ['Alexander the Great', 'Alexander I of Epirus', 'Alexander I of Macedon'] | [82.85829 76.547844 76.40962] |
| aristotle | 1497 | 1506 | ['Aristotle', 'Aristotle of Cyrene', 'Plato'] | [82.41356 76.9357 76.83391] |
Output from the quantized model:

| Mention | Start | End | Predictions | Scores |
|---|---|---|---|---|
| aristotle | 1 | 10 | ['Aristotle', 'Euclid', 'Plato'] | [78.56236 76.82237 76.486824] |
| aristoteles | 15 | 26 | ['Aristophanes of Byzantium', 'Diocles of Peparethus', 'Ephorus'] | [76.16426 76.133354 76.056305] |
| plato | 232 | 237 | ['Plato', 'Socrates', 'Euclid'] | [76.05176 75.021736 74.655815] |
| aristotle | 501 | 510 | ['Aristotle', 'Plato', 'Euclid'] | [73.647415 72.97169 72.854645] |
| little | 906 | 912 | ['Maluma', 'George Houghton (disambiguation)', 'John Lewis'] | [73.14746 73.02072 73.00591] |
| aristotle | 938 | 947 | ['Aristotle', 'Plato', 'Socrates'] | [77.42533 75.779884 75.77082] |
| aristotle | 1034 | 1043 | ['Aristotle', 'Alexander, son of Herod', 'Alexander (grandson of Herod the Great)'] | [77.33661 77.07514 76.96219] |
| plato | 1143 | 1148 | ['Plato', 'Socrates', 'Aristotle'] | [77.73229 76.45869 75.94086] |
| plato | 1245 | 1250 | ['Plato', 'Ramesses II', 'Cyrus the Great'] | [76.089645 75.32056 75.00227] |
| aristotle | 1257 | 1266 | ['Aristotle', 'Alexander the Great', 'Euclid'] | [74.48347 74.2509 73.81971] |
| alexander the great | 1332 | 1351 | ['Alexander the Great', 'Cyrus the Great', 'Darius the Great'] | [85.78471 83.313095 82.84473] |
| aristotle | 1497 | 1506 | ['Aristotle', 'Euclid', 'Plato'] | [75.367905 74.52708 74.46407] |
I was wondering whether the FB team or anyone else has experience with compressing BLINK models to reduce memory usage.
Thanks!