facebookresearch / PyTorch-BigGraph

Generate embeddings from large-scale graph-structured data.
https://torchbiggraph.readthedocs.io/

Does PBG support post-training quantization? #156

Open yazdavar opened 4 years ago

yazdavar commented 4 years ago

I am wondering if PBG supports post-training quantization, for instance like what we have for fastText: https://flavioclesio.com/2019/03/22/post-training-quantization-in-fasttext-or-how-to-shrink-your-fasttext-model-in-90/

lw commented 4 years ago

No, that's not something we ship; you'd have to do it yourself. Depending on what you're looking for, FAISS may also help: if I remember correctly, it has some compression/quantization techniques.
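Since PBG has no built-in support, a do-it-yourself pass is possible on the saved embedding matrix. The sketch below is a minimal illustration, not anything PBG ships: it uses NumPy only, assumes the embeddings fit in a float32 array, and applies simple per-row symmetric int8 quantization (much cruder than fastText's product quantization, but it shows the idea of trading precision for a 4x smaller footprint):

```python
import numpy as np

def quantize_int8(emb: np.ndarray):
    """Per-row symmetric int8 quantization of a float32 embedding matrix."""
    scale = np.abs(emb).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero rows
    q = np.round(emb / scale).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Approximate reconstruction of the original float32 embeddings."""
    return q.astype(np.float32) * scale

# Toy stand-in for a trained embedding table (shapes are illustrative).
rng = np.random.default_rng(0)
emb = rng.standard_normal((1000, 200)).astype(np.float32)

q, scale = quantize_int8(emb)
rec = dequantize(q, scale)

print(q.nbytes / emb.nbytes)           # 0.25: int8 storage is 4x smaller
print(float(np.abs(emb - rec).max()))  # worst-case per-element rounding error
```

For nearest-neighbor use you can either search directly on the dequantized vectors or keep `q` and `scale` on disk and dequantize chunks on the fly; either way the reconstruction error per element is bounded by half the row scale.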

yazdavar commented 4 years ago

Thanks a lot for your response. I am trying to load the pretrained Wikidata embeddings to find an entity's nearest neighbors, but loading them needs more than 100 GB of RAM. I was wondering if there is any other way (e.g. something like the quantization for fastText) to make the script more efficient.

lw commented 4 years ago

If you're looking for nearest neighbors, take a look at FAISS: I think it has tools to deal with memory-mapped files (so that only some chunks at a time are loaded from disk into memory) and with compression.
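The memory-mapping idea can be sketched without FAISS itself. In this hedged example (file path, shapes, and chunk size are all invented for illustration; a real setup would point `np.memmap` at the downloaded embedding file with its actual dtype and shape), the embedding table lives on disk, `np.memmap` pages in only the rows that are read, and a chunked brute-force scan finds the nearest neighbors by cosine similarity:

```python
import os
import tempfile
import numpy as np

dim, n = 64, 5000  # toy sizes; the real Wikidata table is far larger
rng = np.random.default_rng(1)

# Write a stand-in embedding table to disk, unit-normalized so that
# a dot product equals cosine similarity.
path = os.path.join(tempfile.mkdtemp(), "embeddings.f32")
big = rng.standard_normal((n, dim)).astype(np.float32)
big /= np.linalg.norm(big, axis=1, keepdims=True)
big.tofile(path)
del big  # from here on, only the memmap touches the data

# Memory-map: rows are paged in from disk only when they are read,
# so peak RAM is roughly one chunk, not the whole table.
emb = np.memmap(path, dtype=np.float32, mode="r", shape=(n, dim))

def nearest(query: np.ndarray, k: int = 5, chunk: int = 1024) -> np.ndarray:
    """Chunked brute-force cosine search over the memmapped table."""
    scores = np.empty(n, dtype=np.float32)
    for start in range(0, n, chunk):
        block = np.asarray(emb[start:start + chunk])  # loads one chunk
        scores[start:start + block.shape[0]] = block @ query
    top = np.argpartition(-scores, k)[:k]       # k best, unordered
    return top[np.argsort(-scores[top])]        # sorted by similarity

query = np.asarray(emb[42])
ids = nearest(query)
print(ids[0])  # 42: a vector is its own nearest neighbor
```

This keeps RAM bounded by the chunk size but is still a linear scan; FAISS's compressed indexes (e.g. its product-quantization-based ones) additionally shrink the vectors and avoid scoring every row.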