plasticityai / magnitude

A fast, efficient universal vector embedding utility package.

Torch dependency issues #72

Open davidmezzetti opened 4 years ago

davidmezzetti commented 4 years ago

Thank you for this package; it's a great idea and still very useful for those using Word2vec/GloVe/fastText. The performance gains from keeping vectors in SQLite and not having to pin the entire model in memory are great.
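
For reference, this is the usage pattern I mean; a minimal sketch, with the `.magnitude` file name just a placeholder:

```python
from pymagnitude import Magnitude

# Vectors are read lazily from the SQLite-backed .magnitude file,
# so the full model never has to be pinned into memory.
vectors = Magnitude("GoogleNews-vectors-negative300.magnitude")  # placeholder path

print(vectors.query("dog"))               # embedding for a single word
print(vectors.similarity("dog", "cat"))   # cosine similarity between two words
```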

Currently, torch 0.4.1 is installed regardless of whether torch is already present, and it overrides much newer versions of torch, causing environment issues.
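
A quick sanity check for an affected environment, with a possible workaround noted in the comments (the workaround is my assumption, not something the package documents):

```python
# Check whether installing pymagnitude silently downgraded torch.
import torch

print("torch version:", torch.__version__)  # reports 0.4.1 if the pin won

# Possible workaround (an assumption, not an official recommendation):
# install pymagnitude without its pinned dependencies and add the rest by hand:
#   pip install --no-deps pymagnitude
```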

The project would be much simpler if ELMo were at least made optional (as mentioned in #59) or removed entirely. As an example, I have created a fork of Magnitude that does this. Removing ELMo support allows most third-party dependencies to be dropped, since 999 columns is enough for everything but ELMo. pip install build times go from 5+ minutes to seconds on my local machine.
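
For illustration, one way the split could look is a setuptools extra, so torch is only pulled in when ELMo support is explicitly requested; a rough sketch, and the dependency lists below are assumptions rather than Magnitude's actual setup.py:

```python
# setup.py (sketch) -- package lists are illustrative, not Magnitude's real ones
from setuptools import setup, find_packages

setup(
    name="pymagnitude",
    version="0.0.0",                      # placeholder
    packages=find_packages(),
    # Lightweight core: no torch, so existing installs are left alone.
    install_requires=[
        "numpy",
        "xxhash",
        "lz4",
    ],
    # ELMo becomes opt-in: pip install pymagnitude[elmo]
    extras_require={
        "elmo": ["torch==0.4.1"],
    },
)
```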

You can make the argument that if you want more complex embeddings, Transformer models are a better choice than ELMo. But there are still use cases for simple word embeddings, and keeping this library simple to maintain and update would be great.

riven314 commented 4 years ago

To add to the torch dependency issue: it broke my conda environment. I noticed the problem when CUDA support in torch was suddenly disabled, and I later found that my torch install had been replaced by 0.4.1.post2 during the pymagnitude installation.
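
For anyone hitting the same thing, this is roughly how to tell "the CUDA build was replaced by a CPU-only build" apart from a driver problem (a minimal check, not specific to pymagnitude):

```python
import torch

print("torch version:", torch.__version__)         # 0.4.1.post2 after the replacement
print("built with CUDA:", torch.version.cuda)      # None for a CPU-only wheel
print("CUDA available:", torch.cuda.is_available())
```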