yannvgn / laserembeddings

LASER multilingual sentence embeddings as a pip package
BSD 3-Clause "New" or "Revised" License
224 stars 29 forks source link

Use fastBPE package available from pypi #9

Closed chiragjn closed 4 years ago

chiragjn commented 4 years ago

Firstly, Thank you for creating this pure python port.

fastBPE is now available on PyPI and needs no compiling! https://pypi.org/project/fastBPE/

So I went ahead and did some changes to use that instead of fastBPE for my own usages. Let me know if this is good or can be improved to merge upstream

yannvgn commented 4 years ago

Thanks for your contribution, @chiragjn!

It's true that fastBPE is now available on PyPI, however I have a few concerns:

For example, on macOS, running pip install fastBPE fails:

  fastBPE/fastBPE.cpp:603:10: fatal error: 'ios' file not found
  #include "ios"
           ^~~~~
  1 warning and 1 error generated.
  error: command 'gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for fastBPE

You'll also get an error if you don't have a compiler like gcc installed.

Switching to fastBPE may make laserembeddings technically closer to Facebook's LASER, but from an end-user perspective there would be no differences (besides an additional compilation step during the installation). The main goal of laserembeddings is to be as easy to install as possible, and also as portable as possible. Provided you have a PyTorch installation, pip install laserembeddings should work regardless of the system you're working on.

So I would wait until fastBPE has a more flexible installer / has wheels available for common systems.

Regarding the typos, thanks for spotting them! Either open a new PR, or revert the fastBPE-related changes and rename this PR.

Cheers

yannvgn commented 4 years ago

I'm merging this PR and I'll revert the fastBPE-related changes in another PR. Thank you @chiragjn 👍