PetrochukM / PyTorch-NLP

Basic Utilities for PyTorch Natural Language Processing (NLP)
https://pytorchnlp.readthedocs.io
BSD 3-Clause "New" or "Revised" License
2.21k stars, 257 forks

Support loading fasttext model from custom file #61

Open keanpantraw opened 5 years ago

keanpantraw commented 5 years ago

What if I want to use my own pretrained fastText model (or even a Common Crawl model instead of the standard wiki one)? For example, see what they publish now: https://fasttext.cc/docs/en/crawl-vectors.html. The current FastText implementation

    url_base = 'https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.{}.vec'
    aligned_url_base = 'https://s3.amazonaws.com/arrival/embeddings/wiki.multi.{}.vec'

    def __init__(self, language="en", aligned=False, **kwargs):
        if aligned:
            url = self.aligned_url_base.format(language)
        else:
            url = self.url_base.format(language)

doesn't allow you to do such basic stuff.
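One way the requested behavior could look is an optional override that takes precedence over the built-in URL templates. The following is only a sketch under assumptions: `resolve_fasttext_url` and its `url` parameter are hypothetical names that are not part of PyTorch-NLP, and the `example.com` address in the usage note is a placeholder. The two URL templates are copied from the snippet above.

```python
from typing import Optional

# URL templates copied from the FastText class quoted in this issue.
URL_BASE = 'https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.{}.vec'
ALIGNED_URL_BASE = 'https://s3.amazonaws.com/arrival/embeddings/wiki.multi.{}.vec'


def resolve_fasttext_url(language: str = "en",
                         aligned: bool = False,
                         url: Optional[str] = None) -> str:
    """Return the location to load vectors from.

    Hypothetical helper: `url` is an assumed extra argument, not an
    existing library parameter. When given, it wins over the defaults.
    """
    if url is not None:
        # Caller supplied a custom location (own model, Common Crawl, ...).
        return url
    # Otherwise fall back to the existing behavior.
    base = ALIGNED_URL_BASE if aligned else URL_BASE
    return base.format(language)
```

With this shape, `resolve_fasttext_url("de")` keeps the current default, while `resolve_fasttext_url(url="https://example.com/my-vectors.vec")` returns the custom location unchanged.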

PetrochukM commented 5 years ago

Hi There! This should be easy to add, please submit a PR if you have the time! Thanks!

tu-artem commented 5 years ago

Hi! Isn't it possible to use the more general _PretrainedWordVectors here? This class allows loading custom vectors from files or URLs. @PetrochukM, maybe it would be good to make it more user-friendly?
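For context on what loading custom vectors from a file involves, here is a minimal standard-library sketch of reading the fastText `.vec` text format: a header line with the vocabulary size and dimension, then one token per line followed by its values. It does not use or reproduce `_PretrainedWordVectors`; the function name `read_vec` is an assumption made for illustration.

```python
import io
from typing import Dict, List, TextIO


def read_vec(stream: TextIO) -> Dict[str, List[float]]:
    """Parse fastText .vec text data into a token -> vector mapping."""
    # Header line: "<number of tokens> <dimension>".
    header = stream.readline().split()
    count, dim = int(header[0]), int(header[1])

    vectors: Dict[str, List[float]] = {}
    for line in stream:
        line = line.rstrip()  # .vec lines often end with a trailing space
        if not line:
            continue
        # Split from the right so exactly `dim` values are peeled off.
        parts = line.rsplit(" ", dim)
        if len(parts) != dim + 1:
            raise ValueError("malformed line: %r" % line)
        vectors[parts[0]] = [float(value) for value in parts[1:]]

    if len(vectors) != count:
        raise ValueError("header declares %d tokens, found %d"
                         % (count, len(vectors)))
    return vectors


# Tiny in-memory sample standing in for a real .vec file.
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3 \nworld -1.0 0.0 1.0\n")
embeddings = read_vec(sample)
```

For a file on disk, the same function can be called with `open(path, encoding="utf-8")`; converting the lists to tensors is left out to keep the sketch dependency-free.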

PetrochukM commented 5 years ago

Sure, happy to have it more user-friendly.

Sorry, I do not have time to contribute to this project at the moment; I'm trying to run a startup.

PetrochukM commented 5 years ago

Please feel free to send a PR!

karish-grover commented 3 years ago

Hey! I want to give this a try.