n-waves / multifit

The code to reproduce results from paper "MultiFiT: Efficient Multi-lingual Language Model Fine-tuning" https://arxiv.org/abs/1909.04761
MIT License

multifit doesn't work on Google Colab #81

Closed andreagrusso closed 3 years ago

andreagrusso commented 3 years ago

Hi! I am trying to use multifit on Google Colab, but I run into a problem during the installation and then when using multifit. This is my code for the installation:

!git clone https://github.com/n-waves/multifit.git
!python multifit/setup.py install

and this is the output:

Cloning into 'multifit'...
remote: Enumerating objects: 1807, done.
remote: Total 1807 (delta 0), reused 0 (delta 0), pack-reused 1807
Receiving objects: 100% (1807/1807), 1.36 MiB | 14.61 MiB/s, done.
Resolving deltas: 100% (1171/1171), done.
running install
running bdist_egg
running egg_info
creating multifit.egg-info
writing multifit.egg-info/PKG-INFO
writing dependency_links to multifit.egg-info/dependency_links.txt
writing top-level names to multifit.egg-info/top_level.txt
writing manifest file 'multifit.egg-info/SOURCES.txt'
reading manifest file 'multifit.egg-info/SOURCES.txt'
writing manifest file 'multifit.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
warning: install_lib: 'build/lib' does not exist -- no Python modules to install

creating build
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying multifit.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying multifit.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying multifit.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying multifit.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist/multifit-1.0-py3.6.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing multifit-1.0-py3.6.egg
Copying multifit-1.0-py3.6.egg to /usr/local/lib/python3.6/dist-packages
Adding multifit 1.0 to easy-install.pth file

Installed /usr/local/lib/python3.6/dist-packages/multifit-1.0-py3.6.egg
Processing dependencies for multifit==1.0
Finished processing dependencies for multifit==1.0

Importing the package works, but when I try to use multifit.from_pretrained("...") I get:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in ()
----> 1 multifit.from_pretrained

AttributeError: module 'multifit' has no attribute 'from_pretrained'

What am I doing wrong?

Thanks in advance, Andrea

mkardas commented 3 years ago

Hi Andrea,

I guess that instead of importing the multifit module you've somehow imported the repository directory. Can you verify that by inspecting multifit.__path__?
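The check suggested above can be sketched in a few lines. This is only an illustration: `describe_import` and the directory name `demo_repo_checkout` are hypothetical names made up for the demo, and a temporary directory stands in for a Colab working directory that contains a cloned repository. A plain directory without an `__init__.py` is importable as a namespace package, which has a `__path__` but no `__file__`.

```python
import importlib
import os
import sys
import tempfile


def describe_import(name):
    """Import `name` and report whether it resolved to a namespace package.

    Hypothetical helper for illustration; not part of multifit.
    """
    module = importlib.import_module(name)
    # Namespace packages have no __file__ (or __file__ is None).
    origin = getattr(module, "__file__", None)
    locations = list(getattr(module, "__path__", []))
    if origin is None and locations:
        kind = "namespace package"
    else:
        kind = "regular module or package"
    return kind, locations


# Stand-in for a working directory that contains a cloned repository:
# a bare directory with no __init__.py inside it.
workdir = tempfile.mkdtemp()
os.mkdir(os.path.join(workdir, "demo_repo_checkout"))
sys.path.insert(0, workdir)
importlib.invalidate_caches()

kind, locations = describe_import("demo_repo_checkout")
print(kind, locations)
```

If the same check on `multifit` reports a namespace package located in the working directory, the import resolved to the cloned folder rather than to installed library code.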

andreagrusso commented 3 years ago

Hi, thank you for your reply! I think that you're right as multifit.__path__ returns

_NamespacePath(['/content/multifit'])

and inspecting the above path I got exactly the repository directory:

fastai_contrib
prepare_cls.py
prepare_xnli.py
sotabench
get_preprocessed_wikis.sh
prepare_imdb.sh
prepare_xnli.sh
split-cls.py
LICENSE
prepare_mldoc.py
README.md
multifit
prepare_wiki-en.sh
requirements.txt
notebooks
prepare_wiki.sh
setup.py

I've tried again to run the installation via setup.py, as I did on my local machine:

!python /content/multifit/setup.py install

but the problem persists. I don't understand what I am doing wrong.

andreagrusso commented 3 years ago

I've resolved the issue by using:

!pip install git+https://github.com/n-waves/multifit.git
!pip install sacremoses

I don't know how this differs from the other approach, but now it works. I hope this helps other users!
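One possible explanation, offered as an assumption rather than a confirmed diagnosis: the install log in the first comment contains the warning "'build/lib' does not exist -- no Python modules to install", which suggests the egg was built without any library code. That can happen when setup.py is run from outside the repository directory and packages are discovered relative to the current working directory. If you prefer to install from a local clone, a minimal sketch is to hand the checkout path to pip and let it build from inside that directory. `pip_install_command` and `install_checkout` are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path


def pip_install_command(repo_dir):
    """Build a `pip install <checkout>` command for the current interpreter.

    Hypothetical helper for illustration; the path is made absolute so the
    command does not depend on where it is later executed.
    """
    repo = Path(repo_dir).resolve()
    return [sys.executable, "-m", "pip", "install", str(repo)]


def install_checkout(repo_dir):
    """Run the install and raise if pip exits with a non-zero status."""
    subprocess.run(pip_install_command(repo_dir), check=True)


# Only build and show the command here; call install_checkout("multifit")
# to actually perform the installation.
command = pip_install_command("multifit")
print(" ".join(command))
```

In a notebook cell, `!pip install ./multifit` should be roughly equivalent.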

mkardas commented 3 years ago

It depends on what's on your $PYTHONPATH. Installing directly from the repository works because the repository directory is not cloned into your current working directory.
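To illustrate how the search path decides what gets imported, here is a small self-contained sketch. It is not taken from multifit: `demo_shadow_pkg`, `clone_parent`, `site_like` and `import_with_path` are hypothetical names, and temporary directories stand in for the working directory and for dist-packages. Under Python's namespace-package rules (PEP 420), a bare directory is used only when no regular package with the same name is found anywhere on the path.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
clone_parent = os.path.join(root, "cwd")  # stand-in for the working directory
site_like = os.path.join(root, "site")    # stand-in for dist-packages

# A bare directory, like a cloned repository without a top-level __init__.py.
os.makedirs(os.path.join(clone_parent, "demo_shadow_pkg"))

# A properly installed package that actually contains code.
os.makedirs(os.path.join(site_like, "demo_shadow_pkg"))
with open(os.path.join(site_like, "demo_shadow_pkg", "__init__.py"), "w") as f:
    f.write("def from_pretrained(name):\n    return name\n")


def import_with_path(entries):
    """Import demo_shadow_pkg with `entries` prepended to sys.path."""
    saved = list(sys.path)
    sys.modules.pop("demo_shadow_pkg", None)
    sys.path[:0] = entries
    importlib.invalidate_caches()
    try:
        return importlib.import_module("demo_shadow_pkg")
    finally:
        # Restore the interpreter state so the two cases stay independent.
        sys.path[:] = saved
        sys.modules.pop("demo_shadow_pkg", None)


# Case A: only the bare directory is reachable, so the import yields an
# empty namespace package.
only_clone = import_with_path([clone_parent])

# Case B: the bare directory comes first, but a regular package is also
# reachable, so the regular package is the one that gets imported.
both = import_with_path([clone_parent, site_like])

print(hasattr(only_clone, "from_pretrained"), hasattr(both, "from_pretrained"))
```

Case A matches the AttributeError reported in this thread. Case B suggests that once an installation really contains the library code, a clone in the working directory should no longer get in the way.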