blueprints-for-text-analytics-python / blueprints-text

Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
Apache License 2.0

Ch10 - Loading a pretrained model - not working on Colab #23

Closed: jgammerman closed this issue 1 year ago

jgammerman commented 1 year ago

Hello,

Greatly enjoying the book, but I've encountered an error on Colab when running one of the first cells of chapter 10 (section: Loading a pretrained model):

import gensim.downloader as api

info_df = pd.DataFrame.from_dict(api.info()['models'], orient='index')
info_df[['file_size', 'base_dataset', 'parameters']].head(5)

Yields the following error:


ValueError                                Traceback (most recent call last)
/content/setup.py in <module>
----> 1 import gensim.downloader as api
      2 
      3 info_df = pd.DataFrame.from_dict(api.info()['models'], orient='index')
      4 info_df[['file_size', 'base_dataset', 'parameters']].head(5)

5 frames
/usr/local/lib/python3.8/dist-packages/gensim/_matutils.pyx in init gensim._matutils()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

jsalbr commented 1 year ago

Hi James,

It's one of those nasty Python dependency issues. The current gensim 4.3 seems to be incompatible with the pre-installed numpy 1.21. I did not find a quick, elegant solution, but I can provide a workaround:

Once you open the notebook in Colab, and before running the setup, create a new cell and execute the following command:

!pip uninstall -y numpy

Then restart the runtime. After that everything should work fine, the correct numpy version will be installed automatically.
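
To confirm that the workaround took effect after the restart and setup, something like the following sketch could be run in a fresh cell. It is only an illustration, not part of the book's notebooks: the helper names `installed_versions` and `try_import` are made up here. It reads the installed versions from package metadata, so it still works when `import gensim` fails, and then attempts the import that raised the error above.

```python
import importlib
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("numpy", "gensim")):
    """Return {package: version or None} without importing the packages."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # package is not installed in this environment
    return found


def try_import(module_name):
    """Import a module and return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except (ImportError, ValueError) as exc:
        # The binary-incompatibility problem in this thread surfaces as ValueError.
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


# Hypothetical helpers defined above; output depends on the environment.
print(installed_versions())
print(try_import("gensim.downloader"))
```

If the second line still reports a `ValueError`, the old numpy build is probably still loaded, which usually means the runtime was not restarted.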

Please note that I made some minor fixes to the notebook and the embeddings package to make them compatible with gensim 4.3. So if you have already cloned the repo, you should do another pull.

Leave us a comment on Amazon if you like the book!

Best, Jens

jgammerman commented 1 year ago

Yep, that worked for me. Thanks for the prompt response, Jens!

I've left a glowing review on Amazon :)