explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License
30.21k stars, 4.4k forks

Calling Spacy from Matlab throws errors #13535

Open Saswati-Project opened 4 months ago

Saswati-Project commented 4 months ago

spaCy version 3.7.5

Location C:\Users\cse_s\AppData\Local\Programs\Python\Python312\Lib\site-packages\spacy

Platform Windows-11

Python version 3.12.3

Pipelines en_core_web_lg (3.7.1)

I want to call spaCy code from MATLAB. The following Python code works well in the PyCharm IDE:

import spacy
nlp = spacy.load("en_core_web_lg")
doc = nlp("This is a sentence.")

However, calling it from MATLAB throws the following errors:

Error using numpy_ops>init thinc.backends.numpy_ops
Python Error: ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Error in cupy_ops><module> (line 16)

Error in __init__><module> (line 17)

Error in api><module> (line 1)

Error in compat><module> (line 39)

Error in errors><module> (line 3)

Error in __init__><module> (line 6)

Error in test_spacy><module> (line 1)

Error in <frozen importlib>_call_with_frames_removed (line 228)

Error in <frozen importlib>exec_module (line 850)

Error in <frozen importlib>_load_unlocked (line 680)

Error in <frozen importlib>_find_and_load_unlocked (line 986)

Error in <frozen importlib>_find_and_load (line 1007)

Error in <frozen importlib>_gcd_import (line 1030)

Error in __init__>import_module (line 127)

The MATLAB code is:

pyenv;
py.importlib.import_module('test_spacy');
path_add = fileparts(which('test_spacy.py'));
if count(py.sys.path, path_add) == 0
    insert(py.sys.path, int64(0), path_add);
end

test_spacy is the name of my Python file. How can I solve this issue?

Siddharth-Latthe-07 commented 4 months ago

@Saswati-Project The error indicates a binary incompatibility between NumPy and a compiled library that spaCy depends on, here most likely Thinc (the traceback starts in thinc.backends.numpy_ops). This can happen when a library was compiled against a different version of NumPy than the one installed, or when library versions are otherwise mismatched. Try the following steps and let me know whether they work:

  1. Make sure MATLAB is using the same Python environment in which your packages are installed. You can set the Python environment explicitly in MATLAB:
    pyenv('Version', 'C:\Users\cse_s\AppData\Local\Programs\Python\Python312\python.exe');
  2. Ensure that all the packages (including NumPy, spaCy, and Thinc) are up to date and compatible with one another. You can do this in your Python environment.
  3. Test with a sample Python script before running it from MATLAB:

    import spacy

    def test_spacy_function():
        nlp = spacy.load("en_core_web_lg")
        doc = nlp("This is a sentence.")
        return [(token.text, token.pos_) for token in doc]

  4. After confirming that the Python script works on its own, call it from MATLAB:

    a. Set the Python environment in MATLAB:

    pyenv('Version', 'C:\Users\cse_s\AppData\Local\Programs\Python\Python312\python.exe');

    b. Ensure the folder containing test_spacy.py is on the Python path:

    path_add = fileparts(which('test_spacy.py'));
    if count(py.sys.path, path_add) == 0
        insert(py.sys.path, int64(0), path_add);
    end

    Finally, import the module and call the function:

    py.importlib.import_module('test_spacy');
    result = py.test_spacy.test_spacy_function();
    disp(result)
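
As a rough way to check steps 1 and 2, the sketch below reports which interpreter is running and which versions of NumPy, Thinc, and spaCy are installed. It reads package metadata instead of importing the packages, since importing Thinc is what raises the error here. The helper name `installed_versions` is hypothetical and not part of spaCy.

```python
import sys
from importlib import metadata


def installed_versions(packages=("numpy", "thinc", "spacy")):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this interpreter
    return versions


if __name__ == "__main__":
    # This path should match the executable that MATLAB reports via pyenv.
    print("Interpreter:", sys.executable)
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this both from a normal terminal and through MATLAB should show whether the two are using the same interpreter and the same package versions.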

RubTalha commented 4 months ago

https://stackoverflow.com/questions/78650222/valueerror-numpy-dtype-size-changed-may-indicate-binary-incompatibility-expec
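
This message is commonly reported when an extension module built against NumPy 1.x is loaded alongside NumPy 2.x. Whether that is the cause in this thread is an assumption. A minimal sketch for checking the installed NumPy major version without importing NumPy (the helper names `major_version` and `numpy_major_version` are hypothetical):

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version_string.split(".")[0])


def numpy_major_version():
    """Return the installed NumPy major version, or None if NumPy is absent."""
    try:
        return major_version(metadata.version("numpy"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    major = numpy_major_version()
    if major is None:
        print("NumPy is not installed in this interpreter.")
    elif major >= 2:
        print("NumPy 2.x found; extensions built against 1.x may fail to load.")
    else:
        print("NumPy 1.x found.")
```

If it reports 2.x, the usual options are to install builds of the dependent packages that support NumPy 2 or to pin NumPy below 2. Which of these applies to spaCy 3.7.5 is not confirmed in this thread.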