nrcpp / NltkNet

NLTK library wrapper for .NET
MIT License

NumPy/SciPy not supported by IronPython #2

Open nrcpp opened 5 years ago

nrcpp commented 5 years ago

Here are the instructions for setting up NumPy/SciPy for IronPython:

Install NumPy/SciPy libraries for IronPython

Some NLTK functions, such as nltk.ne_chunk() (Named Entity Recognition), use the NumPy library. Follow these steps to install NumPy for IronPython:

  1. Download ironpkg-1.0.0.py from https://raw.githubusercontent.com/ptmono/anipang_b/master/ironpkg-1.0.0.py to the IronPython folder.
  2. Open a console and cd to the IronPython folder, then run ipy ironpkg-1.0.0.py --install
  3. Once ironpkg and ironegg are installed, run ironpkg scipy to install the NumPy and SciPy libraries.
  4. If the installation fails with errors, download the numpy and scipy egg files from http://code.enthought.com/.iron/eggs/index.html
  5. Then run ironegg scipy-1.0.0-2.egg and ironegg numpy-2.0.0-1.egg (the version numbers may differ).
  6. Make sure the folders <IronPython Path>\Libs\site-packages\scipy and <IronPython Path>\Libs\site-packages\numpy exist.
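The folder check in the last step can be scripted. Below is a minimal sketch, not part of NltkNet: the function name missing_packages is hypothetical, and it checks both "Lib" and "Libs" because the steps above write the folder as "Libs" while the Nltk.Init call quoted later in this thread uses "Lib". The root path C:\IronPython279 is taken from that later comment and is only an example.

```python
import os


def missing_packages(ironpython_root, packages=("numpy", "scipy")):
    """Return the package names that have no folder under site-packages.

    Hypothetical helper for illustration. Both "Lib" and "Libs" are checked
    because the thread spells the folder name both ways.
    """
    site_dirs = [
        os.path.join(ironpython_root, lib_dir, "site-packages")
        for lib_dir in ("Lib", "Libs")
    ]
    missing = []
    for name in packages:
        found = any(os.path.isdir(os.path.join(d, name)) for d in site_dirs)
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Example root path taken from the thread; adjust to your installation.
    print(missing_packages(r"C:\IronPython279"))
```

An empty list means both folders were found. It only confirms that the files were unpacked, not that the packages can actually be imported.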

See this StackOverflow question for up-to-date information.

Issue

After installation, calling nltk.ne_chunk() from IronPython throws an exception related to the NumPy library. This is not an NltkNet issue as such: it means that IronPython does not support NumPy. I assume this is because NumPy makes native C calls internally.
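To see the failure outside of NLTK, you can try the import directly with ipy. This is a minimal sketch under the assumption that the interpreter ships the standard importlib module (Python 2.7 and later do); the helper name can_import is hypothetical. It catches any exception, since a package that depends on native extensions might fail with something other than ImportError.

```python
import importlib


def can_import(module_name):
    """Try to import a module; return (True, None) or (False, reason).

    Hypothetical helper for illustration. Any Exception is caught because a
    package built on native extensions may fail in more than one way.
    """
    try:
        importlib.import_module(module_name)
        return True, None
    except Exception as exc:
        return False, "%s: %s" % (type(exc).__name__, exc)


if __name__ == "__main__":
    for name in ("numpy", "scipy"):
        ok, reason = can_import(name)
        print("%s -> %s" % (name, "OK" if ok else reason))
```

Running the same script with CPython and with ipy shows whether the problem is specific to IronPython.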

Please let me know if you have any success, or find a workaround, running NumPy with IronPython.

saifu-rahman commented 5 years ago

I am on IronPython 2.7.9.

After following the steps above, the following call now

    Nltk.Init(new List<string>
    {
        @"C:\IronPython279\Lib",                // Path to IronPython standard libraries
        @"C:\IronPython279\Lib\site-packages",  // Path to IronPython third-party libraries
        @"C:\IronPython279\DLLs",
    });

throws this error:

    System.IO.FileLoadException: Could not load file or assembly 'IronPython, Version=2.7.0.40, Culture=neutral, PublicKeyToken=7f709c5b713576e1' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)

nrcpp commented 5 years ago

@saifu-rahman It looks like you have IronPython version 2.7.0, which may be installed in a different path than C:\IronPython279. Check that path and/or update the IronPython NuGet packages. This exception isn't directly related to this issue, though.
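To illustrate the mismatch: the error message names the assembly version the program was built against (2.7.0.40), which can be compared with the installed version. This is a rough sketch only; both function names are hypothetical, and the comparison simply matches the leading version components rather than reproducing the .NET binding rules.

```python
def parse_assembly_name(display_name):
    """Split an assembly display name into a dict of its fields.

    Hypothetical helper: the first comma-separated part is treated as the
    name, the remaining parts as key=value pairs.
    """
    parts = [p.strip() for p in display_name.split(",")]
    info = {"Name": parts[0]}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        info[key.strip()] = value.strip()
    return info


def version_matches(display_name, installed_version):
    """Check whether the referenced version starts with the installed one."""
    referenced = parse_assembly_name(display_name).get("Version", "")
    ref_parts = referenced.split(".")
    inst_parts = installed_version.split(".")
    return ref_parts[:len(inst_parts)] == inst_parts


if __name__ == "__main__":
    # Display name copied from the error message quoted in this thread.
    name = ("IronPython, Version=2.7.0.40, Culture=neutral, "
            "PublicKeyToken=7f709c5b713576e1")
    print(parse_assembly_name(name)["Version"])
    print(version_matches(name, "2.7.9"))
```

A False result for "2.7.9" corresponds to the situation described here: the project references a 2.7.0 assembly while a 2.7.9 installation is present.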

saifu-rahman commented 5 years ago

My IronPython version is 2.7.9 and it is installed in the directory "C:\IronPython279". I also tried restoring my original "IronPython279" backup folder on the C drive; executing the line of code below then throws my original NumPy exception:

C# Code: dynamic chunkList = Nltk.Py.CallMethod(ne_chunkObj, "ne_chunk", taggedWordTokens.AsDynamic);

Exception: IronPython.Runtime.Exceptions.ImportException: No module named numpy.core.multiarray

So my conclusion is this: after running the solution commands, I got the new folders 'IronPython279\Libs\site-packages\scipy' and 'IronPython279\Libs\site-packages\numpy', but it seems that the numpy import in my C# program (numpy is a dependency of NLTK) expects a different version of IronPython?

    ipy ironpkg-1.0.0.py --install
    ironpkg scipy
    ironegg scipy-1.0.0-2.egg
    ironegg numpy-2.0.0-1.egg

GeorgeS2019 commented 2 years ago

Perhaps relevant