dirko / pyhacrf

Hidden alignment conditional random field for classifying string pairs.
BSD 3-Clause "New" or "Revised" License

build error under conda on Windows 7 #22

Closed Shotgunosine closed 9 years ago

Shotgunosine commented 9 years ago

I am trying to use pip to install pyhacrf under conda on Windows 7 and it is giving me the following error. Any help with a workaround would be greatly appreciated.

C:\Users\User\AppData\Local\Continuum\Anaconda>pip install pyhacrf
Collecting pyhacrf
  Using cached pyhacrf-0.0.12.tar.gz
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.9 in c:\users\User\appdata\local\continuum\anaconda\lib\site-p
ackages (from pyhacrf)
Requirement already satisfied (use --upgrade to upgrade): PyLBFGS>=0.1.3 in c:\users\User\appdata\local\continuum\anaconda\lib\si
te-packages (from pyhacrf)
Building wheels for collected packages: pyhacrf
  Running setup.py bdist_wheel for pyhacrf
  Complete output from command C:\Users\User\AppData\Local\Continuum\Anaconda\python.exe -c "import setuptools;__file__='c:\\user
s\\User\\appdata\\local\\temp\\pip-build-rojt7c\\pyhacrf\\setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __f
ile__, 'exec'))" bdist_wheel -d c:\users\User\appdata\local\temp\tmpqnoh1dpip-wheel-:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-2.7
  creating build\lib.win-amd64-2.7\pyhacrf
  copying pyhacrf\feature_extraction.py -> build\lib.win-amd64-2.7\pyhacrf
  copying pyhacrf\pyhacrf.py -> build\lib.win-amd64-2.7\pyhacrf
  copying pyhacrf\state_machine.py -> build\lib.win-amd64-2.7\pyhacrf
  copying pyhacrf\__init__.py -> build\lib.win-amd64-2.7\pyhacrf
  running build_ext
  Looking for python27.dll
  building 'pyhacrf.algorithms' extension
  C compiler: gcc -m64 -g -DNDEBUG -DMS_WIN64 -O2 -Wall -Wstrict-prototypes

  creating build\temp.win-amd64-2.7
  creating build\temp.win-amd64-2.7\Release
  creating build\temp.win-amd64-2.7\Release\pyhacrf
  compile options: '-D__MSVCRT_VERSION__=0x0900 -IC:\\Users\\User\\AppData\\Local\\Continuum\\Anaconda\\lib\\site-packages\\numpy
\\core\\include -IC:\Users\User\AppData\Local\Continuum\Anaconda\lib\site-packages\numpy\core\include -IC:\Users\User\AppData\
Local\Continuum\Anaconda\include -IC:\Users\User\AppData\Local\Continuum\Anaconda\PC -c'
  gcc -m64 -g -DNDEBUG -DMS_WIN64 -O2 -Wall -Wstrict-prototypes -D__MSVCRT_VERSION__=0x0900 -IC:\\Users\\User\\AppData\\Local\\Co
ntinuum\\Anaconda\\lib\\site-packages\\numpy\\core\\include -IC:\Users\User\AppData\Local\Continuum\Anaconda\lib\site-packages\nu
mpy\core\include -IC:\Users\User\AppData\Local\Continuum\Anaconda\include -IC:\Users\User\AppData\Local\Continuum\Anaconda\PC
-c pyhacrf/algorithms.c -o build\temp.win-amd64-2.7\Release\pyhacrf\algorithms.o
  Found executable C:\Users\User\AppData\Local\Continuum\Anaconda\Scripts\gcc.bat
  gcc -m64 -g -shared build\temp.win-amd64-2.7\Release\pyhacrf\algorithms.o -LC:\\Users\\User\\AppData\\Local\\Continuum\\Anacond
a\\lib\\site-packages\\numpy\\core\\lib -LC:\Users\User\AppData\Local\Continuum\Anaconda\libs -LC:\Users\User\AppData\Local\Co
ntinuum\Anaconda\PCbuild\amd64 -lnpymath -lpython27 -lmsvcr90 -o build\lib.win-amd64-2.7\pyhacrf\algorithms.pyd
  Warning: .drectve `/manifestdependency:"type='win32' name='Microsoft.VC90.CRT' version='9.0.21022.8' processorArchitecture='amd64'
 publicKeyToken='1fc8b3b9a1e18e3b'" /DEFAULTLIB:"python27.lib" /DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized
  Warning: .drectve `/manifestdependency:"type='win32' name='Microsoft.VC90.CRT' version='9.0.21022.8' processorArchitecture='amd64'
 publicKeyToken='1fc8b3b9a1e18e3b'" /DEFAULTLIB:"python27.lib" /DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized
  C:\\Users\\User\\AppData\\Local\\Continuum\\Anaconda\\lib\\site-packages\\numpy\\core\\lib/npymath.lib(build/temp.win-amd64-2.7
/build/src.win-amd64-2.7/numpy/core/src/npymath/npy_math.obj):(.text+0x2e3): undefined reference to `__imp_modff'
  collect2.exe: error: ld returned 1 exit status
  error: Command "gcc -m64 -g -shared build\temp.win-amd64-2.7\Release\pyhacrf\algorithms.o -LC:\\Users\\User\\AppData\\Local\\Co
ntinuum\\Anaconda\\lib\\site-packages\\numpy\\core\\lib -LC:\Users\User\AppData\Local\Continuum\Anaconda\libs -LC:\Users\User\
AppData\Local\Continuum\Anaconda\PCbuild\amd64 -lnpymath -lpython27 -lmsvcr90 -o build\lib.win-amd64-2.7\pyhacrf\algorithms.pyd" fai
led with exit status 1

  ----------------------------------------
  Failed building wheel for pyhacrf

fgregg commented 9 years ago

Reading this, it looks like the problem is in npy_math.obj:

  C:\\Users\\User\\AppData\\Local\\Continuum\\Anaconda\\lib\\site-packages\\numpy\\core\\lib/npymath.lib(build/temp.win-amd64-2.7
/build/src.win-amd64-2.7/numpy/core/src/npymath/npy_math.obj):(.text+0x2e3): undefined reference to `__imp_modff'

Can you look for help from conda on this?

fgregg commented 9 years ago

This suggests that maybe it's a problem in a missing math library: http://www.pgroup.com/userforum/viewtopic.php?t=4031&sid=35ae77e42eac59d17a3d8a16e2e9e6c7

dirko commented 9 years ago

I'm having difficulties with my windows VM so can't immediately try to reproduce. Do you get the same with Python 3?

dirko commented 9 years ago

On a clean Windows 7 I installed Anaconda-2.3.0-Windows-x86_64, then VCForPython27 and then pip install pyhacrf without any problems :/

Shotgunosine commented 9 years ago

Hmm, I didn't know about VCForPython27 until your comment. I tried installing it, but I'm still getting the same error.

I've tried three or four of the solutions on: http://stackoverflow.com/questions/26140192/microsoft-visual-c-compiler-for-python-2-7, but in every case I am still getting the same error referencing the same gcc command.

I have tried running the pip install command from the Anaconda terminal, the Visual C++ 2008 64-bit Command Prompt, and CMD.exe.

Is it possible that numpy has to be compiled using VCForPython27?

fgregg commented 9 years ago

@Shotgunosine it looks like you are using gcc not msvc, no?

Shotgunosine commented 9 years ago

@fgregg That's what pip keeps defaulting to. How can I force it to use MSVC?

fgregg commented 9 years ago

Well, I would consider it a bug that pyhacrf won't compile under Windows with gcc, but I'm not sure the bug is in pyhacrf.

In general, look at this thread for how to change compilers.

Shotgunosine commented 9 years ago

I think the link didn't post correctly?

fgregg commented 9 years ago

https://github.com/pypa/pip/issues/18

Shotgunosine commented 9 years ago

Alright, going to the directory and manually calling python setup.py build --compiler=msvc seems to work, though it produces the following message: No module named msvccompiler in numpy.distutils; trying from distutils

I am able to import pyhacrf, but when I try to run the example code, I get the following error:

In [17]: model.fit(training_X_extracted,training_y)
Exception ValueError: "Buffer dtype mismatch, expected 'long' but got 'long long'" in 'lbfgs._lowlevel.call_eval' ignored

In [18]: predictions = model.predict(training_X_extracted)

In [19]: print(confusion_matrix(training_y,predictions))
[[0 3]
 [0 2]]

In [20]: print(model.predict_proba(training_X_extracted))
[[ 0.5  0.5]
 [ 0.5  0.5]
 [ 0.5  0.5]
 [ 0.5  0.5]
 [ 0.5  0.5]]

So it builds and installs, but doesn't seem to be working correctly. Maybe it's an error with the lbfgs installation?
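
One possible reading of the all-0.5 output above, sketched under an assumption this thread does not confirm: the optimizer exited early and left every model parameter at an initial value of zero. With equal scores for both classes, a softmax returns a uniform distribution. The `softmax` helper below is purely illustrative and is not part of pyhacrf.

```python
import math


def softmax(scores):
    """Normalize raw class scores into probabilities."""
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical case: both class scores are zero because no parameter
# was ever updated by the optimizer.
untrained = softmax([0.0, 0.0])
print(untrained)
```

If that assumption holds, every row of predict_proba would be identical regardless of the input pair, which matches the output shown.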

fgregg commented 9 years ago

No, I think it may be related to this https://github.com/scikit-learn/scikit-learn/issues/1709

fgregg commented 9 years ago

which we would fix in pyhacrf

dirko commented 9 years ago

Cool glad you can compile it. Cannot reproduce the Exception however. @fgregg can you reproduce it?

fgregg commented 9 years ago

No, but I don't have a VM for windows handy.

dirko commented 9 years ago

Ah okay I get the exception when running in the Anaconda IPython console, but not when running Python from the Command Prompt.

Shotgunosine commented 9 years ago

I'm glad you've been able to reproduce it. I am getting the error both when I call it from python in cmd.exe and in an IPython console, but if you can reproduce it in IPython, hopefully you can track it down.

This is probably obvious to you, but by dropping in some 1/0 to break the fit call at various places I've narrowed it down to this line:

final_betas = optimizer.minimize(_objective_copy_gradient,
                                 x0=self.parameters.flatten(),
                                 progress=None)

In other words, the exception is not thrown if I break the code before I reach that line, but it is if I break right after that line.

dirko commented 9 years ago

Cool thanks @Shotgunosine. It seems that if we (in pyhacrf.py) change the dtype of index_array in _construct_sparse_features to 'int32', and states_to_classes in forward_backward also to int32, then it works. So I guess it has something to do with the default dtypes.

I'll have to investigate a bit further to get a fix that will work everywhere. I don't have time today, so I will look at it again tomorrow.

dirko commented 9 years ago

It seems that it is because different platforms give different meanings to int and long etc.

Although fused types are a way around this in Cython, I'm going to specify int64 and float64 everywhere instead, because I've had underflow problems with float32s previously.
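
A minimal sketch of the platform difference described above, using only the standard library. It prints the width of the native C integer types on whatever machine runs it. On 64-bit Windows a C long is 4 bytes, while on 64-bit Linux and macOS it is 8, which is consistent with the "expected 'long' but got 'long long'" message earlier in this thread. The variable names are illustrative and do not come from pyhacrf.

```python
import ctypes
import struct

# Width in bytes of the native C integer types on the current platform.
c_long_bytes = ctypes.sizeof(ctypes.c_long)
c_longlong_bytes = ctypes.sizeof(ctypes.c_longlong)
pointer_bytes = struct.calcsize("P")

print("pointer:     %d bytes" % pointer_bytes)
print("C long:      %d bytes" % c_long_bytes)
print("C long long: %d bytes" % c_longlong_bytes)

# A fixed-width type has the same size everywhere, which is why naming
# int64 explicitly avoids the platform-dependent meaning of 'long'.
fixed_width_bytes = ctypes.sizeof(ctypes.c_int64)
print("int64:       %d bytes" % fixed_width_bytes)

if c_long_bytes != fixed_width_bytes:
    print("C long is narrower than int64 on this platform")
```

Running this on both a Windows and a Linux machine should show the same int64 width but different C long widths.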

Shotgunosine commented 9 years ago

Alright, let me know if you'd like me to test a new version after you make changes.

dirko commented 9 years ago

Cool, so if you want to try it, it is on the consistent-types branch.

(I'm also adding a pull request now)

Shotgunosine commented 9 years ago

Assuming I did everything correctly with git and installation, it seems like the changes do prevent the error from appearing, but I am still getting a different result from the example:

In [14]: print(confusion_matrix(training_y,predictions))
[[0 3]
 [2 0]]

In [15]: print(model.predict_proba(training_X_extracted))
[[ 0.94914812  0.05085188]
 [ 0.92397711  0.07602289]
 [ 0.86756034  0.13243966]
 [ 0.05438812  0.94561188]
 [ 0.02641275  0.97358725]]

Also, just FYI, I don't think that the example runs as written. In this line: print(confusion_matrix(y, predictions)), should y be training_y?

fgregg commented 9 years ago

That's because that code is out of date. You are getting what I'm getting.

Shotgunosine commented 9 years ago

In that case everything seems to be working. I had to add zip_safe=False to the setup.py because for some reason I was unable to unzip the zipped egg into a temp directory without getting permissions errors. Won't be a problem for people who are installing via pip though.