mattjj / pyhsmm

setup.py that works with pip and doesn't rely on Cython for source distributions #39

Closed yarden closed 8 years ago

yarden commented 9 years ago

A revised setup.py that: (i) works with pip install, (ii) correctly distributes Cython-generated files when making a source distribution, and (iii) doesn't rely on Cython being installed when installing a source distribution (just compiles the Cython-generated cpp code).

A couple of other changes in this pull request:

  1. Moved deps/ outside of pyhsmm/pyhsmm/
  2. Added a clean option to setup.py that removes Cython-generated code (.cpp, .c, *.so) and build-related files. Intended to be used only by developers - very useful for testing code. Note that this relies on all the .cpp/.c files in pyhsmm/ being Cython-generated.
  3. Removed Cython as an installation dependency of pyhsmm. It's not necessary for automatic installations such as those from pypi.
  4. Added notes for developers in developer_notes.md - if these are useful they can be incorporated into the README.md or to a separate developer's pyhsmm guide if you make one.

Here's how I tested the source distribution creation/installation:

# get rid of old stuff
python setup.py clean
# make source distribution 
python setup.py sdist 
# send dist/pyhsmm-0.x.x.tar.gz to a fresh box
# ...
# try to install it on fresh box
tar -xzvf pyhsmm-0.x.x.tar.gz
cd pyhsmm-0.x.x/
# this will install by compiling Cython-generated source code, without needing Cython or using *.pyx files at all
pip install .

To test the installation from the github repo, I just did:

# in github repository pyhsmm/
python setup.py clean 
pip install .

This will apply Cython to the *.pyx files to generate the cpp code, and then install the package.

If both of these work, then we're good for both the automatic installation with pip and the developer's build case.

If that sounds good to you, I can make a similar fix for pybasicbayes (to make it handle Cython correctly), and then add travis CI support for pyhsmm to make the testing more streamlined. The only missing ingredient then would be a unit testing main driver script.

mattjj commented 9 years ago

Since cythonize checks timestamps, why is having a clean command necessary?

(If it is a good idea, it may be better to remove only .c/.cpp files that correspond to .pyx files instead of assuming all .c/.cpp files should be removed. A good way to do that would be to map a 'clean' function over the extension list that is already created by globbing for .pyx files. But I'd like to understand why it's necessary first.)
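The glob-based cleaning suggested above could look roughly like the sketch below. It is only an illustration: the helper names `generated_sources` and `clean_generated` are hypothetical and not part of pyhsmm's setup.py, and it assumes Cython writes its output next to each .pyx file with the same base name. It uses recursive globbing, which needs Python 3.5 or later.

```python
import os
from glob import glob


def generated_sources(pyx_path):
    """Return the .c/.cpp paths Cython could have written next to pyx_path."""
    base, _ = os.path.splitext(pyx_path)
    return [base + ext for ext in (".c", ".cpp")]


def clean_generated(pyx_paths, dry_run=False):
    """Remove generated C/C++ files that sit next to the given .pyx files.

    Hand-written .c/.cpp files without a matching .pyx are never touched.
    Returns the list of paths that were (or would be) removed.
    """
    removed = []
    for pyx_path in pyx_paths:
        for candidate in generated_sources(pyx_path):
            if os.path.isfile(candidate):
                if not dry_run:
                    os.remove(candidate)
                removed.append(candidate)
    return removed


if __name__ == "__main__":
    # Assumed layout: all .pyx files live somewhere under pyhsmm/.
    pyx_files = glob(os.path.join("pyhsmm", "**", "*.pyx"), recursive=True)
    for path in clean_generated(pyx_files, dry_run=True):
        print("Would remove", path)
```

Because the candidates are derived from the .pyx list, a hand-written .cpp file that has no matching .pyx survives the clean, which is the point of the suggestion.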

mattjj commented 9 years ago

Can you explain what you mean by "(ii) correctly distributes Cython-generated files when making a source distribution"? What does the current setup.py do incorrectly?

yarden commented 9 years ago

The clean does two things: (1) It cleans up the build/ files. My experience is that if you install with a package manager (e.g. from the github repo), it creates build/ and egg files that trick the package manager into thinking the package is already installed, even after you make changes to the code. Even if you changed something in the Cython, rerunning the package manager won't install it, because the package manager has no way of knowing your *.pyx files changed. (2) It wipes out the Cython-generated files. I agree that if Cython correctly tracks dependencies with timestamps then this isn't necessary; I guess I just don't have as much faith in the tool always getting it right. Before running python setup.py sdist I always want to be sure the Cython-generated files are not from some old run, so it's a precaution. I'm happy to drop that part if you prefer to rely on Cython. I agree that if we keep it, it's better to go based on the glob. Anyway, clean is just a convenience feature for pyhsmm developers; no user would ever run it.
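For reference, the timestamp idea under discussion can be illustrated with a few lines of Python. This is a deliberately simplified sketch, not Cython's actual logic (cythonize also looks at dependencies of each .pyx file); the function name `is_stale` is made up for the example.

```python
import os


def is_stale(source_path, generated_path):
    """Return True if generated_path is missing or older than source_path."""
    if not os.path.exists(generated_path):
        return True
    return os.path.getmtime(generated_path) < os.path.getmtime(source_path)
```

A developer who doesn't want to wipe everything could run a check like this on each .pyx/.cpp pair before `python setup.py sdist` and regenerate only when it returns True.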

yarden commented 9 years ago

In those three features (including "(ii) correctly distributes Cython-generated files when making a source distribution") I just meant to describe the features of the setup.py in that pull request. One difference from the previous version is that my setup.py checks whether the *.cpp files are present, and if not, it skips building those extensions while warning the user. It also does not depend on Cython in any way. I didn't play much with the previous version because I could not install it with pip.
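The check-and-skip behavior described here could be sketched as follows. This is not the code from the pull request; `existing_cpp_sources` and `module_name` are hypothetical helpers, and the warning text simply mirrors the message that appears in the logs later in this thread.

```python
import os


def existing_cpp_sources(pyx_paths, warn=print):
    """Map each .pyx path to its generated .cpp file, skipping missing ones."""
    sources = []
    for pyx_path in pyx_paths:
        cpp_path = os.path.splitext(pyx_path)[0] + ".cpp"
        if os.path.isfile(cpp_path):
            sources.append(cpp_path)
        else:
            # Warn instead of failing, so the rest of setup.py can continue.
            warn("Warning: could not find %s\n  - Skipping" % cpp_path)
    return sources


def module_name(source_path):
    """Turn 'pyhsmm/util/cstats.cpp' into the dotted name 'pyhsmm.util.cstats'."""
    base = os.path.splitext(source_path)[0]
    return base.replace(os.sep, ".").replace("/", ".")
```

In a setup.py, each returned path would presumably be wrapped in a setuptools `Extension(module_name(path), [path], ...)`, so that no Cython import is needed when the .cpp files already exist.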

yarden commented 9 years ago

Please hold off on this pull request. Need more extensive testing...

yarden commented 9 years ago

OK, finally ready. I made some changes and it'd be great to get your input.

python -m unittest discover pyhsmm

This line will just run any file anywhere in the package that's named test_*.py. Before, it would run pyhsmm/testing/test_hmm_geweke.py and pyhsmm/testing/test_hmm_likelihood.py but those don't run anything.

Here's a complete log of how I tested this version of the code, first in developer-mode and then as a user of stable release.

Testing as a developer

Starting in the pyhsmm git repository directory.

Clean files (optional):

$ python setup.py clean

Now install as developer:

$ python setup.py build_ext --with-cython
Using Cython..
Compiling pyhsmm/internals/hmm_messages_interface.pyx because it changed.
Compiling pyhsmm/internals/hsmm_messages_interface.pyx because it changed.
Compiling pyhsmm/util/cstats.pyx because it changed.
Cythonizing pyhsmm/internals/hmm_messages_interface.pyx
Cythonizing pyhsmm/internals/hsmm_messages_interface.pyx
Cythonizing pyhsmm/util/cstats.pyx
/Users/yarden/anaconda/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running build_ext
building 'pyhsmm.pyhsmm.internals.hmm_messages_interface' extension
creating build/temp.macosx-10.5-x86_64-2.7
creating build/temp.macosx-10.5-x86_64-2.7/pyhsmm
creating build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals
gcc -fno-strict-aliasing -I/Users/yarden/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Ideps/Eigen3/ -I/Users/yarden/anaconda/lib/python2.7/site-packages/numpy/core/include -I/Users/yarden/anaconda/include/python2.7 -c pyhsmm/internals/hmm_messages_interface.cpp -o build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals/hmm_messages_interface.o -std=c++11 -O3 -w -DNDEBUG -DHMM_TEMPS_ON_HEAP
gcc -bundle -undefined dynamic_lookup -L/Users/yarden/anaconda/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals/hmm_messages_interface.o -L/Users/yarden/anaconda/lib -o build/lib.macosx-10.5-x86_64-2.7/pyhsmm/pyhsmm/internals/hmm_messages_interface.so
building 'pyhsmm.pyhsmm.internals.hsmm_messages_interface' extension
gcc -fno-strict-aliasing -I/Users/yarden/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Ideps/Eigen3/ -I/Users/yarden/anaconda/lib/python2.7/site-packages/numpy/core/include -I/Users/yarden/anaconda/include/python2.7 -c pyhsmm/internals/hsmm_messages_interface.cpp -o build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals/hsmm_messages_interface.o -std=c++11 -O3 -w -DNDEBUG -DHMM_TEMPS_ON_HEAP
gcc -bundle -undefined dynamic_lookup -L/Users/yarden/anaconda/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals/hsmm_messages_interface.o -L/Users/yarden/anaconda/lib -o build/lib.macosx-10.5-x86_64-2.7/pyhsmm/pyhsmm/internals/hsmm_messages_interface.so
building 'pyhsmm.pyhsmm.util.cstats' extension
creating build/temp.macosx-10.5-x86_64-2.7/pyhsmm/util
gcc -fno-strict-aliasing -I/Users/yarden/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/yarden/anaconda/lib/python2.7/site-packages/numpy/core/include -I/Users/yarden/anaconda/include/python2.7 -c pyhsmm/util/cstats.cpp -o build/temp.macosx-10.5-x86_64-2.7/pyhsmm/util/cstats.o -O3 -w
gcc -bundle -undefined dynamic_lookup -L/Users/yarden/anaconda/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.5-x86_64-2.7/pyhsmm/util/cstats.o -L/Users/yarden/anaconda/lib -o build/lib.macosx-10.5-x86_64-2.7/pyhsmm/pyhsmm/util/cstats.so

Since I use virtual environments with pip, I add this to my virtual environment:

$ pip install --edit . 
Obtaining file:///Users/yarden/my_projects/pyhsmm
  Running setup.py (path:/Users/yarden/my_projects/pyhsmm/setup.py) egg_info for package from file:///Users/yarden/my_projects/pyhsmm
    Not using Cython. Building from C/C++ source...
    Making extension pyhsmm.internals.hmm_messages_interface
    Making extension pyhsmm.internals.hsmm_messages_interface
    Making extension pyhsmm.util.cstats

Requirement already satisfied (use --upgrade to upgrade): numpy in /Users/yarden/anaconda/lib/python2.7/site-packages (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): scipy in /Users/yarden/anaconda/lib/python2.7/site-packages (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): matplotlib in /Users/yarden/anaconda/lib/python2.7/site-packages (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): nose in /Users/yarden/anaconda/lib/python2.7/site-packages (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): pybasicbayes in /Users/yarden/anaconda/lib/python2.7/site-packages (from pyhsmm==0.1.3)
Installing collected packages: pyhsmm
  Running setup.py develop for pyhsmm
    Not using Cython. Building from C/C++ source...
    Making extension pyhsmm.internals.hmm_messages_interface
    Making extension pyhsmm.internals.hsmm_messages_interface
    Making extension pyhsmm.util.cstats

    building 'pyhsmm.internals.hmm_messages_interface' extension
    gcc -fno-strict-aliasing -I/Users/yarden/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Ideps/Eigen3 -I/Users/yarden/anaconda/lib/python2.7/site-packages/numpy/core/include -I/Users/yarden/anaconda/include/python2.7 -I/Users/yarden/anaconda/include/python2.7 -c pyhsmm/internals/hmm_messages_interface.cpp -o build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals/hmm_messages_interface.o -O3 -std=c++11 -DNDEBUG -w -DHMM_TEMPS_ON_HEAP
    gcc -bundle -undefined dynamic_lookup -L/Users/yarden/anaconda/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals/hmm_messages_interface.o -L/Users/yarden/anaconda/lib -o build/lib.macosx-10.5-x86_64-2.7/pyhsmm/internals/hmm_messages_interface.so
    building 'pyhsmm.internals.hsmm_messages_interface' extension
    gcc -fno-strict-aliasing -I/Users/yarden/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Ideps/Eigen3 -I/Users/yarden/anaconda/lib/python2.7/site-packages/numpy/core/include -I/Users/yarden/anaconda/include/python2.7 -I/Users/yarden/anaconda/include/python2.7 -c pyhsmm/internals/hsmm_messages_interface.cpp -o build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals/hsmm_messages_interface.o -O3 -std=c++11 -DNDEBUG -w -DHMM_TEMPS_ON_HEAP
    gcc -bundle -undefined dynamic_lookup -L/Users/yarden/anaconda/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.5-x86_64-2.7/pyhsmm/internals/hsmm_messages_interface.o -L/Users/yarden/anaconda/lib -o build/lib.macosx-10.5-x86_64-2.7/pyhsmm/internals/hsmm_messages_interface.so
    building 'pyhsmm.util.cstats' extension
    gcc -fno-strict-aliasing -I/Users/yarden/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Ideps/Eigen3 -I/Users/yarden/anaconda/lib/python2.7/site-packages/numpy/core/include -I/Users/yarden/anaconda/include/python2.7 -I/Users/yarden/anaconda/include/python2.7 -c pyhsmm/util/cstats.cpp -o build/temp.macosx-10.5-x86_64-2.7/pyhsmm/util/cstats.o -O3 -std=c++11 -DNDEBUG -w -DHMM_TEMPS_ON_HEAP
    gcc -bundle -undefined dynamic_lookup -L/Users/yarden/anaconda/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.5-x86_64-2.7/pyhsmm/util/cstats.o -L/Users/yarden/anaconda/lib -o build/lib.macosx-10.5-x86_64-2.7/pyhsmm/util/cstats.so
    Creating /Users/yarden/anaconda/lib/python2.7/site-packages/pyhsmm.egg-link (link to .)
    pyhsmm 0.1.3 is already the active version in easy-install.pth

    Installed /Users/yarden/my_projects/pyhsmm
Successfully installed pyhsmm
Cleaning up...

Now run the unit tests:

$ python -m unittest discover pyhsmm
Running a test for HMMs...

This demo shows how HDP-HMMs can fail when the underlying data has state
persistence without some kind of temporal regularization (in the form of a
sticky bias or duration modeling): without setting the number of states to be
the correct number a priori, lots of extra states can be intsantiated.

BUT the effect is much more relevant on real data (when the data doesn't exactly
fit the model). Maybe this demo should use multinomial emissions...

.........................  [  25/100,    0.03sec avg, ETA 2.48 ]
.........................  [  50/100,    0.03sec avg, ETA 1.61 ]
.........................  [  75/100,    0.03sec avg, ETA 0.81 ]
.........................  [ 100/100,    0.03sec avg, ETA 0.00 ]

   0.03sec avg, 3.24 total

.........................  [  25/100,    0.03sec avg, ETA 2.46 ]
.........................  [  50/100,    0.03sec avg, ETA 1.68 ]
.........................  [  75/100,    0.03sec avg, ETA 0.82 ]
.........................  [ 100/100,    0.03sec avg, ETA 0.00 ]

   0.03sec avg, 3.19 total

----------------------------------------------------------------------
Ran 0 tests in 0.000s

OK

I consider this to mean it's working. We should add more unit tests for better coverage of the package, but for now I'm using this as the readout.
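A note on the "Ran 0 tests" line: the printed output most likely comes from module-level code that runs when discovery imports the test file, while unittest itself only counts methods on `unittest.TestCase` subclasses. A minimal sketch of a test that discovery would count is below. It does not use the pyhsmm API; `forward_likelihood` and `brute_force_likelihood` are stand-in functions written for this example, saved in a file named something like test_likelihood_sketch.py.

```python
import itertools
import unittest


def forward_likelihood(pi, A, B, obs):
    """Likelihood of obs under a discrete HMM, via the forward recursion."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def brute_force_likelihood(pi, A, B, obs):
    """Same likelihood, summing over every hidden state sequence."""
    n = len(pi)
    total = 0.0
    for states in itertools.product(range(n), repeat=len(obs)):
        p = pi[states[0]] * B[states[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[states[t - 1]][states[t]] * B[states[t]][obs[t]]
        total += p
    return total


class TestHMMLikelihood(unittest.TestCase):
    def test_forward_matches_brute_force(self):
        pi = [0.6, 0.4]                   # initial state distribution
        A = [[0.7, 0.3], [0.2, 0.8]]      # transition matrix
        B = [[0.9, 0.1], [0.3, 0.7]]      # emission probabilities
        obs = [0, 1, 1, 0]
        self.assertAlmostEqual(
            forward_likelihood(pi, A, B, obs),
            brute_force_likelihood(pi, A, B, obs),
        )
```

With a file like this in the package, `python -m unittest discover pyhsmm` should report one test instead of zero, which gives a readout that doesn't depend on eyeballing printed output.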

Now to test whether it works as a source distribution like the one that would be downloadable from pypi.

Testing as a user of source distribution/stable release

Clean all previous build/Cython-generated code:

$ python setup.py clean
Cleaning files...
Removing pyhsmm/internals/hmm_messages_interface.cpp
Removing pyhsmm/internals/hsmm_messages_interface.cpp
Removing pyhsmm/util/cstats.cpp
Removing pyhsmm.egg-info
Removing pyhsmm/internals/hmm_messages_interface.cpp
Removing pyhsmm/internals/hsmm_messages_interface.cpp
Removing pyhsmm/util/cstats.cpp
Removing pyhsmm.egg-info
Not using Cython. Building from C/C++ source...
Warning: could not find pyhsmm/internals/hmm_messages_interface.cpp
  - Skipping
Warning: could not find pyhsmm/internals/hsmm_messages_interface.cpp
  - Skipping
Warning: could not find pyhsmm/util/cstats.cpp
  - Skipping
/Users/yarden/anaconda/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running clean
removing 'build/temp.macosx-10.5-x86_64-2.7' (and everything under it)

Then create a source distribution:

$ python setup.py sdist
Using Cython..
Compiling pyhsmm/internals/hmm_messages_interface.pyx because it changed.
Compiling pyhsmm/internals/hsmm_messages_interface.pyx because it changed.
Compiling pyhsmm/util/cstats.pyx because it changed.
Cythonizing pyhsmm/internals/hmm_messages_interface.pyx
Cythonizing pyhsmm/internals/hsmm_messages_interface.pyx
Cythonizing pyhsmm/util/cstats.pyx
/Users/yarden/anaconda/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running sdist
running check
warning: sdist: standard file not found: should have one of README, README.txt

reading manifest template 'MANIFEST.in'
writing manifest file 'MANIFEST'
creating pyhsmm-0.1.3
creating pyhsmm-0.1.3/deps
creating pyhsmm-0.1.3/deps/Eigen3
creating pyhsmm-0.1.3/deps/Eigen3/Eigen
creating pyhsmm-0.1.3/deps/Eigen3/Eigen/src
creating pyhsmm-0.1.3/deps/Eigen3/Eigen/src/Cholesky
creating pyhsmm-0.1.3/deps/Eigen3/Eigen/src/CholmodSupport
creating pyhsmm-0.1.3/deps/Eigen3/Eigen/src/Core
creating pyhsmm-0.1.3/deps/Eigen3/Eigen/src/Core/arch
creating pyhsmm-0.1.3/deps/Eigen3/Eigen/src/Core/arch/AVX
[...snip...]
creating pyhsmm-0.1.3/pyhsmm
creating pyhsmm-0.1.3/pyhsmm/basic
creating pyhsmm-0.1.3/pyhsmm/examples
creating pyhsmm-0.1.3/pyhsmm/internals
creating pyhsmm-0.1.3/pyhsmm/plugins
creating pyhsmm-0.1.3/pyhsmm/testing
creating pyhsmm-0.1.3/pyhsmm/util
making hard links in pyhsmm-0.1.3...
hard linking setup.py -> pyhsmm-0.1.3
hard linking deps/Eigen3/.hgeol -> pyhsmm-0.1.3/deps/Eigen3
hard linking deps/Eigen3/.hgignore -> pyhsmm-0.1.3/deps/Eigen3
hard linking deps/Eigen3/.hgtags -> pyhsmm-0.1.3/deps/Eigen3
hard linking deps/Eigen3/CMakeLists.txt -> pyhsmm-0.1.3/deps/Eigen3
[...snip...]
hard linking pyhsmm/__init__.py -> pyhsmm-0.1.3/pyhsmm
hard linking pyhsmm/models.py -> pyhsmm-0.1.3/pyhsmm
hard linking pyhsmm/parallel.py -> pyhsmm-0.1.3/pyhsmm
hard linking pyhsmm/basic/__init__.py -> pyhsmm-0.1.3/pyhsmm/basic
hard linking pyhsmm/basic/abstractions.py -> pyhsmm-0.1.3/pyhsmm/basic
hard linking pyhsmm/basic/distributions.py -> pyhsmm-0.1.3/pyhsmm/basic
hard linking pyhsmm/basic/models.py -> pyhsmm-0.1.3/pyhsmm/basic
hard linking pyhsmm/examples/__init__.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/animation.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/concentration-resampling.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/example-data.txt -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/hmm-EM.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/hmm-separatetrans.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/hmm.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/hsmm-geo.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/hsmm-possiblechangepoints-meanfield.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/hsmm-possiblechangepoints-separatetrans.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/hsmm-possiblechangepoints.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/hsmm.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/meanfield.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/examples/svi.py -> pyhsmm-0.1.3/pyhsmm/examples
hard linking pyhsmm/internals/__init__.py -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hmm_messages.h -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hmm_messages_interface.cpp -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hmm_messages_interface.pyx -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hmm_states.py -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hsmm_inb_states.py -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hsmm_messages.h -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hsmm_messages_interface.cpp -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hsmm_messages_interface.pyx -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/hsmm_states.py -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/initial_state.py -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/nptypes.h -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/transitions.py -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/internals/util.h -> pyhsmm-0.1.3/pyhsmm/internals
hard linking pyhsmm/plugins/__init__.py -> pyhsmm-0.1.3/pyhsmm/plugins
hard linking pyhsmm/testing/__init__.py -> pyhsmm-0.1.3/pyhsmm/testing
hard linking pyhsmm/testing/test_hmm.py -> pyhsmm-0.1.3/pyhsmm/testing
hard linking pyhsmm/testing/test_hmm_geweke.py -> pyhsmm-0.1.3/pyhsmm/testing
hard linking pyhsmm/testing/test_hmm_likelihood.py -> pyhsmm-0.1.3/pyhsmm/testing
hard linking pyhsmm/util/__init__.py -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/cstats.cpp -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/cstats.pyx -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/general.py -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/plot.py -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/profiling.py -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/setup.py -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/stats.py -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/testing.py -> pyhsmm-0.1.3/pyhsmm/util
hard linking pyhsmm/util/text.py -> pyhsmm-0.1.3/pyhsmm/util
Creating tar archive
removing 'pyhsmm-0.1.3' (and everything under it)

Then transfer the source distribution (dist/pyhsmm-0.1.3.tar.gz) to a new box (in my case a Linux box).

The source distribution looks like this:

$ tar -xzvf pyhsmm-0.1.3.tar.gz
...
$ ls pyhsmm-0.1.3
PKG-INFO  deps/  pyhsmm/  setup.py  setuphelper.py

Now install it with pip. This should not depend on Cython, but simply compile the Cython-generated C++:

$ cd pyhsmm-0.1.3
$ pip install --edit . 
You are using pip version 6.0.7, however version 6.0.8 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Obtaining file:///home/unix/yarden/test/pyhsmm-0.1.3
    Not using Cython. Building from C/C++ source...
    Making extension pyhsmm.util.cstats
    Making extension pyhsmm.internals.hmm_messages_interface
    Making extension pyhsmm.internals.hsmm_messages_interface
Requirement already satisfied (use --upgrade to upgrade): numpy in /ahg/regevdata/users/yarden/myenv/lib/python2.7/site-packages (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): scipy in /broad/software/free/Linux/redhat_6_x86_64/pkgs/scipy_0.13.0-python-2.7.1-sqlite3-rtrees/lib/python2.7/site-packages (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): matplotlib in /broad/software/free/Linux/redhat_6_x86_64/pkgs/matplotlib_1.3.1-python-2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/matplotlib-1.3.1-py2.7-linux-x86_64.egg (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): nose in /broad/software/free/Linux/redhat_6_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): pybasicbayes in /ahg/regevdata/users/yarden/myenv/lib/python2.7/site-packages (from pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): python-dateutil in /ahg/regevdata/users/yarden/myenv/lib/python2.7/site-packages (from matplotlib->pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): tornado in /broad/software/free/Linux/redhat_6_x86_64/pkgs/matplotlib_1.3.1-python-2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/tornado-3.2.2-py2.7-linux-x86_64.egg (from matplotlib->pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): pyparsing!=2.0.0,>=1.5.6 in /broad/software/free/Linux/redhat_6_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/pyparsing-1.5.6-py2.7.egg (from matplotlib->pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): six in /ahg/regevdata/users/yarden/myenv/lib/python2.7/site-packages (from python-dateutil->matplotlib->pyhsmm==0.1.3)
Requirement already satisfied (use --upgrade to upgrade): backports.ssl-match-hostname in /broad/software/free/Linux/redhat_6_x86_64/pkgs/matplotlib_1.3.1-python-2.7.1-sqlite3-rtrees/lib/python2.7/site-packages/backports.ssl_match_hostname-3.4.0.2-py2.7.egg (from tornado->matplotlib->pyhsmm==0.1.3)
Installing collected packages: pyhsmm
  Running setup.py develop for pyhsmm
    Not using Cython. Building from C/C++ source...
    Making extension pyhsmm.util.cstats
    Making extension pyhsmm.internals.hmm_messages_interface
    Making extension pyhsmm.internals.hsmm_messages_interface
    building 'pyhsmm.util.cstats' extension
    gcc -pthread -fno-strict-aliasing -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/db_4.7.25/include -DNDEBUG -fPIC -Ideps/Eigen3 -I/ahg/regevdata/users/yarden/myenv/lib/python2.7/site-packages/numpy/core/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/include/python2.7 -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/include/python2.7 -c pyhsmm/util/cstats.cpp -o build/temp.linux-x86_64-2.7/pyhsmm/util/cstats.o -O3 -std=c++11 -DNDEBUG -w -DHMM_TEMPS_ON_HEAP
    g++ -pthread -shared -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9 -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/db_4.7.25/lib build/temp.linux-x86_64-2.7/pyhsmm/util/cstats.o -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/pyhsmm/util/cstats.so
    building 'pyhsmm.internals.hmm_messages_interface' extension
    gcc -pthread -fno-strict-aliasing -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/db_4.7.25/include -DNDEBUG -fPIC -Ideps/Eigen3 -I/ahg/regevdata/users/yarden/myenv/lib/python2.7/site-packages/numpy/core/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/include/python2.7 -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/include/python2.7 -c pyhsmm/internals/hmm_messages_interface.cpp -o build/temp.linux-x86_64-2.7/pyhsmm/internals/hmm_messages_interface.o -O3 -std=c++11 -DNDEBUG -w -DHMM_TEMPS_ON_HEAP
    g++ -pthread -shared -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9 -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/db_4.7.25/lib build/temp.linux-x86_64-2.7/pyhsmm/internals/hmm_messages_interface.o -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/pyhsmm/internals/hmm_messages_interface.so
    building 'pyhsmm.internals.hsmm_messages_interface' extension
    gcc -pthread -fno-strict-aliasing -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/db_4.7.25/include -DNDEBUG -fPIC -Ideps/Eigen3 -I/ahg/regevdata/users/yarden/myenv/lib/python2.7/site-packages/numpy/core/include -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/include/python2.7 -I/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/include/python2.7 -c pyhsmm/internals/hsmm_messages_interface.cpp -o build/temp.linux-x86_64-2.7/pyhsmm/internals/hsmm_messages_interface.o -O3 -std=c++11 -DNDEBUG -w -DHMM_TEMPS_ON_HEAP
    g++ -pthread -shared -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/tcltk8.5.9 -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/sqlite_3.7.5/lib -Wl,-rpath,/broad/software/free/Linux/redhat_5_x86_64/pkgs/db_4.7.25/lib build/temp.linux-x86_64-2.7/pyhsmm/internals/hsmm_messages_interface.o -L/broad/software/free/Linux/redhat_5_x86_64/pkgs/python_2.7.1-sqlite3-rtrees/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/pyhsmm/internals/hsmm_messages_interface.so
    Creating /ahg/regevdata/users/yarden/myenv/lib/python2.7/site-packages/pyhsmm.egg-link (link to .)
    pyhsmm 0.1.3 is already the active version in easy-install.pth
    Installed /home/unix/yarden/test/pyhsmm-0.1.3
Successfully installed pyhsmm-0.1.3

As the output above says, it's not using Cython, just compiling the Cython-generated code.

Now run the unit tests:

$ python -m unittest discover pyhsmm
Running a test for HMMs...

This demo shows how HDP-HMMs can fail when the underlying data has state
persistence without some kind of temporal regularization (in the form of a
sticky bias or duration modeling): without setting the number of states to be
the correct number a priori, lots of extra states can be intsantiated.

BUT the effect is much more relevant on real data (when the data doesn't exactly
fit the model). Maybe this demo should use multinomial emissions...

.........................  [  25/100,    0.05sec avg, ETA 4.07 ]
.........................  [  50/100,    0.05sec avg, ETA 2.53 ]
.........................  [  75/100,    0.05sec avg, ETA 1.23 ]
.........................  [ 100/100,    0.05sec avg, ETA 0.00 ]

   0.05sec avg, 4.84 total

.........................  [  25/100,    0.05sec avg, ETA 3.53 ]
.........................  [  50/100,    0.05sec avg, ETA 2.34 ]
.........................  [  75/100,    0.05sec avg, ETA 1.17 ]
.........................  [ 100/100,    0.05sec avg, ETA 0.00 ]

   0.05sec avg, 4.67 total

----------------------------------------------------------------------
Ran 0 tests in 0.000s

OK

So it looks like it's working. There's one part that I cannot figure out for the life of me, and after beating my head against it for two days I have given up. If in the above installation procedure I do pip install . instead of pip install --edit ., then it compiles but does not properly install the package and the unit test fails, because it can't find the pyx modules. I have no idea why --edit would make this difference -- if you have thoughts, I'd love to know. From the output, it looks like without --edit, pip thinks the package is installed after compilation and it doesn't place it properly in the site-packages directory. I hope that this won't affect users who do pip install pyhsmm. I cannot test whether it does cause a problem in that case because I can't test package installation from pypi, obviously.
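One thing that might be worth checking, though nothing in this thread confirms it as the cause: when tests are run from inside the source directory, Python puts the current directory first on the import path, so it may pick up the local pyhsmm/ folder (which has no compiled extensions after a plain pip install .) instead of the installed copy. The sketch below shows one way to see which file a package would be imported from. The helper name `package_origin` is made up, and `importlib.util.find_spec` requires Python 3.4 or later, so it would not run on the Python 2.7 setup used in the logs.

```python
import importlib.util


def package_origin(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Calling `package_origin("pyhsmm")` once from the source directory and once from some other directory would show whether the two runs resolve to the same copy of the package.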

yarden commented 9 years ago

Hi Matt, just wondering if you'd had a chance to look at it? Thanks, Yarden

mattjj commented 9 years ago

I've finally got a chance to take a look at this! I made a merge-able version on a local branch.

Here's my attempted re-outline of what this PR does:

  1. move Eigen to the top-level directory
  2. add a clean command to setup.py (Since cythonize should appropriately track pyx file modifications and re-generate sources, I think this is a just-to-be-sure convenience.)
  3. add two new test scripts which can be run with unittest, though the existing tests in this package and in pybasicbayes are meant to be run with nose
  4. something about pip installation and/or source distributions that I still don't understand!

Can you explain 4? I'm not sure what the setup.py in the current master is lacking; I use it both to build from cython sources and to create source distributions, and installing via pypi seems to work. Can you spell out the use case that isn't covered in master but is patched up by this PR?

I'm going to look into some tweaks to 2 and 3, namely the clean command should probably follow the distutils custom command style and the tests should probably just be nose-runnable.

mattjj commented 8 years ago

I wrote a new setup.py that should address most of these concerns (except it still has crappy default cleaning). I'm going to reread this thread to see what I missed.