exoplanet-dev / celerite2

Fast & scalable Gaussian Processes in one dimension
https://celerite2.readthedocs.io
MIT License

Unable to access pymc3 submodule #89

Open taylorbell57 opened 1 year ago

taylorbell57 commented 1 year ago

Hi folks, I just discovered this updated version of celerite and am excited to add celerite2's PyMC3 GP to the Eureka! package, as I'm finding more and more cases where a GP would be beneficial. We already have the original celerite GP working (at least kinda), as well as the george GP, for standard Python minimizers and samplers (emcee, scipy.optimize.minimize, dynesty). We also have a PyMC3 version of our fitting code, which allows one to use starry (ideal for eclipse mapping), and in general PyMC3's NUTS sampler has allowed for much faster fits (for me at least). I know that jax and PyMC4 offer advantages over the deprecated PyMC3 and theano implementation, but over the past year I've already spent more than fifty hours getting PyMC3 versions of all our astrophysical and systematic models implemented, I need to be able to use starry for the astrophysical model, and I'm looking for a fairly quick way to get GPs implemented. I was about to try PyMC3's built-in GP, but I saw in the old exoplanet docs that exoplanet had a faster GP implementation for PyMC3 sampling, and then noticed that the code had been migrated here to celerite2.

However, by default I am unable to access the pymc3 submodule of celerite2 that is mentioned in the documentation. I've tried installing the version on the main branch on GitHub (since v0.2.1 doesn't have the pymc3 code), but I get an error that 'celerite2' has no attribute 'pymc3':

> pip install celerite2[pymc3]@git+https://github.com/exoplanet-dev/celerite2
> ipython
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:23:19) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.14.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import celerite2
In [2]: celerite2.__version__
Out[2]: '0.3.0rc2.dev11+g74b0705'

In [3]: celerite2.pymc3
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[3], line 1
----> 1 celerite2.pymc3

AttributeError: module 'celerite2' has no attribute 'pymc3'

Looking at the package on GitHub I see the pymc3 folder under python/celerite2/pymc3, but it seems it isn't being imported in the __init__.py file. Just adding from celerite2 import pymc3 to the python/celerite2/__init__.py file wasn't enough to solve the problem either, giving me the following error message when installing:

Building wheels for collected packages: celerite2
  Building wheel for celerite2 (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for celerite2 (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [27 lines of output]
      running bdist_wheel
      running build
      running build_py
      copying python/celerite2/__init__.py -> build/lib.macosx-10.9-x86_64-cpython-39/celerite2
      copying python/celerite2/celerite2_version.py -> build/lib.macosx-10.9-x86_64-cpython-39/celerite2
      running egg_info
      writing python/celerite2.egg-info/PKG-INFO
      writing dependency_links to python/celerite2.egg-info/dependency_links.txt
      writing requirements to python/celerite2.egg-info/requires.txt
      writing top-level names to python/celerite2.egg-info/top_level.txt
      reading manifest template 'MANIFEST.in'
      warning: no directories found matching 'c++/vendor/eigen/Eigen'
      adding license file 'LICENSE'
      writing manifest file 'python/celerite2.egg-info/SOURCES.txt'
      running build_ext
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /Users/tjbell1/miniconda3/envs/eureka_starry/include -fPIC -O2 -isystem /Users/tjbell1/miniconda3/envs/eureka_starry/include -I/Users/tjbell1/miniconda3/envs/eureka_starry/include/python3.9 -c flagcheck.cpp -o flagcheck.o -std=c++17
      building 'celerite2.driver' extension
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /Users/tjbell1/miniconda3/envs/eureka_starry/include -fPIC -O2 -isystem /Users/tjbell1/miniconda3/envs/eureka_starry/include -Ic++/include -Ic++/vendor/eigen -Ipython/celerite2 -I/private/var/folders/f3/_h5lxm511fv9sjb5d3xrj0jm0000gq/T/pip-build-env-u9cav0rq/overlay/lib/python3.9/site-packages/pybind11/include -I/Users/tjbell1/miniconda3/envs/eureka_starry/include/python3.9 -c python/celerite2/driver.cpp -o build/temp.macosx-10.9-x86_64-cpython-39/python/celerite2/driver.o -std=c++17 -mmacosx-version-min=10.14 -fvisibility=hidden -g0
      In file included from python/celerite2/driver.cpp:6:
      In file included from python/celerite2/driver.hpp:8:
      In file included from c++/include/celerite2/celerite2.h:4:
      In file included from c++/include/celerite2/core.hpp:4:
      c++/include/celerite2/forward.hpp:4:10: fatal error: 'Eigen/Core' file not found
      #include <Eigen/Core>
               ^~~~~~~~~~~~
      1 error generated.
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for celerite2
Failed to build celerite2
ERROR: Could not build wheels for celerite2, which is required to install pyproject.toml-based projects

I was able to find the referenced Eigen package at https://eigen.tuxfamily.org/, and I downloaded the latest version and put the entire contents of that package in celerite2's (previously empty) c++/vendor/eigen folder. With that package downloaded and my edit to the __init__.py file, I'm now able to import and use celerite2.pymc3.

I thought I'd document my troubleshooting process here to help anyone else trying to do the same thing or to help the devs figure out what changes need to be made to the documentation and/or package.

vandalt commented 1 year ago

Hi @taylorbell57!

A few pointers that might help:

If installing from the main branch with python -m pip install ".[pymc3]", import celerite2.pymc3 should work because it is a subdirectory of celerite2.

Regarding the Eigen error, I'm unable to reproduce it (I remember having to install that library at some point in the past, so maybe that's why), but for me using the above instructions and an older version of Python (3.8) helped. With Python 3.11, pip was trying to rebuild numpy and that crashed...

EDIT: I just tried building from a clean clone and I do get the Eigen error, so I probably had done the steps you described above in the past because the directory was not empty on my older clone.

taylorbell57 commented 1 year ago

Hi @vandalt, nice to bump into you here!

  1. I did make sure I added [pymc3] as you described (as you can see at the top of my first code snippet above)
  2. Ah, I didn't realize there was a switch from import celerite2.theano to import celerite2.pymc3, so that could explain a part of my issue but not all (see next point)
  3. The issue is that the pymc3 sublibrary is not imported anywhere in the package, so when doing pip install pip doesn't know to copy that folder into the installed location (which will be something like ~/miniconda3/envs/YOUR_ENVIRONMENT/lib/python3.9/site-packages/celerite2/). If you look at that installed folder, you'll see that the pymc3 folder is missing. If you end up importing celerite2 from the cloned repo folder instead of from your installed directory (e.g. check where celerite2.__file__ points), then it may be able to import celerite2.pymc3 which might be why the bug has gone unnoticed so far.
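
For anyone wanting to run the check from point 3, here is a minimal sketch. The helper name describe_install is hypothetical (it is not part of celerite2); it only uses the standard library's importlib.util.find_spec, which locates a submodule without importing it, so the submodule's optional dependencies are not needed for the check itself.

```python
import importlib.util
from pathlib import Path


def describe_install(package, submodule):
    """Report where `package` would be imported from and whether
    `package.submodule` can be found there (hypothetical helper)."""
    try:
        pkg_spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        pkg_spec = None
    if pkg_spec is None or pkg_spec.origin is None:
        return {"package_dir": None, "has_submodule": False}

    # Directory that `import package` would actually use: site-packages
    # for a normal install, or a cloned repo if that shadows the install.
    package_dir = Path(pkg_spec.origin).parent

    # find_spec imports the parent package but not the submodule itself.
    try:
        sub_spec = importlib.util.find_spec(f"{package}.{submodule}")
    except (ImportError, ValueError):
        sub_spec = None
    return {"package_dir": package_dir, "has_submodule": sub_spec is not None}


if __name__ == "__main__":
    # In this thread the call of interest would be
    # describe_install("celerite2", "pymc3"); json is used here only
    # because it ships with every Python installation.
    print(describe_install("json", "decoder"))
```

If package_dir points at the cloned repository rather than site-packages, the import is being served from the clone and says nothing about what pip actually installed.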

dfm commented 1 year ago

Hi all — Sorry I missed this before! I'm mostly on parental leave still.

The issue is that the pymc3 sublibrary is not imported anywhere in the package, so when doing pip install pip doesn't know to copy that folder into the installed location (which will be something like ~/miniconda3/envs/YOUR_ENVIRONMENT/lib/python3.9/site-packages/celerite2/). If you look at that installed folder, you'll see that the pymc3 folder is missing. If you end up importing celerite2 from the cloned repo folder instead of from your installed directory (e.g. check where celerite2.__file__ points), then it may be able to import celerite2.pymc3 which might be why the bug has gone unnoticed so far.

This isn't true! setuptools_scm automatically discovers submodules using git. If you look at the source of the distribution on pypi, you'll see that the appropriate submodules are included and they're not imported anywhere. This is by design - import of one of the submodules shouldn't fail because of a missing dependency for a different one!

The issue that you're seeing seems to be related to the renaming of the module. (Annoying, I know! Maintaining a package through the implosion of the PyMC dev community hasn't been not annoying either :/ Sorry!)

For the Eigen issue: You're getting that because Eigen is included as a git submodule so you'll need to clone recursively as described here: https://celerite2.readthedocs.io/en/latest/user/install/#from-source
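
A quick way to tell whether a checkout is missing the submodule is to look for the header the compiler complained about. The sketch below assumes the layout implied by the build log in this thread (the -Ic++/vendor/eigen include flag plus #include <Eigen/Core>); the function names are hypothetical, and the suggested git command is the standard way to populate submodules in an existing, non-recursive clone.

```python
from pathlib import Path


def eigen_header_present(repo_root):
    """Return True if the vendored Eigen header exists in the checkout."""
    # The build passes -Ic++/vendor/eigen and includes <Eigen/Core>,
    # so the header is expected at c++/vendor/eigen/Eigen/Core.
    header = Path(repo_root) / "c++" / "vendor" / "eigen" / "Eigen" / "Core"
    return header.is_file()


def check_checkout(repo_root):
    """Return a short message saying whether the checkout looks buildable."""
    if eigen_header_present(repo_root):
        return "Eigen headers found"
    return (
        "Eigen headers missing; from the repository root try: "
        "git submodule update --init --recursive"
    )


if __name__ == "__main__":
    # Run from the root of a celerite2 clone.
    print(check_checkout("."))
```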

taylorbell57 commented 1 year ago

@dfm thanks for chiming in! I'm interested to know that setuptools_scm can do that — I ran into a similar issue with our Eureka! code and had to add a bunch of annoying try/except statements to allow partial installations. I'll definitely look more into that later! And thanks for pointing me to that installation point about Eigen!

I think some confusion had come from the theano->pymc3 change, but I noticed that there is still an issue even if I clone recursively and do pip install '.[pymc3]'. In particular, I am able to do

> import celerite2.pymc3
> celerite2.pymc3.GaussianProcess(...)

But I cannot do

> import celerite2
> celerite2.pymc3.GaussianProcess(...)

Is that supposed to be the intended behaviour?

dfm commented 1 year ago

Great! Thanks for the update. Yes, this is the expected behavior... the idea being that submodules are only "enabled" when they are explicitly imported.
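
This is general Python behavior rather than anything specific to celerite2: a submodule only becomes an attribute of its parent package once something has imported it. A self-contained sketch with a throwaway package (demo_pkg and extra are hypothetical names created on the fly):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. demo_pkg/__init__.py is empty,
# so it does not import demo_pkg/extra.py.
tmp = tempfile.mkdtemp()
pkg_dir = Path(tmp) / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "extra.py").write_text("VALUE = 42\n")
sys.path.insert(0, tmp)
importlib.invalidate_caches()

import demo_pkg

# The submodule exists on disk but has not been imported yet,
# so it is not an attribute of the package.
before = hasattr(demo_pkg, "extra")

import demo_pkg.extra

# Importing the submodule binds it as an attribute of the parent.
after = hasattr(demo_pkg, "extra")

print(before, after)
```

The same mechanism explains the two snippets above: import celerite2 alone never touches the pymc3 submodule, while import celerite2.pymc3 both loads it and makes celerite2.pymc3 resolvable.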

taylorbell57 commented 1 year ago

Gotcha, thanks for the clarification! You can close this issue now or when your PR is merged, up to you.