the-virtual-brain / tvb-root

Main TVB codebase
https://thevirtualbrain.org
Other
121 stars 102 forks source link

Tensor backend for PyMC #630

Closed e-richter closed 2 months ago

e-richter commented 1 year ago

This PR implements a TVB backend for a tensor library compatible with PyMC, previously Theano, now Aesara, future PyTensor. Actually using the backend productively with PyMC still requires some work on the user side.

maedoc commented 1 year ago

I fixed the theano dependency problem for the lib-tests. There're two remaining failures that I'm not sure about.

maedoc commented 1 year ago

Regarding the failing tests, do you have e.g. a dockerfile in which theano works correctly and tests are passing? I could try to massage the differences so the tests pass here?

maedoc commented 1 year ago

I had to setup a new environment withpip install aesara pytest numpy autopep8 numba mako tvb_data in order to get some tests passing. I also replaced import theano with import aesara. Do you want to switch over to aesara since it's the newer library?

Also, the tests seem to take a long time to run, is that expected?

maedoc commented 1 year ago

The tests now pass in a container, let's see what's caught by the CI runs.

e-richter commented 1 year ago

Thank you for taking care. I am not sure if pymc3 works with aesara. But you are right, we shouldn't use theano in the future, because it is not maintained.

e-richter commented 1 year ago

I just tried out out the current pymc3 implementation in tvb-inversion with the backend using aesara and works without problems. So we should continue with aesara for the backend.

maedoc commented 1 year ago

I am not sure if pymc3 works with aesara. But you are right, we shouldn't use theano in the future, because it is not maintained.

I wasn't able to get a combination of theano, numba and numpy for tvb working, but I agree aesara is the next step and PyMC devs just announced PyTensor which we can switch to when it's ready. The goal is to enable TVB PyMC compatibility so the underlying tensor lib isn't really important.

I just tried out out the current pymc3 implementation in tvb-inversion with the backend using aesara and works without problems. So we should continue with aesara for the backend.

Thanks for testing! and also for the demo notebooks, looks nice. As soon as the current build failures are taken care of, we'll merge this PR and have it in the next TVB release.

maedoc commented 1 year ago

Aesara seems to hard code Python header locations, I couldn't override with CFLAGS. On macOS with only the command line tools installed and not XCode, a hacky hack required:

sudo mkdir -p /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9
sudo ln -s /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/Headers /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/Headers

and the tests pass but then git refuses to run.. so have to sudo rm -r /Applications/Xcode.app to commit 😓

maedoc commented 1 year ago

After reverting the modifications to the oscillator model (which you can also include when you actually use the model), I have test failures which occur when evaluating the simulation graph e.g.

2022-12-12 10:01:08,402 - ERROR - aesara.graph.rewriting.basic - Rewrite failure due to: constant_folding
2022-12-12 10:01:08 [   ERROR] Rewrite failure due to: constant_folding (basic.py:1768)
2022-12-12 10:01:08,402 - ERROR - aesara.graph.rewriting.basic - node: Subtensor{int64, int64}(TensorConstant{[[[-0.7751..173383 ]]]}, ScalarConstant{2}, ScalarConstant{0})
2022-12-12 10:01:08 [   ERROR] node: Subtensor{int64, int64}(TensorConstant{[[[-0.7751..173383 ]]]}, ScalarConstant{2}, ScalarConstant{0}) (basic.py:1769)
2022-12-12 10:01:08,402 - ERROR - aesara.graph.rewriting.basic - TRACEBACK:
2022-12-12 10:01:08 [   ERROR] TRACEBACK: (basic.py:1770)
2022-12-12 10:01:08,402 - ERROR - aesara.graph.rewriting.basic - Traceback (most recent call last):
  File "/Users/duke/.local/share/virtualenvs/tvb_library-ii1LiA21/lib/python3.9/site-packages/aesara/graph/rewriting/basic.py", line 1933, in process_node
    replacements = node_rewriter.transform(fgraph, node)
  File "/Users/duke/.local/share/virtualenvs/tvb_library-ii1LiA21/lib/python3.9/site-packages/aesara/graph/rewriting/basic.py", line 1092, in transform
    return self.fn(fgraph, node)
  File "/Users/duke/.local/share/virtualenvs/tvb_library-ii1LiA21/lib/python3.9/site-packages/aesara/tensor/rewriting/basic.py", line 1142, in constant_folding
    required = thunk()
  File "/Users/duke/.local/share/virtualenvs/tvb_library-ii1LiA21/lib/python3.9/site-packages/aesara/link/c/op.py", line 103, in rval
    thunk()
  File "/Users/duke/.local/share/virtualenvs/tvb_library-ii1LiA21/lib/python3.9/site-packages/aesara/link/c/basic.py", line 1788, in __call__
    raise exc_value.with_traceback(exc_trace)
IndexError: index out of bounds

despite setting aesara.config.mode='DebugMode' there's not really any useful information here. @e-richter @dionperd can you help with this? How are you debugging the model code?

A second comment is that the time loop needs to be written as a scan, otherwise the graph will be enormous: the deterministic Euler model with the oscillator is 135 graph nodes per time step.

e-richter commented 1 year ago

Does this error appear for a specific line of the templates? I ask this, because the .eval() function for getting n_node in the coupling template (theano-coupling.py.mako) could cause this error. You are right about the scan function. We also use this for loops in pymc3. I will change this.

maedoc commented 1 year ago

Does this error appear for a specific line of the templates?

No, there's no further lines in the traceback, but stepping through the unit test, it happens on yh.eval() in the sim tests. I suppose it's an indexing error somewhere in the template code, and I fixed one but there seems to be another I didn't find yet.

e-richter commented 1 year ago

Alright, thank you. Could it be due to the changes of the oscillator variables of interest? I can have a look at it while implementing the scan function for looping.

maedoc commented 1 year ago

Alright, thank you. Could it be due to the changes of the oscillator variables of interest?

Certainly, I removed those changes because we prefer not to change the historical defaults unless really necessary, and many basic tests of the simulator were written assuming that list hasn't changed, so I reverted that. In usage, you can do

osc = Oscillator(variables_of_interest=["V", "W"])

and since that broke the test here, it suggests that something was hard coded that should not be.

e-richter commented 1 year ago

The sim tests are now passing. The issue was that the initial state for simulation was created using the history buffer, which only stores the cvars. This was no problem before, because both state variables were used as cvars (I don't remember why I changed it for the oscillator model). I still need to implement the scan function for integration now.

maedoc commented 3 months ago

@e-richter do you want to fix this up or shall we close the PR? while a autograd tensor backend for TVB is worthwhile, I have been more focused on Jax and odn't really want to support two such libraries.

e-richter commented 2 months ago

@e-richter do you want to fix this up or shall we close the PR? while a autograd tensor backend for TVB is worthwhile, I have been more focused on Jax and odn't really want to support two such libraries.

Thank you for asking. We can close this PR as we also switched the focus in our lab.