perrin-isir / xpag

a modular reinforcement learning library with JAX agents
BSD 3-Clause "New" or "Revised" License
22 stars 5 forks source link

Build error for pytinyrenderer wheel #4

Closed stephane-caron closed 1 year ago

stephane-caron commented 1 year ago

I ran into the following error while following the conda installation instructions:

      third_party/pybind11/include/pybind11/cast.h:446:36: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
        446 |                 "  " + handle(frame->f_code->co_filename).cast<std::string>() +
            |                                    ^~

Here is the full log: full-log.txt

This seems to be an issue with upstream dependencies not being fully ready for Python 3.11? :thinking:

stephane-caron commented 1 year ago

Alternatively the environment could install more things from conda-forge, e.g. numpy, pytinyrenderer, ... Those should work together fine (and be verified by package maintainers rather than downstream users :sweat_smile:)

stephane-caron commented 1 year ago

This seems to be an issue with upstream dependencies not being fully ready for Python 3.11? thinking

Just tried with python=3.9 and the wheel builds fine there.

perrin-isir commented 1 year ago

I tried with python=3.11 and got no issue, but, strangely, the environment ended up having python version 3.10.9.

Then I tried to follow your suggestion to install more things from conda-forge, so I exchanged the defaults and conda-forge channels in environment.yaml, and I got the exact same issue as you did. I'm trying to understand what happened.

stephane-caron commented 1 year ago

Okay, this makes sense. I have conda-forge configured in my .condarc channels, it likely has the same effect as swapping channels in environment.yaml.

perrin-isir commented 1 year ago

Yes, definitely. Just installing pytinyrenderer from conda-forge in python 3.11 (conda install -c conda-forge pytinyrenderer) works fine, so the problem comes from somewhere else.

stephane-caron commented 1 year ago

I got a good result from this one:

name:
channels:
    - conda-forge
dependencies:
    - Pillow >=9.0.1
    - brax >=0.0.10
    - conda >=4.11.0
    - gym >=0.26.0
    - ipywidgets >=7.6.5
    - jax >=0.3.23
    - joblib >=1.1.0
    - matplotlib >=3.1.3
    - mediapy  >= 1.1.4
    - numpy >=1.21.5
    - optax >=0.1.2
    - pip >=21.3.1
    - psutil >=5.8.0
    - python >=3.9
    - tensorflow-probability >=0.15.0
    - pip:
        - flax >=0.6.3

Where by "good", I mean:

$ conda activate xpagenv
$ python -c "import jax; print(jax.lib.xla_bridge.get_backend().platform)"
gpu
perrin-isir commented 1 year ago

That's nice, I didn't manage to get a direct conda install of jax on gpu up to now! About the previous problem: there's directly an issue with brax in python 3.11. If you create a 3.11 environment with conda create -n testenv python=3.11, and then try to pip install brax in it, you get the typedef struct _frame PyFrameObject error. You don't get this error in python 3.10 (with conda create -n testenv python=3.10). So for the moment I will:

perrin-isir commented 1 year ago

The result of your suggestion is very nice, but for some strange reason conda is taking ages to solve the environment... I guess it's faster with mamba. I didn't want to impose mamba to the users, but well, this might be the best solution.

perrin-isir commented 1 year ago

For the moment I ended up with an intermediate solution :

name:
channels:
- conda-forge
dependencies:
- python>=3.9,<3.11
- pip>=21.3.1
- conda>=4.11.0
- jax>=0.3.23
- psutil>=5.8.0
- numpy>=1.21.5
- matplotlib>=3.1.3
- joblib>=1.1.0
- Pillow>=9.0.1
- ipywidgets>=7.6.5    
- mediapy >= 1.1.4
- pip:
    - tensorflow-probability>=0.15.0
    - gym>=0.26.0
    - optax>=0.1.2
    - brax>=0.0.10
    - flax>=0.6.3

because otherwise, when running an example, I had issues with the jax substrate of tensorflow_probability (possibly due to circular imports).

stephane-caron commented 1 year ago

I tried this last environment, it's just missing tensorflow. I additionally installed it by:

micromamba install tensorflow  # it picked 2.11.0

Then this snippet (reproducing the substrate error) worked :ok_hand:

perrin-isir commented 1 year ago

I think installing tensorflow is not a good solution, because nothing in xpag requires tensorflow, and it's pretty big so I'd rather avoid adding this dependency. With the JAX substrate, tensorflow_probability should work well without installing tensorflow (cf TensorFlow_Probability_on_JAX).

stephane-caron commented 1 year ago

You're right, I removed tensorflow, the reduced snippet still works out:

import tensorflow_probability
import jax
print(f"{jax.__version__=}")
print(f"{tensorflow_probability.__version__=}")

import tensorflow_probability.substrates.jax as tfp
print("loaded!")

The substrate error is solved and this moves my installation to https://github.com/perrin-isir/xpag/issues/5 as well. So, as far as I can tell this issue is solved, thanks :slightly_smiling_face: