matthewwardrop / formulaic

A high-performance implementation of Wilkinson formulas for Python.
MIT License

Incompatibility with pandas development version #185

Open: MatthiasSchmidtblaicherQC opened this issue 2 months ago

MatthiasSchmidtblaicherQC commented 2 months ago

Just a heads-up that the current model matrix instantiation is likely incompatible with the pandas development version (i.e. the future pandas 3.0.0). To check this, you first need to create an environment with formulaic and the pandas development version:

micromamba create --name pandas_dev240427 python=3.11 formulaic=1.0.1 ipykernel
micromamba activate pandas_dev240427
micromamba remove -y --force pandas
pip install --pre --upgrade --extra-index-url "https://pypi.anaconda.org/scientific-python-nightly-wheels/simple" pandas

The pandas version should look something like this:

pandas.__version__  # '3.0.0.dev0+807.ga1fc8e8147'
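
To confirm programmatically that the environment really picked up a nightly build, a rough sketch like the following could be used. The helper name describe_version is hypothetical (it is not part of pandas or formulaic), and it only assumes that the version string starts with a MAJOR.MINOR.PATCH release and that development builds carry a ".dev" segment, as in the string above.

```python
import re


def describe_version(version: str) -> dict:
    """Split a version string into its release tuple and a dev-build flag.

    Hypothetical helper for illustration only; it assumes the string
    starts with MAJOR.MINOR.PATCH and that dev builds contain ".dev".
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return {
        "release": (major, minor, patch),
        "is_dev": ".dev" in version,
    }


# Version string taken from the report above; in a live session you would
# pass pandas.__version__ instead of the literal.
info = describe_version("3.0.0.dev0+807.ga1fc8e8147")
print(info["release"], info["is_dev"])
```

If is_dev is False, or the release tuple is below (3, 0, 0), the nightly wheel was most likely not installed and the example below would not be expected to fail.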

Running the example from the README

import pandas
from formulaic import Formula

df = pandas.DataFrame({
    'y': [0,1,2],
    'x': ['A', 'B', 'C'],
    'z': [0.3, 0.1, 0.2],
})

y, X = Formula('y ~ x + z').get_model_matrix(df)

raises FormulaMaterializerNotFoundError: No materializer has been registered for input type 'pandas.DataFrame'. Available input types are: set().

This first showed up here.

matthewwardrop commented 2 months ago

Thanks @MatthiasSchmidtblaicherQC! This has been fixed in the main branch, but I haven't released the fix yet because I figured pandas 3.0 was still a pre-release that few people would be using. Sounds like I should push it out!

MatthiasSchmidtblaicherQC commented 2 months ago

I am glad to hear that it has been fixed. From my side, there is no need to expedite a release.