PythonPredictions / cobra

A Python package to build predictive linear and logistic regression models focused on performance and interpretation
https://pythonpredictions.github.io/cobra.io
MIT License

Short term pandas 2.0 support: avoid pandas==2.0 installation. #159

Closed by sandervh14 1 year ago

sandervh14 commented 1 year ago

Task Title

Task: Short term pandas 2.0 support: avoid pandas==2.0 installation.

Task Description

On installing a fresh conda environment for a cobra experiment just now, I got this error during pandas preprocessing:

File ~/.conda/envs/cobra/lib/python3.9/site-packages/cobra/preprocessing/kbins_discretizer.py:330, in KBinsDiscretizer._transform_column(self, data, column_name, bins)
    324 data.loc[:, column_name_bin] = (data[column_name_bin]
    325                                 .cat.rename_categories(bin_labels))
    327 if data[column_name_bin].isnull().sum() > 0:
    328 
    329     # Add an additional bin for missing values
--> 330     data[column_name_bin].cat.add_categories(["Missing"], inplace=True)
    332     # Replace NULL with "Missing"
    333     # Otherwise these will be ignored in groupby
    334     data[column_name_bin].fillna("Missing", inplace=True)

(...)

TypeError: add_categories() got an unexpected keyword argument 'inplace'

Pandas 2.0 removed the deprecated inplace argument from the categorical methods, including add_categories(): https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html
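For reference, a minimal sketch of how the failing step could be written without the inplace argument, so that it runs on both pandas 1.x and 2.x. The helper name add_missing_bin and the example column are hypothetical, and this is not necessarily how cobra ends up fixing it:

```python
import pandas as pd


def add_missing_bin(data: pd.DataFrame, column_name_bin: str,
                    label: str = "Missing") -> pd.DataFrame:
    """Add an extra category for nulls and fill them with it.

    Hypothetical helper: it reassigns the column instead of relying on
    the `inplace` argument that pandas 2.0 removed from add_categories().
    """
    if data[column_name_bin].isnull().sum() > 0:
        data[column_name_bin] = (
            data[column_name_bin]
            .cat.add_categories([label])  # returns a new Series
            .fillna(label)                # label is now a valid category
        )
    return data


# Usage with a made-up categorical column containing one null
df = pd.DataFrame({"age_bin": pd.Categorical(["0-18", None, "19-65"])})
df = add_missing_bin(df, "age_bin")
```

The methods already returned a new object by default in pandas 1.x, so reassigning the result should behave the same on both major versions.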

For short-term support, add a pandas<2.0.0 constraint to requirements.txt and mention it in readme.md, so that cobra users do not run into this error.
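Since the goal is to keep pandas 2.0.0 itself out, the bound has to be exclusive. A rough sketch of the intended check, using a hypothetical helper that only looks at the major version and is not a full PEP 440 parser:

```python
def pandas_version_is_supported(version: str) -> bool:
    """Return True for pandas releases below 2.0.0.

    Simplified illustration: only the major version number is inspected.
    """
    major = int(version.split(".")[0])
    return major < 2


# 1.x releases pass, 2.0.0 and later do not
supported = [v for v in ["1.3.5", "1.5.3", "2.0.0", "2.0.1"]
             if pandas_version_is_supported(v)]
```

With an inclusive bound on 2.0.0, pip or conda could still resolve to pandas 2.0.0, which is the version that raises the TypeError above.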

sandervh14 commented 1 year ago

This issue covers short-term support only. Full pandas 2.0 support can be handled in the longer term in issue #160.

ZlaTanskY commented 1 year ago

If this also affects cobra users who install with pip install pythonpredictions-cobra, then the pandas version should be limited in setup.py as well. requirements.txt and conda_env.yml are only used by cobra developers.
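A sketch of what the setup.py side could look like. Only the package name and the pandas upper bound come from this thread; the variable and function names are hypothetical, and the real setup.py has more metadata and dependencies:

```python
# Hypothetical excerpt of setup.py, limited to the pandas constraint.
INSTALL_REQUIRES = [
    "pandas<2.0.0",  # pandas 2.0 removed the `inplace` argument cobra uses
]


def build_setup_kwargs() -> dict:
    """Collect the keyword arguments that would be passed to setuptools.setup()."""
    return {
        "name": "pythonpredictions-cobra",
        "install_requires": INSTALL_REQUIRES,
    }


# In the actual setup.py this would be followed by:
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

pip reads install_requires from the package metadata, so this is the constraint that reaches users installing from PyPI.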