MarcoGorelli / polars-upgrade

Automatically upgrade your Polars code to use the latest syntax available
MIT License

groupby ==> group_by mapping #16

Open ghuls opened 5 months ago

ghuls commented 5 months ago

groupby ==> group_by mapping is missing:

diff --git a/polars_upgrade/_plugins/renamed_dataframe_methods.py b/polars_upgrade/_plugins/renamed_dataframe_methods.py
index 3a3dfd3..3328a60 100644
--- a/polars_upgrade/_plugins/renamed_dataframe_methods.py
+++ b/polars_upgrade/_plugins/renamed_dataframe_methods.py
@@ -26,6 +26,7 @@ def rename(

 RENAMINGS = {
+    "groupby": ((0, 19, 0), "group_by"),
     "groupby_dynamic": ((0, 19, 0), "group_by_dynamic"),
     "groupby_rolling": ((0, 19, 0), "rolling"),
 }

The only problem with this patch is that it causes the pandas test to fail:

======================================================================== FAILURES =========================================================================
___________________________________________________________________ test_library_pandas ___________________________________________________________________

    def test_library_pandas() -> None:
        src = """\
    df = (pd.concat(my_list).groupby('ITEM_ID').apply(lambda x: np.any(x['VALUE'].notnull())).\
    reset_index().rename(columns={0: 'HAS_DATA'}))
    """
        settings = Settings(target_version=(0, 20, 10))
        result = rewrite(src, settings=settings, aliases={})
>       assert result == src
E       assert "df = (pd.con...AS_DATA'}))\n" == "df = (pd.con...AS_DATA'}))\n"
E         
E         - df = (pd.concat(my_list).groupby('ITEM_ID').apply(lambda x: np.any(x['VALUE'].notnull())).reset_index().rename(columns={0: 'HAS_DATA'}))
E         + df = (pd.concat(my_list).group_by('ITEM_ID').apply(lambda x: np.any(x['VALUE'].notnull())).reset_index().rename(columns={0: 'HAS_DATA'}))
E         ?                               +

tests/library_test.py:61: AssertionError
================================================================= short test summary info =================================================================
FAILED tests/library_test.py::test_library_pandas - assert "df = (pd.con...AS_DATA'}))\n" == "df = (pd.con...AS_DATA'}))\n"
============================================================== 1 failed, 79 passed in 0.21s ===============================================================

MarcoGorelli commented 5 months ago

hey - yeah, I left that one out as it can easily be confused with the pandas one

🤔 maybe we could enable some riskier ones behind a --no-pandas flag or something like that?
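
A rough sketch of how such an opt-in could look, assuming the risky renamings live in a separate table that is only merged in when the flag is set. The names `SAFE_RENAMINGS`, `RISKY_RENAMINGS`, `active_renamings` and the `no_pandas` parameter are hypothetical and are not part of polars-upgrade today; only the table entries come from the patch above.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]

# Renamings that are unambiguous for Polars code (entries from the patch above).
SAFE_RENAMINGS: Dict[str, Tuple[Version, str]] = {
    "groupby_dynamic": ((0, 19, 0), "group_by_dynamic"),
    "groupby_rolling": ((0, 19, 0), "rolling"),
}

# Renamings whose old name also exists in pandas, so they are opt-in only.
# (Hypothetical split: the project does not currently have this table.)
RISKY_RENAMINGS: Dict[str, Tuple[Version, str]] = {
    "groupby": ((0, 19, 0), "group_by"),
}


def active_renamings(target_version: Version, no_pandas: bool = False) -> Dict[str, str]:
    """Return old-name -> new-name pairs that apply for the target version.

    `no_pandas` stands in for a hypothetical --no-pandas flag: the user
    asserts the code base contains no pandas code, so risky renamings are safe.
    """
    table = dict(SAFE_RENAMINGS)
    if no_pandas:
        table.update(RISKY_RENAMINGS)
    return {
        old: new
        for old, (min_version, new) in table.items()
        if target_version >= min_version
    }
```

With the default settings the pandas test would keep passing, because `groupby` is only rewritten when the caller explicitly opts in.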

ghuls commented 5 months ago

Yeah, an option to enable this behavior (or to print some warnings about groupby) would be great.
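
For the warning variant, a minimal sketch using only the standard-library `ast` module could report where `.groupby(...)` is called without rewriting anything. The helper name `find_groupby_calls` is made up for illustration, and this is not how polars-upgrade is implemented; it cannot tell Polars from pandas either, it just flags candidates for a human to check.

```python
import ast
from typing import List


def find_groupby_calls(src: str) -> List[int]:
    """Return the sorted line numbers of every `.groupby(...)` method call."""
    lines = []
    for node in ast.walk(ast.parse(src)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "groupby"
        ):
            # end_lineno of the attribute is the line where `groupby` is written,
            # which is more useful than the start of a multi-line method chain.
            lines.append(node.func.end_lineno)
    return sorted(lines)


def warn_about_groupby(src: str, filename: str = "<string>") -> None:
    """Print one warning per `.groupby(...)` call found in the source."""
    for lineno in find_groupby_calls(src):
        print(f"{filename}:{lineno}: 'groupby' may need renaming to 'group_by' if this is Polars code")
```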