modin-project / modin

Modin: Scale your Pandas workflows by changing a single line of code
http://modin.readthedocs.io
Apache License 2.0

FEAT-#6574: UserWarning no longer displayed when Series/DataFrames are small #7323

Closed Jayson729 closed 1 week ago

Jayson729 commented 1 week ago

What do these changes do?

When creating a DataFrame or Series, Modin no longer displays "UserWarning: Distributing <class 'NoneType'> object. This may take some time." if the DataFrame or Series is small.
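A minimal sketch of the idea, not Modin's actual implementation: emit the warning only when the object is above some size cutoff. The function name `warn_if_large`, the `n_elements` argument, and the threshold value are all hypothetical; the description above does not say what counts as "small".

```python
import warnings

# Hypothetical cutoff: the PR description does not state the real value.
SMALL_OBJECT_THRESHOLD = 1_000


def warn_if_large(obj, n_elements, threshold=SMALL_OBJECT_THRESHOLD):
    """Warn about distribution cost only for objects above `threshold`.

    Returns True if a warning was emitted, False otherwise.
    """
    if n_elements <= threshold:
        # Small object: stay silent.
        return False
    warnings.warn(
        f"Distributing {type(obj)} object. This may take some time.",
        UserWarning,
    )
    return True
```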

Jayson729 commented 1 week ago

Apparently this breaks the test at https://github.com/modin-project/modin/blob/main/modin/tests/pandas/test_series.py#L3430-L3439, because a UserWarning is no longer emitted in this case. I think changing the test to the following would fix this while still testing the same thing, although we would have to import warnings, since it is not currently imported in that file. Does this seem right?

Relevant pull request

def test_6782():
    datetime_scalar = datetime.datetime(1970, 1, 1, 0, 0)
    match = "Adding/subtracting object-dtype array to DatetimeArray not vectorized"
    with warnings.catch_warnings():
        warnings.filterwarnings("error", match, UserWarning)
        pd.Series([datetime.datetime(2000, 1, 1)]) - datetime_scalar

anmyachev commented 1 week ago

> Does this seem right?

@Jayson729 yes, it looks great! Only the warning category should be different: PerformanceWarning instead of UserWarning. See https://github.com/pandas-dev/pandas/blob/c46fb76afaf98153b9eef97fc9bbe9077229e7cd/pandas/core/arrays/datetimelike.py#L1316

Note: it looks like test_6782 has not been working as originally intended since pandas changed the warning category.
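To illustrate why the category matters, here is a standard-library-only sketch. It assumes a stand-in `PerformanceWarning` class that, like `pandas.errors.PerformanceWarning`, derives from `Warning` rather than `UserWarning`; the helper names `emit` and `filter_raises` are hypothetical. An "error" filter registered for `UserWarning` does not match such a warning, so a test written that way would pass without checking anything.

```python
import warnings


class PerformanceWarning(Warning):
    """Hypothetical stand-in for pandas.errors.PerformanceWarning."""


MESSAGE = "Adding/subtracting object-dtype array to DatetimeArray not vectorized"


def emit():
    # Stand-in for the operation under test that triggers the warning.
    warnings.warn(MESSAGE, PerformanceWarning)


def filter_raises(category):
    """Return True if an "error" filter for `category` turns the warning into an exception."""
    with warnings.catch_warnings():
        # Silence everything first so the result does not depend on ambient filters.
        warnings.simplefilter("ignore")
        # Filters added later take precedence over the blanket "ignore" above.
        warnings.filterwarnings("error", MESSAGE, category)
        try:
            emit()
        except PerformanceWarning:
            return True
    return False
```

With these definitions, `filter_raises(UserWarning)` returns `False` because the filter never matches, while `filter_raises(PerformanceWarning)` returns `True`.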