ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[AIR] SettingWithCopyWarning for "A value is trying to be set on a copy of a slice from a DataFrame" #27352

Open jiaodong opened 2 years ago

jiaodong commented 2 years ago

What happened + What you expected to happen

On commit https://github.com/ray-project/ray/commit/a598458c464b88535e711ef7ef55f88e25c1820f, running Step 5 of the incremental learning example (https://docs.ray.io/en/master/ray-air/examples/torch_incremental_learning.html#step-5-putting-it-all-together) emits this warning.

This is likely because somewhere, either internally in AIR or in the notebook, we assign values to a pandas DataFrame without using `.loc` syntax.
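
For reference, here is a minimal sketch of the pattern that usually triggers this warning and the two common ways to avoid it. The DataFrame and column names are made up for illustration; they are not taken from the AIR code or the notebook, and the actual offending line has not been located yet.

```python
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame({"label": [0, 1, 0, 1], "score": [0.1, 0.9, 0.4, 0.7]})

# Pattern that typically triggers SettingWithCopyWarning (left commented out):
# filter first, then assign on the result. pandas cannot tell whether the
# intent was to modify `df` or an independent copy.
#   subset = df[df["label"] == 1]
#   subset["score"] = 1.0

# Fix 1: if the intent is to modify the original frame, select rows and
# column in a single .loc call.
df.loc[df["label"] == 1, "score"] = 1.0

# Fix 2: if the intent is an independent frame, take an explicit copy
# before assigning.
subset = df[df["label"] == 0].copy()
subset["score"] = 0.0
```

Which fix applies depends on whether the code in question means to mutate the original DataFrame or work on a separate one.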

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

According to the same docs section, doing the assignment correctly may also give us a performance win:

"This allows pandas to deal with this as a single entity. Furthermore this order of operations can be significantly faster, and allows one to index both axes if so desired."

Versions / Dependencies

a598458c464b88535e711ef7ef55f88e25c1820f

Reproduction script

See the steps above.

Issue Severity

Low: It annoys or frustrates me.

matthewdeng commented 2 years ago

@jiaodong do you know where in the code this is being emitted? Is it in the predictor, in Datasets, or in user code?