modin-project / modin

Modin: Scale your Pandas workflows by changing a single line of code
http://modin.readthedocs.io
Apache License 2.0

modin with ray engine hang #7349

Open cometta opened 2 months ago

cometta commented 2 months ago

My code hangs. Can you advise what I am missing?

Versions: modin 0.31.0, modin-spreadsheet 0.1.2, ray 2.32.0

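import argparse
import modin.pandas as pd
import os

os.environ["MODIN_ENGINE"] = "ray"
os.environ["RAY_memory_monitor_refresh_ms"] = "0"

# Parser setup was not shown in the original post; a single --path argument is assumed.
parser = argparse.ArgumentParser()
parser.add_argument("--path", required=True)
args = parser.parse_args()
print("1")
df = pd.read_parquet(args.path)  # hangs at this line
print("2")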

output

1
2024-07-22 15:04:45,809 INFO worker.py:1788 -- Started a local Ray instance.
(raylet) [2024-07-22 15:06:45,793 E 1438 1438] (raylet) node_manager.cc:3064: 1 Workers (tasks / actors) killed due to memory pressure (OOM), 0 Workers crashed due to other reasons at node (ID: 82849b5e133875b55fbd974d5392b702c2705b5dc16c6c8ca24aaead, IP: x.x.x.x) over the last time period. To see more information about the Workers killed on this node, use `ray logs raylet.out -ip x.x.x.x`
(raylet)
(raylet) Refer to the documentation on how to address the out of memory issue: https://docs.ray.io/en/latest/ray-core/scheduling/ray-oom-prevention.html. Consider provisioning more memory on this node or reducing task parallelism by requesting more CPUs per task. To adjust the kill threshold, set the environment variable `RAY_memory_usage_threshold` when starting Ray. To disable worker killing, set the environment variable `RAY_memory_monitor_refresh_ms` to zero.

Extra info: I see that memory usage is maxed out at 64 GB. This might be related and could be what causes the hang. Is there an option I need to set in Modin?

devin-petersohn commented 2 months ago

Hi @cometta, welcome!

Are you on macOS by chance? Object spilling is handled a little differently on macOS.

cometta commented 2 months ago

@devin-petersohn, I tested on an Intel VM running Ubuntu.

devin-petersohn commented 2 months ago

OK, great. In this case it should be straightforward: use the MODIN_MEMORY environment variable or the equivalent config option. How to use these is described here: https://modin.readthedocs.io/en/stable/flow/modin/config.html#modin-configs-list

Let me know if that helps!
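
For reference, here is a minimal sketch of the two ways to apply that setting. The helper name `configure_modin_env` and the 30 GB value are only illustrative assumptions, not a recommendation. The environment variables should be set before Modin starts its engine, and as far as I understand the value mainly sizes the Ray object store rather than capping total process memory, so treat it as a starting point.

```python
import os


def configure_modin_env(memory_bytes: int, engine: str = "ray") -> dict:
    """Export Modin settings as environment variables (hypothetical helper)."""
    settings = {
        "MODIN_ENGINE": engine,
        "MODIN_MEMORY": str(memory_bytes),  # the value is given in bytes
    }
    os.environ.update(settings)
    return settings


# Illustrative value only: roughly 30 GB.
applied = configure_modin_env(30_000_000_000)
print(applied)

# Same setting through the config API, skipped when Modin is not installed.
try:
    import modin.config as cfg
except ImportError:
    cfg = None

if cfg is not None:
    cfg.Memory.put(30_000_000_000)
```

Either route should happen before the first `modin.pandas` operation, since that is when the engine gets initialized.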

cometta commented 2 months ago

I tried os.environ["MODIN_MEMORY"] = "30000000000", but when I monitor memory, usage still goes above 64 GB, and then I get ray.exceptions.WorkerCrashedError: The worker died unexpectedly while executing this task. Check python-core-worker-*.log files for more information. Does Modin really use this much memory to load Parquet files?

YarShev commented 2 months ago

@cometta, take a look at this comment https://github.com/modin-project/modin/issues/7020#issuecomment-2003264663.