Open Liquidmasl opened 1 month ago
Hi @Liquidmasl
Modin has these default values because they help achieve good performance in general. If you have a specific case and Modin's configuration variables don't help you, you can initialize ray yourself.
I see. I understand that my experience is by no means representative of everyone's. But with these defaults I had numerous bluescreens, freezes and crashes, all of which made debugging and figuring this out a lot more troublesome than necessary.
I did not want to initialize ray myself precisely because I thought modin would know best, but it gave me no option to adapt just the two values that lead to issues for me (_memory and include_dashboard).
If you think the current defaults work fine most of the time and my situation is an outlier, fair enough!
I still think introducing config params or env vars that give the option to set _memory, object_store_memory and include_dashboard manually while still relying on modin's ray initialisation would be good.
As I understand it, it's a relatively new feature of modin that it initialises ray itself, so maybe there will be some changes along the way anyway. For now, now that I understand that, it's fine to initialize ray manually.
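For anyone landing here with the same problem, this is a minimal sketch of the manual initialization described above. The helper names (build_ray_init_kwargs, start_ray_for_modin) and the 16 GiB / 4 GiB figures are my own illustrative assumptions, not Modin defaults or recommendations; pick values that fit your machine. It relies on ray.init() accepting _memory, object_store_memory and include_dashboard, and on Modin reusing a Ray instance that is already running.

```python
GIB = 1024 ** 3  # bytes per GiB


def build_ray_init_kwargs(memory_bytes, object_store_bytes, include_dashboard=True):
    """Build ray.init() keyword arguments with the two memory limits set separately.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    if memory_bytes <= 0 or object_store_bytes <= 0:
        raise ValueError("memory sizes must be positive byte counts")
    return {
        "_memory": int(memory_bytes),                    # reservable worker memory
        "object_store_memory": int(object_store_bytes),  # shared object store
        "include_dashboard": include_dashboard,
    }


def start_ray_for_modin(ray_kwargs):
    """Start Ray with the given settings, then point Modin at the Ray engine.

    Imports are local so the helper above stays usable without ray/modin installed.
    """
    import ray
    import modin.config as cfg

    if not ray.is_initialized():
        ray.init(**ray_kwargs)
    cfg.Engine.put("ray")


# Illustrative values only (assumption): _memory well above the object store size.
RAY_KWARGS = build_ray_init_kwargs(16 * GIB, 4 * GIB)

# Usage (requires ray and modin to be installed):
# start_ray_for_modin(RAY_KWARGS)
# import modin.pandas as pd
```

The point is only that the two limits are passed independently instead of being tied to one value, and that the dashboard flag stays under your control.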
Modin version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest released version of Modin.
[ ] I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)
Reproducible Example
Issue Description
modin sets _memory and object_store_memory to the same value. This not only leads to instability and crashes, but also reduces flexibility, as _memory can be set to a value higher than the shared memory while object_store_memory cannot.
A lot of the issues I faced over the last few days with read_parquet() (although this still fills up RAM until my PC crashes), to_parquet(), concat(), etc. stemmed from the fact that when the object store was full and a spill was attempted, a write violation happened and a raylet died.
I noticed that modin runs a lot more stably when ray.init() is called manually. This is because in that case the two values are not set to the same value by default.
Also, it would be great if the ray dashboard were not disabled by default with no way to enable it when initialising through modin. But I digress.
Expected Behavior
If no manual configuration was done and no env variables were set, the default ray init should be used. And if not the default, then not something this debilitating.
After initializing ray manually and just setting _memory to something way larger, stuff just started working. Setting MODIN_MEMORY to something higher when using modin's initialisation did not work, because it led to a value error from Ray stating that object_store_memory can't be set that high (even though I never cared about object_store_memory).
Error Logs
Installed Versions