Open vnlitvinov opened 3 years ago
cc @YarShev
ray.get() allows to deserialize data with zero-copy for primitive data types (if an object supports pickle protocol 5). Then, we intentionally make a copy of data so the user can mutate it so I don't think we can do anything on this matter to speed up to_pandas. I think we can close the issue. @anmyachev, @dchigarev, thoughts?
Since Ray uses pickle protocol 5 it should be possible to deserialize partitions in-place when cooking the joined dataframe instead of copying them around.