import mars
mars.new_session(backend='ray')
import mars.tensor as mt
import mars.dataframe as md
df = md.DataFrame(
    mt.random.rand(10_000_000, 4),
    columns=list('abcd'))
# Convert mars dataframe to ray dataset
ds = md.to_ray_dataset(df)
print(ds.schema(), ds.count())
ds.filter(lambda row: row["a"] > 0.5).show(5)
# Convert ray dataset to mars dataframe
df2 = md.read_ray_dataset(ds)
print(df2.head(5).execute())
output
Traceback (most recent call last):
  File "/home/admin/Work/mars/mars/dataframe/datasource/read_raydataset.py", line 120, in read_ray_dataset
    from ray.data.impl.pandas_block import PandasBlockSchema
ModuleNotFoundError: No module named 'ray.data.impl'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/admin/Work/mars/t1.py", line 15, in <module>
    df2 = md.read_ray_dataset(ds)
  File "/home/admin/Work/mars/mars/dataframe/datasource/read_raydataset.py", line 129, in read_ray_dataset
    dtypes = schema.empty_table().to_pandas().dtypes
AttributeError: 'PandasBlockSchema' object has no attribute 'empty_table'
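Describe the bug
With Ray == 2.1.0, md.read_ray_dataset fails when converting a Ray dataset back to a Mars dataframe. The import of PandasBlockSchema from ray.data.impl.pandas_block raises ModuleNotFoundError, and while handling that exception read_ray_dataset calls schema.empty_table(), which raises AttributeError because the 'PandasBlockSchema' object has no attribute 'empty_table'.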
To Reproduce
To help us reproduce this bug, please provide the information below:
Your Python version
The version of Mars you use
Versions of crucial packages, such as numpy, scipy and pandas: Ray == 2.1.0
Full stack of the error.
Minimized code to reproduce the error.
Expected behavior
A clear and concise description of what you expected to happen.
Additional context
Add any other context about the problem here.
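A possible direction for a workaround, sketched under assumptions rather than taken from the Mars code base: the traceback suggests that the ray.data.impl package is not importable in Ray 2.1.0 and that the schema object returned for pandas blocks has no empty_table method. The helpers below (import_first and schema_to_dtypes are hypothetical names) try several module paths in turn and derive the column dtypes by duck typing. The path ray.data._internal.pandas_block and the names/types fields on PandasBlockSchema are assumptions about Ray's internals that should be checked against the installed version. FakePandasBlockSchema is only a stand-in so the sketch runs without Ray.

```python
import importlib
from collections import namedtuple

# Candidate locations of PandasBlockSchema. The second path is an assumption
# about where newer Ray releases keep the class; verify it for your version.
CANDIDATE_MODULES = [
    "ray.data.impl.pandas_block",
    "ray.data._internal.pandas_block",
]


def import_first(module_paths, attr_name):
    """Return attr_name from the first importable module, or None."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:  # also covers ModuleNotFoundError
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    return None


def schema_to_dtypes(schema):
    """Map column names to dtypes for either kind of schema object."""
    if hasattr(schema, "empty_table"):
        # Arrow-style schema: same call chain as in the traceback.
        return schema.empty_table().to_pandas().dtypes.to_dict()
    if hasattr(schema, "names") and hasattr(schema, "types"):
        # Assumed shape of PandasBlockSchema: parallel names/types sequences.
        return dict(zip(schema.names, schema.types))
    raise TypeError(f"unsupported schema type: {type(schema).__name__}")


# Hypothetical stand-in for Ray's PandasBlockSchema, for illustration only.
FakePandasBlockSchema = namedtuple("FakePandasBlockSchema", ["names", "types"])

example = FakePandasBlockSchema(names=list("abcd"), types=["float64"] * 4)
print(schema_to_dtypes(example))
```

Inside read_ray_dataset, something like import_first(CANDIDATE_MODULES, "PandasBlockSchema") could replace the hard-coded import, and schema_to_dtypes(ds.schema()) could replace the direct call to empty_table(). Both are untested against a real Ray installation.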