xorbitsai / xorbits

Scalable Python DS & ML, in an API compatible & lightning fast way.
https://xorbits.io
Apache License 2.0
1.06k stars 67 forks

FEAT: How to export xorbits datasets to a JSON file #761

Open simplew2011 opened 6 months ago

simplew2011 commented 6 months ago

Is your feature request related to a problem? Please describe

Describe the solution you'd like

simplew2011 commented 6 months ago

The following interfaces need to be implemented: xorbits.datasets.to_huggingface, xorbits.datasets.Dataset.from_dataframe, xorbits.datasets.export_json

simplew2011 commented 6 months ago

I need to add a new column to dataset.Dataset to record intermediate values. How should this be done? I only see __getitem__; __setitem__ is not implemented.

simplew2011 commented 6 months ago

How can a dataset.Dataset be filtered?

Something similar to huggingface.dataset: https://github.com/huggingface/datasets/blob/ef0f986518bd252c5314a7e3a419dedcbb166630/src/datasets/arrow_dataset.py#L5061

qinxuye commented 6 months ago

@codingl2k1 please take a look at this issue.

@simplew2011 would you be interested in contributing?

codingl2k1 commented 6 months ago

How can a dataset.Dataset be filtered?

Something similar to huggingface.dataset: https://github.com/huggingface/datasets/blob/ef0f986518bd252c5314a7e3a419dedcbb166630/src/datasets/arrow_dataset.py#L5061

Currently, a xorbits dataframe can be exported to CSV, Parquet, or SQL, and dataframe apply may be able to meet your needs. A xorbits dataset can map over its data and be converted to a dataframe, but filter is not implemented yet.
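To illustrate the dataframe route described above, here is a minimal sketch of adding an intermediate column, filtering rows, and writing a JSON Lines file. It uses plain pandas so that it runs on its own. Since xorbits.pandas aims to be API compatible with pandas, swapping the import may work, but that is an assumption to check against your xorbits version. The column names, the sample rows, and the length threshold are made up for the example, and the dataframe stands in for one converted from a dataset.

```python
import json
import os
import tempfile

# Plain pandas is used so the sketch is self-contained.
# Assumption: `import xorbits.pandas as pd` may behave the same way.
import pandas as pd

# Stand-in for a dataframe obtained by converting a dataset (hypothetical data).
df = pd.DataFrame(
    {
        "text": ["short", "a much longer sentence", "mid length"],
        "label": [0, 1, 0],
    }
)

# Record an intermediate value in a new column.
df["text_len"] = df["text"].str.len()

# Filter rows with a boolean mask (threshold chosen only for illustration).
filtered = df[df["text_len"] > 5]

# Export one JSON object per line; force_ascii=False keeps non-ASCII text readable.
out_path = os.path.join(tempfile.mkdtemp(), "filtered.jsonl")
filtered.to_json(out_path, orient="records", lines=True, force_ascii=False)

# Read the file back to check what was written.
with open(out_path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]

print(records)
```

Writing with orient="records" and lines=True produces one record per line, which is the layout commonly used for JSON dataset files.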

Could you provide some example code?