from datasets import load_dataset
ds = load_dataset("rotten_tomatoes", split="train", streaming=True)
ds = ds.with_format("arrow").map(lambda x: x)
for ex in ds:
    pass
Fixes the bug when applying map to an arrow-formatted iterable dataset described here:
https://github.com/huggingface/datasets/issues/6833#issuecomment-2399903885
@lhoestq