kpertsch / rlds_dataset_builder

An example RLDS dataset builder for X-embodiment dataset conversion.
MIT License
33 stars, 71 forks

Parallel Mode: AssertionError: No examples were yielded. #5

Open emfebert opened 1 month ago

emfebert commented 1 month ago

Hi,

I'd like to run dataset generation in parallel using Beam, but I'm getting the following error:

    shard_boundaries = _get_shard_boundaries(num_examples, num_shards)
  File "/home/febert/anaconda3/envs/rlds_env/lib/python3.9/site-packages/tensorflow_datasets/core/writer.py", line 142, in _get_shard_boundaries
    raise AssertionError("No examples were yielded.")
AssertionError: No examples were yielded.

Dataset generation in a single process works fine.

kpertsch commented 1 month ago

For multi-threaded generation, please use the multithreading branch of the repo. You can copy your logic for parsing an example into the dataset generator there and control the level of parallelism with the N_WORKERS variable: https://github.com/kpertsch/rlds_dataset_builder/blob/d38f6d37a2572943fe59a279d53e63547e270aa0/example_dataset/example_dataset_dataset_builder.py#L68
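
The general idea in the reply above (one parsing function, fanned out over a fixed number of workers) might look roughly like the sketch below. This is not the code from the multithreading branch: `parse_example`, `generate_examples`, and the episode paths are hypothetical names for illustration, and a standard-library thread pool stands in for whatever mechanism the branch actually uses. Only the `N_WORKERS` name comes from the reply.

```python
from concurrent.futures import ThreadPoolExecutor

N_WORKERS = 4  # level of parallelism; the name is taken from the reply above


def parse_example(episode_path):
    """Hypothetical stand-in for your own per-episode parsing logic."""
    # A real builder would load the file here and build the episode dict.
    example = {"steps": [], "episode_metadata": {"file_path": episode_path}}
    return episode_path, example


def generate_examples(episode_paths, n_workers=N_WORKERS):
    """Yield (key, example) pairs, parsing episodes on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the input paths.
        for key, example in pool.map(parse_example, episode_paths):
            yield key, example


# Hypothetical episode paths, used only to exercise the generator.
paths = [f"data/train/episode_{i}.npy" for i in range(8)]
examples = list(generate_examples(paths))
print(len(examples))
```

If the generator yields nothing (for example, because the path list is empty), the writer has no examples to shard, which is the condition the AssertionError in the original report complains about. Checking `len(examples)` on a small path list like this is a quick sanity test before running the full build.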