huggingface / dataset-viewer

Lightweight web API for visualizing and exploring any dataset - computer vision, speech, text, and tabular - stored on the Hugging Face Hub
https://huggingface.co/docs/datasets-server
Apache License 2.0
660 stars, 68 forks

ComputationError (ZeroDivisionError) on split-descriptive-statistics #2631

Open severo opened 4 months ago

severo commented 4 months ago

For mozilla-foundation/common_voice_6_1 / dv / other, we currently get a ComputationError caused by a ZeroDivisionError. cc @polinaeterna

polinaeterna commented 4 months ago

it seems to be a parquet file with 0 rows

polinaeterna commented 4 months ago

...so it divides by zero when calculating the proportion lol. Not sure what kind of response should be there. This seems to be an extremely rare case.

severo commented 4 months ago

OK. Anyway, I think we should handle the case, for example by returning "null"/"None" when there are 0 rows.
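
As a rough illustration of that idea (not the actual dataset-viewer code), a guard like the one below would return None instead of raising ZeroDivisionError when a split has 0 rows. The names `safe_proportion` and `nan_stats` and the response keys are hypothetical and only serve the sketch:

```python
from typing import Any, Optional


def safe_proportion(count: int, total: int) -> Optional[float]:
    """Return count / total, or None when total is 0 (e.g. an empty split)."""
    if total == 0:
        # Nothing to divide by: report "no value" instead of raising.
        return None
    return count / total


def nan_stats(nan_count: int, n_samples: int) -> dict[str, Any]:
    """Hypothetical per-column summary; the key names are assumptions."""
    return {
        "nan_count": nan_count,
        "nan_proportion": safe_proportion(nan_count, n_samples),
    }


# An empty split yields None (serialized as null in JSON) rather than an error.
print(nan_stats(nan_count=0, n_samples=0))
```

Whether None/null is the right value for clients, as opposed to, say, skipping the statistics entirely, is still the open question here.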

polinaeterna commented 4 months ago

> I think we should handle the case

Agreed. Should we make changes in other steps too? For example, I don't think it makes sense to create parquet files or duckdb index files in such cases.

severo commented 4 months ago

Yes, you're right. Maybe we want to handle the "no rows" case in a global PR.