gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0

Allow overriding the 5000 data rows limit for Altair visualizations #2797

Closed freddyaboulton closed 2 months ago

freddyaboulton commented 1 year ago

Is your feature request related to a problem? Please describe.
Altair will throw an error if the dataframe has more than 5000 rows: https://altair-viz.github.io/user_guide/faq.html#maxrowserror-how-can-i-plot-large-datasets

I think this limitation is reasonable, as every row of the dataset gets mapped to a mark on the chart, and visualizing more than 5000 points on a scatterplot, for example, will probably not look good anyway.

That being said, we should decide what should happen when someone inevitably passes more than 5000 rows to our visualization components.

Options include:


abidlabs commented 1 year ago

Similar to Altair itself, I think we should raise an error explaining that a very large webpage will be created, but also give users a simple way to override it if they know what they're doing.
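
To make the "raise an error, but allow an override" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `check_row_limit`, `TooManyRowsError`, and the `max_rows` parameter are hypothetical names, not part of Gradio or Altair. For comparison, Altair's own opt-out is `alt.data_transformers.disable_max_rows()`.

```python
# Hypothetical sketch: none of these names exist in Gradio or Altair.
ALTAIR_DEFAULT_MAX_ROWS = 5000  # Altair's default limit, per the FAQ linked in the issue


class TooManyRowsError(ValueError):
    """Raised when the data has more rows than the configured limit."""


def check_row_limit(rows, max_rows=ALTAIR_DEFAULT_MAX_ROWS):
    """Return `rows` unchanged, or raise if it has more than `max_rows` entries.

    Passing max_rows=None disables the check (the "I know what I'm doing" path).
    """
    if max_rows is not None and len(rows) > max_rows:
        raise TooManyRowsError(
            f"Got {len(rows)} rows, but the limit is {max_rows}. "
            "Plotting every row embeds all of the data in the page, "
            "which can make it very large. "
            "Pass max_rows=None to plot anyway."
        )
    return rows


if __name__ == "__main__":
    large = list(range(6000))

    # Default behavior: an informative error.
    try:
        check_row_limit(large)
    except TooManyRowsError as err:
        print(f"Refused: {err}")

    # Explicit override: the data passes through untouched.
    print(len(check_row_limit(large, max_rows=None)))
```

Defaulting to an error and making the override an explicit argument keeps the safe behavior as the default while still leaving a one-line escape hatch.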

venkatavamsidama commented 12 months ago

@abidlabs
Handling the Altair limitation on the number of rows in a dataframe is indeed important. Given the options you've mentioned, I'd suggest combining both to give users flexibility: 1) Suppress the Altair error and proceed. 2) Raise an informative error from Gradio.

freddyaboulton commented 12 months ago

I think you may have mistyped, @venkatavamsidama. You said the same thing twice 😅