mwouts / itables

Pandas DataFrames as Interactive DataTables
https://mwouts.github.io/itables/
MIT License

Conditional Filtering #231

Closed on46zohu closed 6 months ago

on46zohu commented 7 months ago

As far as I can see, it is only possible to filter the table by searching for a specific value, right?

Or can I somehow filter a column with a condition such as > 1?

mwouts commented 7 months ago

Hi @on46zohu, right now the search options only search through text. Still, you can have a search field per column, like in this example.

Outside of itables (i.e. this does not work today, but if you know enough JavaScript there might be something to do), I see this example in the datatables.net documentation: image

I am also aware of the Column Filters in DT that are very close to what I imagine you're looking for (but I have no idea how they do that!): image

mwouts commented 6 months ago

Hi @on46zohu , now in itables==2.0 we have the search builder extension: https://mwouts.github.io/itables/extensions.html#searchbuilder

Would you like to give it a try?
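For readers who land here, a minimal sketch of what a conditional filter such as "> 1" could look like with the SearchBuilder extension. The `layout` and `searchBuilder` arguments follow the extensions page linked above; the column name `x`, the sample data and the helper `show_with_search_builder` are made up for illustration, and the exact form of the predefined criterion is an assumption based on the DataTables SearchBuilder options.

```python
# Keyword arguments for itables.show(): add the SearchBuilder control above
# the table and pre-fill one criterion, "x > 1".
# The column name "x" is hypothetical.
search_builder_kwargs = {
    "layout": {"top1": "searchBuilder"},
    "searchBuilder": {
        "preDefined": {
            "criteria": [{"data": "x", "condition": ">", "value": [1]}]
        }
    },
}


def show_with_search_builder(df):
    """Render df with the SearchBuilder control (requires itables>=2.0)."""
    # Imported lazily so that the options above can be inspected without itables
    from itables import show

    show(df, **search_builder_kwargs)
```

In a notebook, this would be called as `show_with_search_builder(pd.DataFrame({"x": [0.5, 1.5, 2.5]}))`. The `preDefined` part is optional: with `layout={"top1": "searchBuilder"}` alone, the conditions can be built interactively.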

on46zohu commented 6 months ago

Hi, I'm sorry that I could not respond to your earlier message.

I will try this one asap, and review!

on46zohu commented 6 months ago

I tried the new functionality, and it is exactly what I needed!

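I don't know if I should create another issue, but here are 2 points I've noticed:

If you activate the new filtering buttons together with the "Copy" button and "keys=True" options, it throws an error. Maybe you want to check and confirm.

For me, opt.maxBytes is not helping to understand how big my dataframe to view is, as the output of this does not change with varying size of tables (...)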

Anyway, what I've observed (at least with our large datasets) is that show() runs too slowly and eventually crashes with DataFrames larger than 20M cells (which is equivalent to ~15MB). Is there any option to cope with this performance issue other than downsampling?

Thanks!

mwouts commented 6 months ago

Hey @on46zohu , thanks for the feedback! I am glad you like the extensions!

> If you activate the new filtering buttons together with the "Copy" button and "keys=True" options, it throws an error. Maybe you want to check and confirm.

Yes please open a new issue and provide an example, I will look into this.

> For me, opt.maxBytes is not helping to understand how big my dataframe to view is, as the output of this does not change with varying size of tables (...)

I see. I can try to improve the down-sampling message and add the estimated size of the original table, and possibly also revisit the documentation of the down-sampling help page. Does it help if I say that maxBytes is the maximum amount of data that is passed to DataTables?

> Anyway, what I've observed (at least with our large datasets) is that show() runs too slowly and eventually crashes with DataFrames larger than 20M cells (which is equivalent to ~15MB). Is there any option to cope with this performance issue other than downsampling?

This is because all the data that is passed to DataTables is inserted (in JSON format) into the HTML document, and your browser cannot work with very large HTML documents. I'll see how I can improve the https://mwouts.github.io/itables/downsampling.html page and do a better job at explaining this. But in short, yes, there is a limitation on the size of the tables that you can render in full with ITables.
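To make the size argument concrete, here is a rough sketch that measures how many bytes a table occupies once serialized to JSON, which is the form in which it ends up inside the HTML document. This is only an illustration: the helper `json_payload_bytes` and the 64 KB budget are assumptions, and this is not necessarily how ITables itself estimates the table size for `maxBytes`.

```python
import json


def json_payload_bytes(rows):
    """Return the size in bytes of the rows once serialized to JSON (UTF-8)."""
    return len(json.dumps(rows).encode("utf-8"))


# A small made-up table: 1000 rows of [int, float, str]
rows = [[i, i * 0.5, f"row {i}"] for i in range(1000)]

size = json_payload_bytes(rows)
budget = 64 * 1024  # hypothetical budget in bytes, for illustration only

print(f"JSON payload: {size} bytes, fits in budget: {size <= budget}")
```

The payload grows linearly with the number of cells, so a table with millions of cells translates into an HTML document of many megabytes that the browser has to parse in one go.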

To remove that limitation, we would have to use the server-side processing capability of DataTables and let DataTables query only the data that is being displayed. However, that is neither implemented nor planned at the moment, and it would come with limitations of its own, e.g. a server would be required at all times.
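As a sketch of what server-side processing involves (again, not something ITables implements): for every page the user views, DataTables sends a request with the parameters `draw`, `start` and `length`, and expects a JSON response that holds only the rows of that page plus the total row counts. The function name `server_side_page` and the in-memory `rows` list are hypothetical; a real server would also handle the search and ordering parameters.

```python
def server_side_page(rows, draw, start, length):
    """Build the response for one DataTables server-side request.

    rows   -- the full table, kept on the server (list of lists)
    draw   -- request counter sent by DataTables, echoed back unchanged
    start  -- index of the first row to display
    length -- number of rows per page
    """
    page = rows[start:start + length]
    return {
        "draw": draw,
        "recordsTotal": len(rows),
        # No filtering in this sketch, so both counts are identical
        "recordsFiltered": len(rows),
        "data": page,
    }


# Made-up table with 100 rows; the browser only receives rows 20 to 29
rows = [[i, f"row {i}"] for i in range(100)]
response = server_side_page(rows, draw=1, start=20, length=10)
```

Only `length` rows travel to the browser per request, which is why the table size stops being a constraint, at the cost of keeping a server running for as long as the table is displayed.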