Closed: martinRenou closed this 8 months ago
I have tried this locally and I see the same dramatic speed improvements. It would be good to continue with this as it will be a good basis for experiments in filtering and sorting on the backend that I'd like to look at.
I have been working just this week to better understand binary serialization from pandas through ipywidgets to js. I think I'm going to use arrow-js. I'm hoping to publish a very rough early repo later today.
I'm currently fleshing out a small ipywidgets library for prototyping minimal examples. Keeping it small should also make it easier to collaborate with other people.
Trevor Manz and Kyle Barron have been doing work in this space too.
I'd love to collaborate with others on this.
FWIW I just pushed the first commits to the serialization playground df_cereal https://github.com/paddymul/df_cereal
I have examples of arrow-js serialization working entirely in JS. I currently can't get the Python side to send bytes or base64 to JS.
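For anyone following along, here is a minimal sketch of the base64 route mentioned above, using only the Python standard library. It is not the df_cereal code and it does not use Arrow. The function names `encode_column` and `decode_column` and the message fields `dtype`, `length`, and `data` are hypothetical, chosen only for illustration.

```python
import base64
import json
import struct


def encode_column(values):
    """Pack floats as little-endian float64 and wrap them in a JSON-safe message.

    The field names here are assumptions for this sketch, not an existing wire format.
    """
    raw = struct.pack(f"<{len(values)}d", *values)
    return json.dumps({
        "dtype": "float64",
        "length": len(values),
        # base64 turns arbitrary bytes into ASCII text that survives JSON transport
        "data": base64.b64encode(raw).decode("ascii"),
    })


def decode_column(message):
    """Reverse encode_column: parse the JSON, decode base64, unpack the floats."""
    payload = json.loads(message)
    raw = base64.b64decode(payload["data"])
    return list(struct.unpack(f"<{payload['length']}d", raw))


values = [1.5, 2.25, -3.0]
message = encode_column(values)
assert decode_column(message) == values
```

On the JS side, the same string could presumably be decoded with `atob` into a `Uint8Array` and viewed as a `Float64Array`. Base64 inflates the payload by roughly a third, so sending raw bytes as a binary buffer is preferable when the transport allows it.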
Benchmarks and more docs coming soon.
BTW I looked at what bqplot is doing. I suspect Arrow-based serialization will be much faster since it doesn't deal with JSON at all.
Thank you for reaching out @paddymul. This looks interesting!
> will be much faster
I'm a tiny bit skeptical about this. The JSON message bqplot sends is minimal in the end.
I feel like we should go ahead with this PR once it's passing all tests. After that, I'm 💯 up for continuing the discussion about a common place for better binary serialization that we can use across widgets. I don't like depending on bqplot for this, but it was already a dependency for some reason (probably a legacy dependency left over from removed code), so it's convenient to just use it for now.
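To make the point about the JSON message being minimal concrete, here is a rough sketch comparing an all-JSON payload with a small JSON header plus a raw binary buffer. It uses only the standard library and is not bqplot's actual serializer. The `dtype` and `shape` field names and both helper functions are assumptions made for illustration.

```python
import json
import struct


def as_json_only(values):
    """Encode every number as JSON text (the approach binary serialization avoids)."""
    return json.dumps({"value": values}).encode("utf-8")


def as_metadata_plus_buffer(values):
    """Encode a tiny JSON header and ship the numbers as a raw float64 buffer.

    The header fields are hypothetical; a real widget would send the buffer
    alongside the JSON message rather than inside it.
    """
    header = json.dumps({"dtype": "float64", "shape": [len(values)]}).encode("utf-8")
    buffer = struct.pack(f"<{len(values)}d", *values)
    return header, buffer


values = [i / 7 for i in range(10_000)]
json_bytes = as_json_only(values)
header, buffer = as_metadata_plus_buffer(values)

# The header stays a few dozen bytes no matter how many values are sent;
# the buffer is exactly 8 bytes per float64.
print(len(json_bytes), len(header), len(buffer))
```

With this layout the JSON part does not grow with the data, so any remaining speed difference against an Arrow-based approach would come from how the binary buffer is produced and consumed rather than from JSON parsing.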
Make ipydatagrid more performant, achieving two things:
What's remaining to make the PR ready to review:
In follow-up PRs, the following items should be resolved:
`_visible_rows` attribute