larray-project / larray

N-dimensional labelled arrays in Python
https://larray.readthedocs.io/
GNU General Public License v3.0

Test performance of PyTables 3.8+ #1044

Open gdementen opened 1 year ago

gdementen commented 1 year ago

I do not have super high hopes though, as only the Table object was optimized, and we don't use it most of the time. Pandas uses the "fixed" format by default, which used to be faster than the "table" format. The question is whether the new optimization makes the "table" format faster than the "fixed" format. If we keep loading the full array, I doubt it, and I am sure it will not beat the performance I got from my feather experiments (#1016). But the "table" format could be useful for lazy arrays once we have implemented lazy sessions (#727).

https://www.blosc.org/posts/blosc2-pytables-perf/
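
A minimal sketch of how the "fixed" vs "table" comparison could be timed through pandas, assuming PyTables is installed. The helper names (`time_hdf_roundtrip`, `compare_formats`), the array shape and the key `"data"` are made up for illustration and are not part of larray. It always loads the full array and uses pandas' default settings, so it does not enable any Blosc2 compression.

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd  # HDF5 support requires the PyTables package ("tables")


def time_hdf_roundtrip(df, fmt, repeats=3):
    """Return the best (write_seconds, read_seconds) for one HDF5 format.

    `fmt` is "fixed" or "table"; this is a hypothetical helper, not larray API.
    """
    best_write = best_read = float("inf")
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, f"bench_{fmt}.h5")
        for _ in range(repeats):
            # Time writing the whole frame (mode="w" overwrites the file).
            start = time.perf_counter()
            df.to_hdf(path, key="data", mode="w", format=fmt)
            best_write = min(best_write, time.perf_counter() - start)

            # Time loading the whole frame back into memory.
            start = time.perf_counter()
            loaded = pd.read_hdf(path, key="data")
            best_read = min(best_read, time.perf_counter() - start)

            # Sanity check: the round trip must preserve the shape.
            assert loaded.shape == df.shape
    return best_write, best_read


def compare_formats(df, repeats=3):
    """Time both pandas HDF5 formats on the same frame."""
    return {fmt: time_hdf_roundtrip(df, fmt, repeats)
            for fmt in ("fixed", "table")}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Arbitrary test data: 100,000 rows of 10 float columns.
    df = pd.DataFrame(rng.random((100_000, 10)),
                      columns=[f"c{i}" for i in range(10)])
    for fmt, (write_s, read_s) in compare_formats(df).items():
        print(f"{fmt:>5}: write {write_s:.3f}s, read {read_s:.3f}s")
```

The lazy-array use case would need a different measurement: the "table" format supports partial reads through the `where` argument of `pd.read_hdf`, which this full-load benchmark does not exercise.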