radio-astro-tools / casa-formats-io

Code to handle I/O from/to data in CASA format
Other
10 stars 7 forks source link

Speed up Table.__repr__ and potentially other forms of data access #38

Closed astrofrog closed 2 years ago

astrofrog commented 2 years ago

This speeds up the Table.__repr__ by avoiding repeated calls to CasaArrayWrapper.__getitem__ with the same arguments (which is not actually cached by dask, see https://github.com/dask/dask/issues/8420). For the particular example I was trying this speeds things up from 20 seconds to 5 seconds. To speed this up further, I will need to try and profile the dask data access in detail, but at least this was a low hanging fruit improvement, and 5 seconds is not too bad considering the size of the tables we are accessing.

This goes towards addressing https://github.com/radio-astro-tools/casa-formats-io/issues/36

codecov-commenter commented 2 years ago

Codecov Report

Merging #38 (c5a5739) into main (95403b8) will decrease coverage by 0.05%. The diff coverage is 22.22%.

:exclamation: Current head c5a5739 differs from pull request most recent head b7112db. Consider uploading reports for the commit b7112db to get more accurate results Impacted file tree graph

@@            Coverage Diff             @@
##             main      #38      +/-   ##
==========================================
- Coverage   44.94%   44.89%   -0.06%     
==========================================
  Files          17       17              
  Lines        2118     2125       +7     
==========================================
+ Hits          952      954       +2     
- Misses       1166     1171       +5     
Impacted Files Coverage Δ
casa_formats_io/casa_dask.py 22.30% <22.22%> (+0.35%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 95403b8...b7112db. Read the comment docs.

astrofrog commented 2 years ago

Since this is reasonably straightforward and to make it easier to try out, I'll go ahead and merge.