CARTAvis / carta-backend

Source code repository for the backend component of CARTA, a new visualization tool designed for the ALMA, the VLA and the SKA pathfinders.
https://cartavis.github.io/
GNU General Public License v3.0

Fix hdf5 tile data after animation stops #1371

Closed pford closed 1 month ago

pford commented 2 months ago

Description

Checklist

pford commented 2 months ago

Not sure if this should be code reviewed or tested first since it involves copying data, which we try to avoid.

confluence commented 2 months ago

In this case I think copying is unavoidable -- the compression function should not be modifying data in the tile cache!

I do remember adding a tile pool to the tile cache to avoid constantly creating and destroying vector objects, since this was causing quite a significant slowdown. I wonder if a tile pool would be appropriate here as well (I think that we should test the performance impact of this change first).

If yes, then we could either create a new tile pool to be used by this function, or add a GetCopy function to the tile cache which performs the copy internally and uses the tile cache's own tile pool (we may then need to adjust the tile cache pool size when we start and stop the animation, but maybe the tile requirements aren't going to exceed the default size, which IIRC is enough to fit one row and one column of 2x2 tile chunks).
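To make the aliasing problem concrete, here is a minimal sketch rather than the actual CARTA code. `SketchTileCache`, its `Get`/`GetCopy` members and `CompressInPlace` are hypothetical stand-ins, assuming only that the cache hands out shared pointers to its tiles and that the compression step rewrites its input in place.

```cpp
#include <cmath>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

using TilePtr = std::shared_ptr<std::vector<float>>;

// Hypothetical stand-in for a tile cache keyed by an integer tile id.
class SketchTileCache {
public:
    void Put(int key, std::vector<float> data) {
        _tiles[key] = std::make_shared<std::vector<float>>(std::move(data));
    }

    // Returns a pointer that aliases the cached tile (no copy is made).
    TilePtr Get(int key) const {
        auto it = _tiles.find(key);
        if (it == _tiles.end()) {
            return nullptr;
        }
        return it->second;
    }

    // Returns an independent copy that the caller may modify freely.
    TilePtr GetCopy(int key) const {
        auto it = _tiles.find(key);
        if (it == _tiles.end()) {
            return nullptr;
        }
        return std::make_shared<std::vector<float>>(*it->second);
    }

private:
    std::unordered_map<int, TilePtr> _tiles;
};

// Stand-in for a processing step that rewrites its input in place
// (here it simply replaces NaN values with zero).
void CompressInPlace(std::vector<float>& tile) {
    for (float& value : tile) {
        if (std::isnan(value)) {
            value = 0.0f;
        }
    }
}
```

Passing the result of `Get` to `CompressInPlace` changes the cached tile for every later reader, while passing the result of `GetCopy` leaves the cache untouched at the cost of one allocation and one copy per tile.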

confluence commented 2 months ago

Actually, now I see that the current change is in FillRasterTileData, which means that the copy is being performed regardless of how the data was obtained in GetRasterTileData (which means that it will also affect cases where the data does not need to be copied). It's GetRasterTileData which should be modified -- there's already a new vector which is created there (for the cases which don't use the tile cache), which could be used to copy the cached data.

Perhaps adding a new tile pool in GetRasterTileData, and using it in all cases instead of creating a new tile vector, would be a good idea.

pford commented 1 month ago

@confluence I added TileCache::GetCopy. If I added a TilePool to Frame, wouldn't it have the same problem as the TileCache pool?

confluence commented 1 month ago

> @confluence I added TileCache::GetCopy.

Currently GetCopy is creating a new vector. I was originally suggesting using the tile cache's pool to recycle a vector object to use for the copy (i.e. using _pool->Pull() to get a vector pointer).

> If I added a TilePool to Frame, wouldn't it have the same problem as the TileCache pool?

I'm not sure what problem you mean here.

I think I've confused the issue by conflating multiple suggestions -- for clarity, I'm talking about two separate things:

  1. In GetRasterTileData we already create a new vector to read data into, in the cases that don't use the tile cache. I would suggest that we use this vector as well in the tile cache case, for consistency, instead of creating a vector in a different place. Perhaps the GetCopy function should be called CopyInto and take that vector as an input/output parameter -- but maybe this function isn't necessary at all, and the copy can just be inlined in GetRasterTileData. (I originally suggested the tile cache function to enable the use of the tile cache's pool, before I realised that the tile cache case could be integrated more generically with the other cases.)

  2. Additionally, as a performance improvement, I think it would be a good idea to use a tile pool for that new vector in GetRasterTileData, instead of repeatedly creating and destroying it. My updated suggestion for this is to add a separate tile pool object to the frame, to be used here in all the cases, not just the tile cache case. I haven't checked if there are other places in the frame where a new tile vector is created (which could make use of the same pool).
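A rough sketch of point 1, under stated assumptions: the `CopyInto` signature, `GetTileDataSketch` and the flat `image` buffer are all hypothetical and far simpler than the real GetRasterTileData. It only shows one caller-owned vector being filled in both the tile cache case and the other cases.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical signature: copies a cached tile into a caller-owned buffer.
// `out` is resized as needed, so a recycled buffer of any size can be passed in.
// Returns false if no cached tile is available.
bool CopyInto(const std::vector<float>* cached_tile, std::vector<float>& out) {
    if (cached_tile == nullptr) {
        return false;
    }
    out.assign(cached_tile->begin(), cached_tile->end());
    return true;
}

// Sketch of the call site: the same output vector serves both code paths.
// `image` stands in for whatever the non-cache cases read from.
bool GetTileDataSketch(const std::vector<float>* cached_tile, const std::vector<float>& image,
                       std::size_t offset, std::size_t count, std::vector<float>& tile_data) {
    if (cached_tile != nullptr) {
        // Tile cache case: copy, so that later steps never touch the cached data.
        return CopyInto(cached_tile, tile_data);
    }
    // Other cases: read the requested range directly into the same vector.
    if (offset > image.size() || count > image.size() - offset) {
        return false;
    }
    const auto first = image.begin() + static_cast<std::ptrdiff_t>(offset);
    tile_data.assign(first, first + static_cast<std::ptrdiff_t>(count));
    return true;
}
```

With this shape the copy only happens on the cache path, and swapping the plain vector for a pooled one (point 2) would not change the call site.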

confluence commented 1 month ago

@pford I was suggesting that we use only a tile pool in the frame for all the tiles, not the tile cache -- we only added the full-resolution tile cache for HDF5 files because in the HDF5 schema the main dataset is chunked, which makes it efficient to read a chunk (which is 2x2 tiles) at a time. For FITS files and other formats, reading a chunk at a time is not efficient, which is why we still use the full channel cache for those formats. I don't think it makes sense to copy tiles read from the full channel cache into the tile cache (this currently bypasses the tile cache's tile pool, and I believe that downsampled tiles are also being saved, although this cache was only written to handle full-resolution tiles).

I am suggesting only that the frame gets its own tile pool object (separate to the tile cache's tile pool), and uses that pool to recycle the tile pointers (instead of creating new pointers with newly allocated data). This wouldn't be caching any tile data; just keeping a buffer of reusable objects allocated in memory. They would be overwritten with new data every time they are used (so it would be fine to use them for all tiles in GetRasterTileData, whether they're downsampled or not).

So the frame would have a std::shared_ptr<TilePool> _pool, with some reasonable capacity (how many tiles are likely to be allocated at the same time during an animation?), and in GetRasterTileData instead of creating a new tile_data vector and later a new pointer with make_shared, we'd get a pointer from the pool with _pool->Pull(), and write to that pointer's data. The pool class handles the recycling internally (with a custom deleter).

So it would be just this small change, plus the copying in the tile cache case.
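For readers unfamiliar with the pattern, a pool that recycles objects through a custom deleter could look roughly like the sketch below. This is not the existing TilePool class: `SketchTilePool`, `Available` and the constructor parameters are hypothetical, and the sketch assumes the pool itself is always owned by a `std::shared_ptr`, as in the suggestion above.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

using Tile = std::vector<float>;
using TilePtr = std::shared_ptr<Tile>;

// Hypothetical pool; it must be created with std::make_shared so that
// shared_from_this() is valid inside Pull().
class SketchTilePool : public std::enable_shared_from_this<SketchTilePool> {
public:
    SketchTilePool(std::size_t capacity, std::size_t tile_size) : _capacity(capacity), _tile_size(tile_size) {}

    // Hands out a recycled tile if one is available, otherwise allocates one.
    // The contents are stale; the caller is expected to overwrite them.
    TilePtr Pull() {
        // The deleter keeps only a weak reference, so tiles may outlive the pool.
        std::weak_ptr<SketchTilePool> weak_pool = shared_from_this();
        std::unique_ptr<Tile> tile;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            if (!_free.empty()) {
                tile = std::move(_free.back());
                _free.pop_back();
            }
        }
        if (tile) {
            tile->resize(_tile_size);
        } else {
            tile = std::make_unique<Tile>(_tile_size);
        }
        return TilePtr(tile.release(), [weak_pool](Tile* raw) {
            std::unique_ptr<Tile> owned(raw);
            if (auto pool = weak_pool.lock()) {
                pool->Push(std::move(owned));
            }
            // If the pool is gone, `owned` frees the tile on scope exit.
        });
    }

    // Number of idle tiles currently held for reuse.
    std::size_t Available() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _free.size();
    }

private:
    void Push(std::unique_ptr<Tile> tile) {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_free.size() < _capacity) {
            _free.push_back(std::move(tile));
        }
        // Tiles beyond the capacity are freed when `tile` goes out of scope.
    }

    std::mutex _mutex;
    std::vector<std::unique_ptr<Tile>> _free;
    std::size_t _capacity;
    std::size_t _tile_size;
};
```

The caller only ever sees an ordinary `TilePtr`; when the last reference is dropped, the deleter returns the allocation to the pool instead of freeing it, up to the configured capacity.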

pford commented 1 month ago

I added a TilePool to Frame with the capacity set to the number of omp threads, since Session fills the tiles in parallel using omp. The TilePool is created the first time tiles are requested, since tiles are not used when Frame holds the downsampled image for PV preview.
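The lazy creation described here could be sketched as follows. `FrameSketch`, `PoolStub`, `GetPool` and `HasPool` are hypothetical names, and the thread count is passed in as a plain parameter; in the PR it is the number of OpenMP threads (for example the value returned by `omp_get_max_threads()`).

```cpp
#include <cstddef>
#include <memory>
#include <mutex>

// Minimal placeholder for a pool type; only the capacity matters in this sketch.
struct PoolStub {
    explicit PoolStub(std::size_t pool_capacity) : capacity(pool_capacity) {}
    std::size_t capacity;
};

class FrameSketch {
public:
    explicit FrameSketch(std::size_t thread_count) : _thread_count(thread_count) {}

    // Called at the start of every tile request. The pool is created on the
    // first call only, so frames that never serve tiles never allocate one.
    std::shared_ptr<PoolStub> GetPool() {
        std::lock_guard<std::mutex> lock(_pool_mutex);
        if (!_pool) {
            // One pooled tile per thread that may be filling a tile concurrently.
            _pool = std::make_shared<PoolStub>(_thread_count);
        }
        return _pool;
    }

    bool HasPool() const {
        std::lock_guard<std::mutex> lock(_pool_mutex);
        return _pool != nullptr;
    }

private:
    std::size_t _thread_count;
    mutable std::mutex _pool_mutex;
    std::shared_ptr<PoolStub> _pool;
};
```

The mutex matters because the first tile requests may arrive from several threads at once; without it two threads could each create a pool.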

github-actions[bot] commented 1 month ago

Code Coverage

Package Line Rate Health
src.Cache 72%
src.DataStream 44%
src.FileList 67%
src.Frame 36%
src.HttpServer 42%
src.ImageData 28%
src.ImageFitter 83%
src.ImageGenerators 43%
src.ImageStats 75%
src.Logger 37%
src.Main 52%
src.Region 69%
src.Session 4%
src.Table 52%
src.ThreadingManager 67%
src.Timer 85%
src.Util 40%
Summary 46% (8614 / 18797)

confluence commented 1 month ago

@kswang1029 in addition to testing the bug fix, please check the performance impact for animation (of both HDF5 files and other file types). I expect this to slightly decrease performance for HDF5 files but also slightly increase performance across the board, and I'd like to find out what the overall impact is (and to make sure that something unexpected isn't happening).

confluence commented 1 month ago

@kswang1029 @pford in that case I'm happy to merge this.

kswang1029 commented 1 month ago

> @kswang1029 @pford in that case I'm happy to merge this.

The effect (if any) may be more prominent with a "large" cube, but unfortunately I do not have the resources to test that.