OHIF / Viewers

OHIF zero-footprint DICOM viewer and oncology specific Lesion Tracker, plus shared extension packages
https://docs.ohif.org/
MIT License

v3 stable (extensions/cornerstone): Support for Extremely Large Volumes and 4D #3082

Ouwen opened this issue 1 year ago

Ouwen commented 1 year ago

Currently, the Volume Viewport takes in a single volumeId (as a one-element array), which can contain many imageIds. These imageIds are all loaded into WebGL, which causes OOM errors. On mobile devices there is less memory and the problem is more pronounced.

Extremely large volumes need to be broken into smaller volume chunks. For example, a 1000-slice volume can be broken into 3 overlapping chunks: [0-500], [250-750], and [500-1000]. We can track the user's scroll position along the primary data axis (the 0th axis) and set the active volume as needed. For typical CTs the primary data axis is usually axial.

One limitation of this approach is the chunking of the sagittal and coronal views, which are common, as well as arbitrary volume slicing, which is less common. These views would likely need to be constructed as separate series on the server side, or constructed piecemeal on the client side and stored on disk via IndexedDB or the File System API.

The ViewportService/CornerstoneViewportService would be modified to include options for volume partitioning. These options would include something like the following:

```js
maxNumSlices: 100,       // the max number of slices to load simultaneously
numOverlapSlices: 50,    // overlap desired (i.e. set this to the max MIP desired for continuous volumes and 0 for temporal volumes)
disableOrientation: true // enable or disable changing orientations (axial, coronal, sagittal)
```
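
To make the chunking concrete, here is a rough sketch of how these options could translate into overlapping partitions; computePartitions is a hypothetical helper that does not exist in cornerstone3D and only illustrates the splitting described above:

```ts
// Hypothetical helper (illustration only): split a volume of `numSlices` slices
// into overlapping partitions of at most `maxNumSlices`, stepping by
// (maxNumSlices - numOverlapSlices) along the primary data axis.
function computePartitions(
  numSlices: number,
  maxNumSlices: number,
  numOverlapSlices: number
): Array<[number, number]> {
  const step = maxNumSlices - numOverlapSlices;
  const partitions: Array<[number, number]> = [];
  for (let start = 0; start < numSlices; start += step) {
    const end = Math.min(start + maxNumSlices, numSlices);
    partitions.push([start, end]);
    if (end === numSlices) break;
  }
  return partitions;
}

// For the 1000-slice example above:
// computePartitions(1000, 500, 250) => [[0, 500], [250, 750], [500, 1000]]
```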

A new service would be created to house the logic for these options: ViewportService/CornerstoneVolumePartitionService

A new event would be added to cornerstone3D beta for scroll events:

https://github.com/cornerstonejs/cornerstone3D-beta/blob/d58c9fbe3f8bfc09aa9ece28f8ae1ca1141c2c7e/packages/tools/src/utilities/scroll.ts#L57-L61

The manager would set the active volume based on these events. The SliceRange variable contains min, max, and current. When the current slice reaches min + numOverlapSlices or max - numOverlapSlices, the corresponding partitioned volumeId would be set.
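
As a sketch of that switching rule (onVolumeScroll, the SliceRange shape shown here, and setActivePartition are hypothetical names for illustration, not existing cornerstone3D APIs):

```ts
// Hypothetical types/helpers for illustration only.
interface SliceRange {
  min: number;
  max: number;
  current: number;
}

function onVolumeScroll(
  range: SliceRange,
  numOverlapSlices: number,
  numPartitions: number,
  activePartitionIndex: number,
  setActivePartition: (index: number) => void
): void {
  // Scrolling forward: once the current slice enters the overlap region at the
  // top of the active partition, switch to the next partition (if any).
  if (
    range.current >= range.max - numOverlapSlices &&
    activePartitionIndex < numPartitions - 1
  ) {
    setActivePartition(activePartitionIndex + 1);
  // Scrolling backward: once it enters the overlap region at the bottom,
  // switch to the previous partition (if any).
  } else if (
    range.current <= range.min + numOverlapSlices &&
    activePartitionIndex > 0
  ) {
    setActivePartition(activePartitionIndex - 1);
  }
}
```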

Since caching occurs at an imageId level, the CPU memory cache should allow setVolume to be performant. The only additional operation is copying the CPU memory into GPU memory. In case some imageIds are decached due to CPU memory constraints, they would be progressively loaded back in over the network.

This implementation could be used for 4D imaging with maxNumSlices set to the size of each 3D volume and numOverlapSlices set to 0. In this case the sagittal and coronal views would be accurate and disableOrientation should be set to false. The CornerstoneVolumePartitionService would expose nextVolumePartition and previousVolumePartition functions to move forward or backward in time.

Ouwen commented 1 year ago

@sedghi the above is based on our conversation yesterday, if this seems reasonable to you I can start a POC implementation.

ranasrule commented 1 year ago

would this also resolve this issue??? >>> https://github.com/cornerstonejs/cornerstone3D-beta/issues/320

Ouwen commented 1 year ago

Hmm, for MPR it will not quite resolve your issue, since you would only load partial volumes. You can increase the cache size in the cornerstone extension init; however, your browser will likely limit you to a cache size of around 4 GB. To load extremely large MPR volumes you'd likely need to save series generated on the fly to disk, or load pre-generated series from the network.

ranasrule commented 1 year ago

> Hmm, for MPR it will not quite resolve your issue, since you would only load partial volumes. You can increase the cache size in the cornerstone extension init; however, your browser will likely limit you to a cache size of around 4 GB. To load extremely large MPR volumes you'd likely need to save series generated on the fly to disk, or load pre-generated series from the network.

Thanks for the reply. What is the cache size by default? Also how can I change it to the maximum?

Ouwen commented 1 year ago

https://github.com/cornerstonejs/cornerstone3D-beta/blob/dc2366e564fae121d5bcd4856f2c8afedf71f2a6/packages/core/src/cache/cache.ts#L41-L48

You can call the above to set the cache size.
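
For example, something along these lines should work when initializing Cornerstone; this assumes the cache singleton exported by @cornerstonejs/core and the setMaxCacheSize method linked above, and the right place to call it in an OHIF deployment may vary:

```ts
import { cache } from '@cornerstonejs/core';

// Raise the image cache limit to ~3 GB (the value is in bytes).
// Note that browsers cap usable memory, so very large values may still
// fail to allocate in practice.
cache.setMaxCacheSize(3 * 1024 * 1024 * 1024);
```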

ranasrule commented 1 year ago

> https://github.com/cornerstonejs/cornerstone3D-beta/blob/dc2366e564fae121d5bcd4856f2c8afedf71f2a6/packages/core/src/cache/cache.ts#L41-L48
>
> You can call the above to set the cache size.

I have very little experience with JS... how would I go about changing the cache size in OHIF Viewer v3? I haven't been able to find where this code lives in the OHIF Viewer v3 stable branch. Could you help me out?

fedorov commented 1 year ago

We have some very large reconstructed cryosection images in IDC for the Visible Human dataset, and the current caching approach of both OHIF v2 and v3 appears to struggle with this kind of data.

It seems that the caching strategy should take into account the total available memory and the size of a single frame, and not keep fetching frames of the volume when it is impossible to fit all of the frames in the cache. The cryosection volumes we have can exceed 100 GB in size.

@sedghi have there been any discussions to address this use case?

cc: @igoroctaviano

Ouwen commented 1 year ago

The main decision is which dimensions to subsample given a finite amount of memory. I think some reasonable choices are subsampling [x,y] in the plane of acquisition or subsampling z-depth.

Tiling can also be used if [x,y] is extremely high resolution.

sedghi commented 1 year ago

@fedorov how big is each frame/slice?

fedorov commented 1 year ago

```sql
SELECT
  ROUND(MAX(instance_size)/POW(1024,2),2) AS max_instance_size_MB,
  ROUND(SUM(instance_size)/POW(1024,3),2) AS series_size_GB,
  ARRAY_TO_STRING(ANY_VALUE(PixelSpacing),"/") AS PixelSpacing
FROM
  `bigquery-public-data.idc_current.dicom_all`
WHERE
  Modality = "XC"
GROUP BY
  SeriesInstanceUID
ORDER BY
  max_instance_size_MB desc
```

[screenshot: query results listing per-series max instance size (MB), total series size (GB), and pixel spacing]

To learn how to write and run queries such as above, see this IDC tutorial: https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part2_searching_basics.ipynb.

sedghi commented 1 year ago

I'm really surprised you tried on v3 and it was not smooth. Do you have a deployed link to that study/series?

fedorov commented 1 year ago

I shared the link via slack. Can you briefly describe how caching in v3 is different from v2 to expect smooth scrolling?

sedghi commented 1 year ago

So I tried the link and this is what is happening. It is a very interesting and kind of funny problem.

Re: first render (aka the metadata call)

[screenshot: first render / metadata call]

Re: image loading slowness

The current cache for cornerstone is set at 1 GB, which is around 200 images of 4.5 MB each. What happens is that we send out 25 simultaneous requests (at max), and the cache gets full in ~8 seconds, so from that point on we are overwriting the old cached images with the new ones being returned. Not only that, when you move to a slice, say slice 20, our "smart (meh)" prefetching grabs the slices before and after, and guess what, those images (18, 19, 21, 22, etc.) are not in the cache either, so it becomes a kind of recursive chain of images overriding previously cached images.

On top of that, each image itself takes about 4 seconds for the server to respond:

[screenshot: ~4 second server response time for a single image request]

So sum up our caching and your server issues and you see that bad viewer experience. You can certainly make your server faster, but for caching I'm not sure what else we can do. It seems like the caching needs to know the current state of the viewer (viewing slice 20, so don't let slices 10-30 be overwritten).

fedorov commented 1 year ago

If your cache is capped at 1 GB, the user did not touch the slider, and the total series will never fit into the cache - what is the point of continuing to fetch slices and overwriting the existing cache?

[screenshot: network activity continuing while the slider is untouched]

Am I correct in the above? My expectation would be that if the user does not touch the slider, it does not make sense to cache beyond whatever frames around the current slider position fit within the cache limit.

sedghi commented 1 year ago

It might not make sense in this use case, but for a regular-size series (anything less than 1 GB in total), it is highly reasonable to cache the whole series when the display set mounts, so that when the user moves the slider or scrolls, all images are available and no network request needs to be sent. But I understand that it might not be really meaningful for this study. It might be smarter if we had a cache that takes into account the current state of the app (which imageId is being viewed), but that requires further consideration before implementation, since caching happens at the library level, not the app level.

fedorov commented 1 year ago

> But I understand that it might not be really meaningful for this study. It might be smarter if we had a cache that takes into account the current state of the app (which imageId is being viewed), but that requires further consideration before implementation, since caching happens at the library level, not the app level.

Position-aware caching makes sense in a general use case, beyond this admittedly unusual study. Caching slices starting around the current position of the slider is probably a good idea in general to optimize user experience. I am pretty sure Nucleus viewer (the one that shows the content of cache as highlighted sections of the scroll bar) prioritizes caching around the active slice.

Also, even though the cache is 1 GB, we should not forget about multi-modal studies where one can have multiple series in the layout, longitudinal studies, plus the memory needs of segmentations and overlays. I think it is very likely that you will run into a situation where it is impossible to cache the entirety of each series being visualized much sooner than you might think, and position-aware caching should be able to help with that.
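
As a minimal sketch of what such a position-aware eviction policy could look like (CachedFrame and evictFarthestFromActive are hypothetical names for illustration, not existing Cornerstone APIs):

```ts
// Hypothetical cache entry for illustration only.
interface CachedFrame {
  imageId: string;
  sliceIndex: number;
  sizeInBytes: number;
}

// Keep the frames closest to the currently viewed slice and evict the rest
// once the cache limit is reached, so the active slice's neighborhood survives.
function evictFarthestFromActive(
  entries: CachedFrame[],
  activeSliceIndex: number,
  maxCacheBytes: number
): CachedFrame[] {
  const sorted = [...entries].sort(
    (a, b) =>
      Math.abs(a.sliceIndex - activeSliceIndex) -
      Math.abs(b.sliceIndex - activeSliceIndex)
  );
  const kept: CachedFrame[] = [];
  let used = 0;
  for (const entry of sorted) {
    if (used + entry.sizeInBytes > maxCacheBytes) break;
    kept.push(entry);
    used += entry.sizeInBytes;
  }
  return kept;
}
```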

fedorov commented 1 year ago

Should we create a separate issue for position-aware caching? The proposal in the original report at the top I think suggests a different approach to deal with the same problem of handling large series.

sedghi commented 1 year ago

The proposal at the top is for MPR and is not related to our caching. We should create another issue.

fedorov commented 1 year ago

> We should create another issue.

Done in https://github.com/OHIF/Viewers/issues/3578.

sedghi commented 11 months ago

PR in progress https://github.com/OHIF/Viewers/pull/3235

fedorov commented 8 months ago

The issue we were experiencing with the Visible Human dataset has significantly improved, probably due to the caching improvements, so I am removing the IDC label. The issue is resolved for us.