ImagingDataCommons / slim

Interoperable web-based slide microscopy viewer and annotation tool
https://imagingdatacommons.github.io/slim/
Apache License 2.0
118 stars 36 forks source link

Out of memory error with Chrome and Edge based on Chromion on Windows 10 #124

Closed hackermd closed 2 years ago

hackermd commented 2 years ago

We have received reports of browsers running out of memory. The problem appears to be limited to the Chrome browser (or Edge based on Chromium) on Windows. I was not able to reproduce any of the errors with Chrome on MacOS.

It appears that the amount of allocated memory is different between operating systems (see Chrome docs).

A reason may be the tile cache. This can be configured via the tilesCacheSize option of the constructor of the VolumeImageViewer.

Potentially related to #107 and #67.

hackermd commented 2 years ago

@Punzo it appears that the issue is related to web workers. The problem appears to be restricted to Chromium-based browsers and has only been reported for Windows 10. Could you help debug the issue?

It appears unlikely that the issue is related to the tilesCacheSize. The app crashes immediately after the page is loaded at a time when only a few tiles have been loaded/cached.

It could have to do with web workers. It could be that too many workers are created, that each worker allocates too much memory, or that workers leak memory.

By default, the number of web workers is determined by navigator.hardwareConcurrency. In Chrome, the value can be changed via the devtools: https://developer.chrome.com/blog/new-in-devtools-104/#simulate. We could try setting that to 1 and see whether it solves the problem or prolongs the time before the app crashes.

It appears there is also a bug in Chromium that can result in memory leaks: https://bugs.chromium.org/p/chromium/issues/detail?id=1356332&q=worker%20memory&can=2

hackermd commented 2 years ago

Potentially relevant: https://stackoverflow.com/questions/35003676/web-worker-consumes-massive-amount-of-memory

Punzo commented 2 years ago

yes, web worker can use a lot of memory. For this reason they are generated dynamically when needed (but once instantiated, they are destroyed only when the page gets closed). The maximum number of webworker is defined here:

https://github.com/herrmannlab/dicom-microscopy-viewer/blob/master/src/webWorker/webWorkerManager.js#L14

i.e. navigator.hardwareConcurrency, which can lead to a very large number on some CPU and potentially overloading the memory.

Checking out OHIF-v3/cornerstone3D, for the same reason, the number of maximum workers has been set around 7, see https://github.com/cornerstonejs/cornerstone3D-beta/blob/a0eefdc34e171ac35375fe58c5da75575ed6d46e/utils/demo/helpers/initCornerstoneWADOImageLoader.js#L18

I advise to change to the same values. Please let me know if you would like me opening a PR or if it is easier for you to just change the values!

hackermd commented 2 years ago

Thanks @Punzo. I can make that change.

hackermd commented 2 years ago

@fedorov I wonder what may be a good way for us to test this in a (semi)automated way. Unfortunately, I don't always have access to a Windows computer and I imagine that hardware specifics may be difficult to test in virtual environments.

fedorov commented 2 years ago

Poojitha has access to a windows machine. If this is reproducible with a dataset in IDC, I can ask her to help - let me know.

hackermd commented 2 years ago

It would be great if Poojitha could help us replicate the issue, track down the underlying cause, and test whether a lower number of workers solves the problem. The number of workers can be set via the browser devtools as described here.

So far, we have encountered the issue only on computers with Windows 10 operating system and with Chrome or Edge browsers.

fedorov commented 2 years ago

@pgundluru can you help with this please? Markus can point you to some datasets in IDC where this issue is more likely to appear. I know you are busy with v11 testing right now, but maybe you can spend some time on this after that?

pgundluru commented 2 years ago

@fedorov Yes, I can help with reproducing this issue on W10 chrome and edge version. @hackermd Do you have access to the datasets that this was initially observed in so I can take a look?

hackermd commented 2 years ago

Thank you @pgundluru. That would be great.

I don't know exactly which images trigger the issue. One person encountered the issue using the following link: https://herrmannlab.github.io/slim/studies/2.25.18199272949575141157802058345697568861/series/1.3.6.1.4.1.5962.99.1.3510881361.982628633.1635598486609.2.0.

Punzo commented 2 years ago

I could reproduce it in windows, when the web worker manager instantiates the 8th/9th worker (index 7,8, the counting starts from index 0):

image P.S.: my CPU is an i9 so it has in default 20 threads.

By setting the hardware concurrency to 7.

image

I had no issue: image

Note that usually the default value for the hardware concurrency is the number of the CPU threads. So for the majority of the laptops and workstation usually they don't have more than 8 threads. Therefore limiting the value to 7 I think is anyway a good compromise.

hackermd commented 2 years ago

Thanks a lot for testing @Punzo. I have meanwhile introduced the change you suggested. Available since v0.42.0.

Punzo commented 2 years ago

perfect!

@fedorov I have updated IDC fork as well https://github.com/ImagingDataCommons/slim/pull/70 and Docs are ready to be merged https://github.com/ImagingDataCommons/IDC-Docs/pull/45