DigitalSlideArchive / digital_slide_archive

The official deployment of the Digital Slide Archive and HistomicsTK.
https://digitalslidearchive.github.io
Apache License 2.0

Olympus cellSens VSI support? #248

Closed andreped closed 1 year ago

andreped commented 1 year ago

Hello! Just came across this fantastic project, and found it extremely relevant for a variety of my projects.

We want to use DSA to visualize WSIs that are stored on a server, but I was surprised to find that I could not visualize a WSI stored in the cellSens VSI format: the UI hangs and never displays the image. HistomicsUI works fine with other OpenSlide formats such as Aperio SVS, but the Olympus format is the most common one at my department. Hence, it would be great if DSA supported it.

I just tried to add a WSI from the OpenSlide test data. Pick any of these WSIs and try for yourself to reproduce the issue.

From what I can see here, DSA depends on large_image, which supports Bioformats, which can be used to read these images.

andreped commented 1 year ago

I think I see what the problem is. I tried to simply upload a single WSI (.vsi + corresponding folder) through the upload feature, and it would not let me.

So I guess it is likely that DSA does not yet support the cellSens VSI format, which requires you to have a .vsi file and a corresponding folder which contains the image data. Any comments, @manthey?
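For anyone checking their own data before importing, here is a minimal sketch of how the two halves of a VSI slide could be verified on disk. It assumes the companion folder sits next to the `.vsi` file and is named `_<basename>_` (for example `OS-1.vsi` next to `_OS-1_`), with the pixel data in `.ets` files somewhere below it. That naming convention is an assumption, and the helper names are hypothetical, not part of DSA.

```python
from pathlib import Path


def vsi_companion_dir(vsi_path):
    """Return the folder expected to hold the pixel data for a .vsi file.

    Assumption: the folder sits beside the .vsi file and is named
    "_<basename>_" (e.g. OS-1.vsi -> _OS-1_).
    """
    vsi_path = Path(vsi_path)
    return vsi_path.with_name(f"_{vsi_path.stem}_")


def check_vsi_pair(vsi_path):
    """Report whether both halves of a cellSens VSI slide are present."""
    vsi_path = Path(vsi_path)
    companion = vsi_companion_dir(vsi_path)
    # Assumption: the bulk image data lives in .ets files below the folder.
    ets_files = sorted(companion.rglob("*.ets")) if companion.is_dir() else []
    return {
        "vsi_exists": vsi_path.is_file(),
        "companion_dir": companion,
        "ets_count": len(ets_files),
    }
```

If `ets_count` comes back as 0, the `.vsi` file was probably copied or uploaded without its folder, which would match the upload behaviour described above.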

dgutman commented 1 year ago

It does, I believe, but the auto-import process isn't tuned for that: we assume one image per folder. Is there a sample data set posted anywhere that is similar enough to what you're using?

andreped commented 1 year ago

Download any of these zips from the OpenSlide test data (see https://openslide.cs.cmu.edu/download/openslide-testdata/Olympus/), uncompress, and try to import and visualize with DSA.

Note that initially I was not uploading but importing, that is, linking the data from a local drive. In that case I was able to load both the ".vsi" file and the corresponding folder into the collection.

dgutman commented 1 year ago

Challenge accepted... I'm out of town but I'll give it a shot; if not, I'll try again next week... Vacation wifi ain't great.

andreped commented 1 year ago

No rush. Enjoy your vacation!

It might be that I'm just missing some dependency, such as Java, since I believe javabridge is used for bioformats. But as all of this runs within docker, I would expect it to just work, or at least to raise an error if not. But idk.

andreped commented 1 year ago

I believe this is the reason I'm unable to read the .vsi format:

dsa-girder-1     | Traceback (most recent call last):
dsa-girder-1     |   File "/opt/venv/lib/python3.9/site-packages/javabridge/jutil.py", line 290, in start_thread
dsa-girder-1     |     env = vm.create(args)
dsa-girder-1     |   File "_javabridge.pyx", line 654, in _javabridge.JB_VM.create
dsa-girder-1     | RuntimeError: Failed to create Java VM. Return code = -5
dsa-girder-1     | Failed to create Java VM
dsa-girder-1     | SLF4J: A number (4) of logging calls during the initialization phase have been intercepted and are
dsa-girder-1     | SLF4J: now being replayed. These are subject to the filtering rules of the underlying logging system.
dsa-girder-1     | SLF4J: See also http://www.slf4j.org/codes.html#replay

I would think all dependencies required were inside the docker image. Do I need anything locally? Right now I'm testing it on a macOS 12 Monterey laptop.

manthey commented 1 year ago

VSI files are read by the bioformats plugin, which runs in Java. The default docker images include all dependencies, including an appropriate version of Java. I don't know why it is failing to start the JVM on your machine: -5 is usually the EIO error code, which is a cryptic IO error.
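An alternative reading of that number, offered as an assumption rather than a diagnosis: if the value javabridge prints is the raw status returned by `JNI_CreateJavaVM` (not verified here against the javabridge source), it would map onto the status constants in `jni.h` rather than onto errno values. Under that reading, -5 is `JNI_EEXIST`. The lookup below is a small sketch; the function name is hypothetical.

```python
# Status codes defined in jni.h for JNI_CreateJavaVM and related calls.
JNI_STATUS = {
    0: ("JNI_OK", "success"),
    -1: ("JNI_ERR", "unknown error"),
    -2: ("JNI_EDETACHED", "thread detached from the VM"),
    -3: ("JNI_EVERSION", "JNI version error"),
    -4: ("JNI_ENOMEM", "not enough memory"),
    -5: ("JNI_EEXIST", "VM already created"),
    -6: ("JNI_EINVAL", "invalid arguments"),
}


def describe_jni_status(code):
    """Translate a JNI status code into its jni.h name and meaning."""
    name, meaning = JNI_STATUS.get(code, ("UNKNOWN", "not a jni.h status code"))
    return f"{code}: {name} ({meaning})"


print(describe_jni_status(-5))
```

If that interpretation holds, the traceback could simply mean a second attempt was made to start a JVM in a process that already had one, which would be worth ruling out before suspecting disk or IO problems.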

When bioformats reads the sample VSI files, it doesn't expose any information about lower resolution levels, so it ends up being very slow. They do eventually load and display.
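To give a rough sense of why missing lower-resolution levels hurt so much, here is a back-of-the-envelope sketch. The slide size (200,000 x 160,000 pixels) and the 256-pixel tile size are illustrative assumptions, not values taken from large_image. It counts how many full-resolution tiles would have to be read to build an overview, compared with the single tile that the top of a power-of-two pyramid would provide.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def tiles_in_level(width, height, tile_size, downsample):
    """Number of tiles covering the image at a given downsample factor."""
    level_w = ceil_div(width, downsample)
    level_h = ceil_div(height, downsample)
    return ceil_div(level_w, tile_size) * ceil_div(level_h, tile_size)


def pyramid_levels(width, height, tile_size):
    """Number of power-of-two levels until the image fits in one tile."""
    levels = 1
    downsample = 1
    while tiles_in_level(width, height, tile_size, downsample) > 1:
        downsample *= 2
        levels += 1
    return levels


# Illustrative values (assumptions): a large slide and a 256 px tile.
width, height, tile = 200_000, 160_000, 256
full_res_tiles = tiles_in_level(width, height, tile, 1)
levels = pyramid_levels(width, height, tile)
print(f"full-resolution tiles: {full_res_tiles}")  # 488750
print(f"pyramid levels needed: {levels}")          # 11
```

With a pyramid, the first view of the slide costs one small tile; without one, a reader that has to synthesize that view from full resolution may touch hundreds of thousands of tiles.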

andreped commented 1 year ago

> When bioformats reads the sample VSI files, it doesn't expose any information about lower resolution levels, so it ends up being very slow. They do eventually load and display.

I'm testing this on a laptop with 16 GB RAM running macOS 12. It might just take an extreme amount of time to load/render, as I have a limited number of cores and limited memory. I can test this on an Ubuntu server next week. Have you checked whether the WSI I mentioned works with HistomicsUI on your local setup?

andreped commented 1 year ago

After a while, it seems like something is rendered from the OS-1.vsi (from the openslide test data here). However, it only seems like a single tile is rendered.

Are you able to reproduce the same on your end? As I currently have trouble getting DSA to work on my Ubuntu 18.04 server (I made a separate issue about that here), I'm still stuck testing on my MacBook with 16 GB RAM.

[image: Skjermbilde 2023-02-19 172650] [image: Skjermbilde 2023-02-19 172714]

andreped commented 1 year ago

After about 1h, I was surprised to see that it has gotten further. Nowhere near rendering the full WSI. In my case I'm often working with images of the size 200k x 160k, and I assume the clinicians that will be using this solution will not be that pleased...

So the VSI reader does indeed work; it is just extremely slow for such large images. Perhaps this can be resolved with more memory, or with something I can set up differently in docker? If it is just the reader that is slow, is it possible to make it faster? FAST (https://github.com/smistad/FAST) has great support for reading cellSens VSI as well as the other OpenSlide formats. It is also accessible from Python (see pyFAST, https://pypi.org/project/pyFAST/).

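[image: Skjermbilde 2023-02-19 175537] https://user-images.githubusercontent.com/29090665/219962580-3c3ce4af-2ff2-442b-b16f-7b00ff206437.png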

dgutman commented 1 year ago

So I have never had good luck with bioformats; it is the last reader we try when opening files. Perhaps playing around with the Java heap memory may help. If you open the same image in Fiji using the bioformats plugin, is it much faster?

andreped commented 1 year ago

I normally use QuPath (https://github.com/qupath/qupath) if I just want to visualize these WSIs; it uses the same Bioformats plugin as Fiji does. It reads and renders the WSI in real time, and you can zoom and pan the view without any real issue, just as with other pyramidal TIFF formats, such as Generic TIFF or Aperio SVS, which OpenSlide supports (see https://openslide.org/formats/ for formats).

I have never had any success reading these gigantic WSIs in Fiji. I don't think Fiji is suitable for it. I just tried now and it did not produce any image, although it could be that I just need to play around with some import settings.

I've also co-developed FastPathology, which uses FAST to read the image (removing the need for Bioformats and the javabridge annoyances). You can also use pyFAST to read from the VSI format into numpy arrays, while the reading and rendering happen in C++. Hence, it should be really fast. There is an ongoing open-source project called LearnPathology (https://github.com/AICAN-Research/learn-pathology) where they were able to stream tiles using FAST and render them with OpenSeadragon (which I see DSA also supports), so that should be possible.

QuPath, FastPathology, and the LearnPathology web solution all render this image a lot faster than DSA, so there is definitely room for improvement. Since QuPath reads extremely fast with Bioformats, Bioformats itself should be fast enough. However, QuPath is a Java application, whereas DSA runs Bioformats through javabridge, which could explain the poor read speeds.

dgutman commented 1 year ago

If there are python bindings we can probably add that library as a tile source/ reader.. I'll look into it.

I don't use VSI files very often, so we have never had a driving use case to try to speed up its performance, but if that library is sufficiently easy we may be able to wrap it and avoid bioformats.

dgutman commented 1 year ago

Also you said the magic word. Java.

andreped commented 1 year ago

> If there are python bindings we can probably add that library as a tile source/ reader.. I'll look into it.

Yes, there are. See here: https://pypi.org/project/pyFAST/

Here is an example on how to read patches as numpy arrays with pyFAST: https://fast.eriksmistad.no/generate_tissue_patches_from_wsi_8py-example.html

Using FAST to build up the tiles worked wonders with OpenSeadragon. See here for how you could build up the tile server; at least that's how they did it in the LearnPathology project.
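To make the tile-server idea a bit more concrete, here is a minimal sketch of the address arithmetic such a server needs. It assumes the viewer requests tiles in the Deep Zoom (DZI) layout with no tile overlap, where the highest level is full resolution and each lower level halves the size. The function names are hypothetical, and the actual pixel read (with FAST, large_image, or any other reader) is deliberately left out.

```python
def dzi_max_level(width, height):
    """Highest Deep Zoom level, i.e. ceil(log2(max(width, height)))."""
    return (max(width, height) - 1).bit_length()


def dzi_tile_region(width, height, tile_size, level, col, row):
    """Map a DZI tile address to the full-resolution region it covers.

    Assumption: zero tile overlap. Returns the region to read at full
    resolution plus the size the tile should have after downscaling.
    """
    max_level = dzi_max_level(width, height)
    if not 0 <= level <= max_level:
        raise ValueError("level out of range")
    scale = 2 ** (max_level - level)
    # Image size at this level, rounded up as Deep Zoom does.
    level_w = -(-width // scale)
    level_h = -(-height // scale)
    x0, y0 = col * tile_size, row * tile_size
    if col < 0 or row < 0 or x0 >= level_w or y0 >= level_h:
        raise ValueError("tile outside image")
    x1 = min(x0 + tile_size, level_w)
    y1 = min(y0 + tile_size, level_h)
    left, top = x0 * scale, y0 * scale
    right, bottom = min(x1 * scale, width), min(y1 * scale, height)
    return {
        "x": left,
        "y": top,
        "width": right - left,
        "height": bottom - top,
        "output_size": (x1 - x0, y1 - y0),
        "scale": scale,
    }
```

A request handler would call `dzi_tile_region` for each incoming `level/col_row` address, ask the reader for that region (ideally from a stored pyramid level close to `scale`), resize it to `output_size`, and return it as JPEG or PNG.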

manthey commented 1 year ago

There was a straightforward way to speed up opening some files via bioformats (see https://github.com/girder/large_image/pull/1063). With this, the VSI files from https://openslide.cs.cmu.edu/download/openslide-testdata/Olympus/ open pretty quickly (without that change, they open tediously slowly).

Before this, one of those files took ~20 minutes to serve tiles on my Ubuntu machine, now it is a few seconds.

andreped commented 1 year ago

> Before this, one of those files took ~20 minutes to serve tiles on my Ubuntu machine, now it is a few seconds.

Uuuuh! Sounds great! Will it work to clone the main branch of DSA now, or do I need to wait a little to get the latest release of large_image? I can run an experiment later today if it works :]

dgutman commented 1 year ago

Depends where you're pulling large_image from; I'm not sure how long it takes before PyPI updates.

-- David A Gutman, M.D. Ph.D. Associate Professor of Neurology Emory University School of Medicine

manthey commented 1 year ago

pypi updated a bit ago.
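For anyone unsure which build they are actually running, a quick check from inside the Python environment (for example inside the girder container) could look like the sketch below. It only uses the standard library; the package names `large-image` and `large-image-source-bioformats` are the PyPI distribution names as far as I know, and which release contains the speed-up is not stated here, so compare the output against the PR linked above.

```python
from importlib import metadata


def installed_versions(packages=("large-image", "large-image-source-bioformats")):
    """Return {package: version or None} for the given distributions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```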

andreped commented 1 year ago

> pypi updated a bit ago.

I assume the docker image would need to be rebuilt and updated to get the latest changes, as per the installation instructions, no? Alternatively, I could try building the image myself and testing it (for debugging and testing purposes).

manthey commented 1 year ago

The various docker images are currently being built and tested by CI and will show up within an hour.

andreped commented 1 year ago

Just tried the fix with the latest docker image. Works wonders and extremely rapid! Great work, @manthey!

Will do some more testing on much larger WSIs tomorrow on the production server, but I believe this should work fine for my application.

As this issue has now been resolved, you can close the issue.

[image: Screenshot 2023-02-20 at 21 32 48]