ImagingDataCommons / IDC-WebApp

Web Application front end for IDC (CORE REPO)
Apache License 2.0

Add button at the series level to open in VolView #1221

Closed fedorov closed 9 months ago

fedorov commented 1 year ago

We discussed this at the F2F meeting. Clicking this button would build a URL that points VolView to the series folder and open it in the app deployed at https://volview.kitware.app/. Overall, this will help us move towards an interface where we would be able to relatively easily add integration with externally hosted viewers. Open questions: whether we should disable any modalities (probably, at least SM), what to do with opening at the study level.

fedorov commented 1 year ago

Per discussion with @aylward via email, the preference seems to be to allow opening at the study level.

Since bucket egress is free, we probably prefer to use buckets and not DICOMweb for VolView, right @wlongabaugh?

If we use buckets, and need to pass a list of series-level folders comprising the entire study, do you foresee any difficulties regarding the URL length @s-paquette?

s-paquette commented 1 year ago

@fedorov There IS a limit on URL length for all browsers; Chrome and Safari are very lenient at 65k and 80k, but Chromium browsers only allow 2.1k. So we won't be able to pass more than that.

aylward commented 1 year ago

Is there a way to access a database that maps patients/studies/series to buckets? In particular, perhaps that database has metadata that is useful for providing a user with an overview of available data in each bucket without us having to parse the buckets? Or perhaps we should consider a hybrid approach - mixing dicomweb queries for metadata (e.g., thumbnails) with payload/pixel downloads coming from buckets?

pieper commented 1 year ago

Hi @aylward, yes, there's the BigQuery database where you can run those queries. This link should get you started:

https://learn.canceridc.dev/cookbook/bigquery

Let us know if you run into issues - maybe on the IDC discourse?

aylward commented 1 year ago

@floryst

fedorov commented 11 months ago

@s-paquette do you see any problem if we pass the list of all series-level bucket folders for a given study, to support VolView at the study level as well as the series level?

We should discuss whether we want to implement this feature along with introducing OHIF v3 as yet another alternative for visualization.

aylward commented 11 months ago

Related: spoke with AWS folks - perhaps the AWS bucket/folder corresponding to each series could also be listed in the bigquery database?

fedorov commented 11 months ago

They are! See the "Downloading the cohort" section in https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part3_exploring_cohorts.ipynb. Both GCS and AWS as sources are discussed.

PaulHax commented 11 months ago

Exciting! VolView will try to load all files under a bucket key when passed as an s3 or gs protocol URL.

Example url parameter: https://volview.kitware.app/?urls=[s3://idc-open-data/262f1166-22e1-4eed-a2fe-c899e995640c/]

VolView URL parameter docs

These S3 bucket settings worked for me.

Bucket Policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AddPerm",
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::mydicoms/*",
                "arn:aws:s3:::mydicoms"
            ]
        }
    ]
}

CORS:

[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "GET"
        ],
        "AllowedOrigins": [
            "*"
        ],
        "ExposeHeaders": []
    }
]

VolView code dealing with s3 URLs: https://github.com/Kitware/VolView/blob/main/src/io/amazonS3.ts

fedorov commented 11 months ago

@PaulHax We already went through the process of configuring IDC buckets to work with VolView, and it already works. Have you talked with @floryst?

You can see the details in this release announcement: https://discourse.canceridc.dev/t/idc-may-2023-release/428. This also reminded me that I have not updated the documentation and our tutorial notebook to include the instructions I have in the announcement. I will do this today.

fedorov commented 11 months ago

I updated the relevant part of the getting started tutorial to include VolView instructions: https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part3_exploring_cohorts.ipynb.

I also updated the visualization doc page to highlight the various visualization tools we recommend to the users at the very top of this page: https://learn.canceridc.dev/portal/visualization.

@PaulHax if you see something is not right, please let me know!

PaulHax commented 11 months ago

@fedorov Very complete docs in the cohort notebook! Love the itkwidgets stuff.

VolView fails to load the (histopathology?) series in notebook output at the end of the VolView section. https://volview.kitware.app/?urls=s3://idc-open-data/da64c73a-4634-4e75-a839-813c9f4f629a

Some folks may be interested to know ?urls can be an array to download multiple series. https://volview.kitware.app/?urls=[s3://idc-open-data/262f1166-22e1-4eed-a2fe-c899e995640c,s3://idc-open-data/48a29c97-bb71-4f8f-b70b-688f36bca0c1]
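As an illustration of the `?urls=[...]` form described above, here is a minimal sketch of how such a link could be assembled from a list of series folders. The function name `build_volview_url` is hypothetical and is not part of VolView or the IDC portal; the only things taken from this thread are the base URL and the bracketed, comma-separated layout.

```python
# Deployment mentioned in this thread.
VOLVIEW_BASE = "https://volview.kitware.app/"


def build_volview_url(series_folders):
    """Return a VolView link that asks it to load every listed series folder.

    `series_folders` is a list of bucket folder URLs such as
    "s3://idc-open-data/<series-uuid>". (Hypothetical helper for illustration.)
    """
    if not series_folders:
        raise ValueError("at least one series folder is required")
    # Drop trailing slashes so every entry has the same shape.
    entries = [folder.rstrip("/") for folder in series_folders]
    return f"{VOLVIEW_BASE}?urls=[{','.join(entries)}]"


if __name__ == "__main__":
    print(build_volview_url([
        "s3://idc-open-data/262f1166-22e1-4eed-a2fe-c899e995640c",
        "s3://idc-open-data/48a29c97-bb71-4f8f-b70b-688f36bca0c1",
    ]))
```

Passing a single folder produces a one-element array, so the same helper would cover both the series-level and the study-level case.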

fedorov commented 11 months ago

Very complete docs in the cohort notebook! Love the itkwidgets stuff.

That one is the contribution from @bnmajor!

VolView fails to load the (histopathology?) series in notebook output at the end of the VolView section.

Good catch! I was curious what would happen if I try to open path image in VolView, and did not re-run the cell after that experiment. I fixed it to select CT series. For the IDC portal integration we can definitely exclude slide microscopy.

Some folks may be interested to know ?urls can be an array to download multiple series.

Agreed - I now added that information to the notebook. Thank you!

aylward commented 11 months ago

Hi Andrey,

Thanks for the updates to the notebooks and IDC!

With multiple URLs being passed on a volview URL and with the lowest URL length limit seeming to be 2500, do you think a button to open data at the study-level would work?

Stephen

-- Stephen R. Aylward, Ph.D. Chair, MONAI Advisory Board Senior Director, Strategic Initiatives, Kitware

fedorov commented 10 months ago

@aylward I will run checks to see how many studies (if any!) will result in a URL that exceeds that threshold. Each series-level bucket URL is on the order of 50 characters. As discussed with @s-paquette, we can also disable study-level buttons whenever that threshold is exceeded.
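A rough sketch of the threshold check discussed above, assuming the 2500-character limit quoted in this thread (it is a working number from the discussion, not a documented browser guarantee). The names `study_url_length` and `study_button_enabled` are hypothetical and do not come from the IDC-WebApp code.

```python
# Working limit quoted in the thread; treat it as an assumption.
MAX_URL_LENGTH = 2500
URL_PREFIX = "https://volview.kitware.app/?urls=["


def study_url_length(series_folders):
    """Length of the study-level link without actually building the string."""
    separators = max(len(series_folders) - 1, 0)  # commas between entries
    closing_bracket = 1
    return (len(URL_PREFIX)
            + sum(len(folder) for folder in series_folders)
            + separators
            + closing_bracket)


def study_button_enabled(series_folders, limit=MAX_URL_LENGTH):
    """Enable the study-level button only when the link fits under the limit."""
    return bool(series_folders) and study_url_length(series_folders) <= limit


if __name__ == "__main__":
    folder = "s3://idc-open-data/" + "0" * 36  # same shape as a real series folder
    for count in (31, 44, 45):
        folders = [folder] * count
        print(count, study_url_length(folders), study_button_enabled(folders))
```

With 55-character series folders, the fixed overhead is 36 characters and each series adds 56 (the folder plus one comma), so a study with up to 44 series stays under 2500 characters.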

Another question to confirm: I think we should disable visualization of slide microscopy images, and also at the series level we would disable buttons corresponding to the series that are not meaningful without images (i.e., SR, SEG, RTSTRUCT). Sounds good?

fedorov commented 10 months ago

Mockup of the UI for selecting alternative viewer (triangle will be inside the selection button on the top right row)

image

fedorov commented 10 months ago

@s-paquette here is the query that generates study-level URLs for VolView:

SELECT
  StudyInstanceUID,
  CONCAT("https://volview.kitware.app/?urls=[",STRING_AGG(DISTINCT(REGEXP_SUBSTR(aws_url, "(s3://.*)/")),','),"]") AS volview_url,
  LENGTH(CONCAT("https://volview.kitware.app/?urls=[",STRING_AGG(DISTINCT(REGEXP_SUBSTR(aws_url, "(s3://.*)/")),','),"]")) AS volview_url_len,
  STRING_AGG(DISTINCT(Modality),',') AS modalities,
  STRING_AGG(DISTINCT(collection_id),',') AS collections
FROM
  `bigquery-public-data.idc_current.dicom_all`
GROUP BY
  StudyInstanceUID
ORDER BY
  volview_url_len DESC
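For readers who want to reproduce the aggregation outside BigQuery, the following is a minimal Python sketch of the same idea: strip the file name from each `aws_url`, keep the distinct series folders per study, and join them into one link. It assumes each `aws_url` looks like `s3://<bucket>/<series-uuid>/<file>`, and the function name `study_urls` is hypothetical.

```python
import re
from collections import defaultdict

# Same pattern as in the query: greedy match up to the last "/".
SERIES_FOLDER = re.compile(r"(s3://.*)/")
URL_PREFIX = "https://volview.kitware.app/?urls=["


def study_urls(rows):
    """Map StudyInstanceUID -> VolView link.

    `rows` is an iterable of (StudyInstanceUID, aws_url) pairs, e.g. the
    result of selecting those two columns from dicom_all.
    """
    folders = defaultdict(list)
    for study_uid, aws_url in rows:
        match = SERIES_FOLDER.search(aws_url)
        if match is None:
            continue  # not an s3:// URL with a folder component
        folder = match.group(1)
        if folder not in folders[study_uid]:  # emulate DISTINCT, keep order
            folders[study_uid].append(folder)
    return {uid: URL_PREFIX + ",".join(items) + "]"
            for uid, items in folders.items()}


if __name__ == "__main__":
    demo = [
        ("1.2.3", "s3://idc-open-data/aaa/one.dcm"),
        ("1.2.3", "s3://idc-open-data/aaa/two.dcm"),
        ("1.2.3", "s3://idc-open-data/bbb/one.dcm"),
    ]
    for uid, url in study_urls(demo).items():
        print(uid, len(url), url)
```

The UIDs and paths in the demo are made up; only the URL layout follows the query.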

@aylward only 448 studies will have URL exceeding the 2500 limit, so it will work for most cases. Result of running the query below is here.

WITH
  temp_table AS (
  SELECT
    StudyInstanceUID,
    CONCAT("https://volview.kitware.app/?urls=[",STRING_AGG(DISTINCT(REGEXP_SUBSTR(aws_url, "(s3://.*)/")),','),"]") AS volview_url,
    LENGTH(CONCAT("https://volview.kitware.app/?urls=[",STRING_AGG(DISTINCT(REGEXP_SUBSTR(aws_url, "(s3://.*)/")),','),"]")) AS volview_url_len,
    STRING_AGG(DISTINCT(Modality),',') AS modalities,
    STRING_AGG(DISTINCT(collection_id),',') AS collections
  FROM
    `bigquery-public-data.idc_current.dicom_all`
  GROUP BY
    StudyInstanceUID
  ORDER BY
    volview_url_len DESC)
SELECT
  *
FROM
  temp_table
WHERE
  volview_url_len > 2500
  AND modalities NOT LIKE "%SM%"

fedorov commented 10 months ago

Here's one of the longer URLs, just to give the idea (1771 characters): https://volview.kitware.app/?urls=[s3://idc-open-data/7d69c507-45cc-4e65-a187-a3a13b6c26de,s3://idc-open-data/f23e565c-d4f1-4bcf-9bb7-bf8b011c305e,s3://idc-open-data/7846d2b4-ab96-4e0e-a83e-8824695ccae1,s3://idc-open-data/771f637b-c9e9-4137-93a9-db73b32cb2ca,s3://idc-open-data/d319dd94-9026-44ed-a079-1420006ab8ac,s3://idc-open-data/4d1c65d8-e350-4f85-8073-10b07a896488,s3://idc-open-data/c185caa5-92bf-4a27-9f4c-4d322f6ea3ce,s3://idc-open-data/8f0e47d9-2df2-4cf5-882b-2e815d751909,s3://idc-open-data/7bd4afc5-d61b-45a9-93bc-61bcd6451510,s3://idc-open-data/9ade2325-ebf4-45dc-99d3-1acd47e5888b,s3://idc-open-data/eb62a53d-6868-4e3a-8c21-f984c1667273,s3://idc-open-data/ecca34ee-2cbf-49a5-80f4-d6d7f5406e5a,s3://idc-open-data/1728adea-8410-4f97-81f2-94a575e152c9,s3://idc-open-data/d58e4362-46b2-4de9-aa18-bb968dc4a0cb,s3://idc-open-data/94ec0707-a102-4ace-bd56-c3c4e8e6341a,s3://idc-open-data/4604ca6b-26ee-45bd-8342-19653e3a6b17,s3://idc-open-data/0ff91948-7a4c-48fe-a5db-121daa111be7,s3://idc-open-data/3f64ada9-4bb7-4a5d-a95a-45cfaa62322a,s3://idc-open-data/20344dfd-8009-45d4-adab-eac2677b8db5,s3://idc-open-data/ba404853-9b5e-4eb0-95d5-653c52c1c48c,s3://idc-open-data/86dbf5c1-3300-4fd6-826d-f82365cfa089,s3://idc-open-data/df44bcfa-32e1-4ae3-8775-b8e46bc3a218,s3://idc-open-data/7cfbe8cf-d5c5-456e-a960-34a8f3c5dcad,s3://idc-open-data/ca15769c-5a98-4737-a229-25be55f6ae8f,s3://idc-open-data/09bd1507-e296-402f-88d3-a7e4faa2b1ef,s3://idc-open-data/4a4547f6-b2d5-4429-8419-479997954c31,s3://idc-open-data/1942bc22-96ed-4633-8a99-7c17ad9d1111,s3://idc-open-data/dec57b3e-15ce-4845-a46b-d6276d96e8ac,s3://idc-open-data/618ad858-53dc-489c-ad6a-071f52833825,s3://idc-open-data/45280d9c-b77b-4846-97cf-9ff2f00da017,s3://idc-open-data/f5ca5865-63c3-4bb0-b402-c662061bc2af]

And OH WOW @aylward I had no idea - DICOM SEG just works in VolView! This is incredible - wonderful surprise @floryst - you did deliver on the promise! 👍

2023-10-19_18-12-57

FYI @dclunie @pieper

aylward commented 10 months ago

Hi @fedorov,

Thank you for all of this work!

Another question to confirm: I think we should disable visualization of slide microscopy images, and also at the series-level we would disable buttons corresponding to the series that are not meaningful without images (ie, SR, SEG, RTSTRUCT). Sounds good?

Sounds good. We can include SEG objects, but RTSTRUCT, SR, etc. support won't be ready until the next release.
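The modality rules agreed in this exchange could be expressed roughly as follows. This is only a sketch of the decision as stated here (SM excluded, SR and RTSTRUCT disabled until VolView supports them, SEG allowed); the names `DISABLED_MODALITIES`, `series_button_enabled` and `viewable_series` are hypothetical.

```python
# Modalities excluded for now, per this thread. SEG is deliberately absent
# because VolView can already load it.
DISABLED_MODALITIES = frozenset({"SM", "SR", "RTSTRUCT"})


def series_button_enabled(modality):
    """True when the series-level VolView button should be active."""
    return modality.strip().upper() not in DISABLED_MODALITIES


def viewable_series(series):
    """Keep only the folders of series that VolView is expected to open.

    `series` is an iterable of (folder, modality) pairs.
    """
    return [folder for folder, modality in series
            if series_button_enabled(modality)]


if __name__ == "__main__":
    for modality in ("CT", "SEG", "SR", "RTSTRUCT", "SM"):
        print(modality, series_button_enabled(modality))
```

Keeping the excluded modalities in one set means that enabling RTSTRUCT or SR later would be a one-line change.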

Mockup of the UI for selecting alternative viewer (triangle will be inside the selection button on the top right row)

Sounds good too. Thanks!

only 448 studies will have URL exceeding the 2500 limit

I'm amazed that there are even 448 of these. We have loading and parsing the s5cmd manifest directly on our to-do list. That should address all cases, but in the meantime, this seems like an "ok" solution. Thanks!
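To make the manifest idea concrete, here is a small sketch of how series folders might be read from an s5cmd manifest. It assumes each manifest line has the form `cp s3://<bucket>/<series-uuid>/* <destination>`; that layout is an assumption for illustration, and the function `series_folders_from_manifest` is hypothetical rather than anything VolView implements.

```python
def series_folders_from_manifest(manifest_text):
    """Extract series folder URLs from s5cmd-style `cp` lines.

    Assumed line format: "cp s3://<bucket>/<series-uuid>/* <destination>".
    Lines that do not match this shape are skipped.
    """
    folders = []
    for line in manifest_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0] != "cp":
            continue
        source = parts[1]
        if not source.startswith("s3://"):
            continue
        if source.endswith("/*"):
            source = source[:-2]  # drop the wildcard, keep the folder
        folders.append(source.rstrip("/"))
    return folders


if __name__ == "__main__":
    demo = (
        "cp s3://idc-open-data/262f1166-22e1-4eed-a2fe-c899e995640c/* .\n"
        "cp s3://idc-open-data/48a29c97-bb71-4f8f-b70b-688f36bca0c1/* .\n"
    )
    print(series_folders_from_manifest(demo))
```

Because a manifest is a file rather than a query string, this route would not be subject to the URL length limit.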

DICOM SEG just works in VolView! This is incredible

Thanks go to a wonderful contributor from pharma, @PaulHax, and @floryst. You can also use layers to visualize PET overlaid on CT and such. We need to work on the GUI and 3D viz, but the basics are there. All comments and suggestions are welcomed!

fedorov commented 10 months ago

@aylward thanks for the clarification - very exciting to see VolView developed so actively!

I think I found a small bug during this testing - report submitted in https://github.com/Kitware/VolView/issues/463.

fedorov commented 10 months ago

@s-paquette can you please add OHIF v2 to the drop-down list, and link it to the same URL that corresponds to the eye icon? @dclunie suggested this while reviewing this today, and I agree it makes sense, and does not hurt.

fedorov commented 10 months ago

Discussed this at the meeting today, and agreed that it does not hurt to add v2 link to the drop down.

Need to add warning about leaving .gov when clicking volview link.

aylward commented 10 months ago

I wonder how to word that warning. Don't want people to think that the data is going to another site. Perhaps shouldn't say "leaving .gov" but should say something like "Warning: Running an application in your browser on your machine. That application is provided by Kitware. All data loaded into VolView remains local to your machine and is never transferred to Kitware or other third party".

fedorov commented 10 months ago

@aylward we have a standard wording we use in other places that (I think) is prescribed by NCI. That was just a reminder for the dev team to add it.

On a completely different note: @s-paquette the URL in the test tier erroneously includes the entire bucket as the first item (s3://idc-open-data) - it is really bad, since I think this makes VolView try to load the entire bucket. Example: https://volview.kitware.app/?urls=[s3://idc-open-data,idc-open-data-cr/6e0f8b4e-a116-477d-822b-5adc13b764ae,s3://idc-open-data,idc-open-data-cr/2a137aef-46ae-4e8a-8c63-9a5be4569f9a,s3://idc-open-data,idc-open-data-cr/092e5a1f-dd71-44ca-a228-d2d2733ea257,s3://idc-open-data,idc-open-data-cr/b2fecd48-16a9-4c63-96aa-98425a1e3ebd,s3://idc-open-data,idc-open-data-cr/7357e3c2-5089-4bee-a924-f838b9a4ef64,s3://idc-open-data,idc-open-data-cr/0af46432-e0cc-4b6e-bcb0-1865a139d07f,s3://idc-open-data,idc-open-data-cr/fc4c8324-e412-4d35-84cb-9cb5322354b9,s3://idc-open-data,idc-open-data-cr/172e5102-8fe4-4f9f-83ce-be2c190d4d4e,s3://idc-open-data,idc-open-data-cr/f26b9acd-0a4e-4e4a-94cb-80ff3c08585f,s3://idc-open-data,idc-open-data-cr/5e44aaff-a9c0-432f-a828-ca7e9d0b5be0,s3://idc-open-data,idc-open-data-cr/d625f05b-e34c-4fc0-83cd-41dfd1427528,s3://idc-open-data,idc-open-data-cr/242e38aa-3de0-4a19-8aa3-1fb80f811b0f]

Looking at the content in the brackets, there are other errors as well.
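The kind of malformed entry shown in the example above (a bare bucket, or a folder with no `s3://` scheme) can be caught with a simple check before the link is rendered. This is a sketch only: it assumes every valid entry has the shape `s3://<bucket>/<series-folder>`, and the names `url_entries` and `invalid_entries` are hypothetical.

```python
import re

# Assumed shape of a valid entry: scheme, bucket, exactly one folder level.
VALID_ENTRY = re.compile(r"^s3://[^/,\s]+/[^/,\s]+/?$")


def url_entries(volview_url):
    """Split the value of the `urls` parameter into its entries."""
    _, _, value = volview_url.partition("?urls=")
    return [entry for entry in value.strip("[]").split(",") if entry]


def invalid_entries(volview_url):
    """Return the entries that are not a single series folder in a bucket."""
    return [entry for entry in url_entries(volview_url)
            if not VALID_ENTRY.match(entry)]


if __name__ == "__main__":
    bad = ("https://volview.kitware.app/?urls=[s3://idc-open-data,"
           "idc-open-data-cr/6e0f8b4e-a116-477d-822b-5adc13b764ae]")
    print(invalid_entries(bad))
```

A non-empty result would be a reason to disable the button rather than hand VolView a whole bucket.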

s-paquette commented 10 months ago

@fedorov I am not getting this consistently--can you give me some specific cases and series which are doing this? It implies some of the entries have a 'blank' series under the CRDC UUID entry.

fedorov commented 10 months ago

Per follow up discussion on slack, this is happening for the studies that have series spread across more than one bucket. Only one collection - NSCLC-Radiomics - is affected. Due to the implementation constraints, we currently cannot efficiently provision URLs for studies that are affected, and VolView link will be disabled for NSCLC-Radiomics.
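The condition described above (a study whose series live in more than one bucket) is straightforward to detect from the series folder URLs. The sketch below is illustrative only; the function names are hypothetical, and the bucket names in the demo are the two that appear in this thread.

```python
from urllib.parse import urlparse


def buckets_for_study(series_folders):
    """Set of bucket names used by the series folders of one study."""
    return {urlparse(folder).netloc for folder in series_folders}


def spans_multiple_buckets(series_folders):
    """True when the study's series are spread across more than one bucket."""
    return len(buckets_for_study(series_folders)) > 1


if __name__ == "__main__":
    study = [
        "s3://idc-open-data/092e5a1f-dd71-44ca-a228-d2d2733ea257",
        "s3://idc-open-data-cr/6e0f8b4e-a116-477d-822b-5adc13b764ae",
    ]
    print(sorted(buckets_for_study(study)), spans_multiple_buckets(study))
```

A check like this would allow disabling the link per study instead of per collection, if the implementation constraints mentioned above are lifted.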

pgundluru commented 9 months ago

NSCLC - Radiomics is greyed out on study level for volview

image

fedorov commented 9 months ago

NSCLC - Radiomics is greyed out on study level for volview

This is expected, since that collection contains files across different buckets, which makes it tricky to generate VolView URL.