Closed FabianHoerst closed 10 months ago
You may be interested in reviewing a similar project from last year: https://projectweek.na-mic.org/PW38_2023_GranCanaria/Projects/IDC_DICOM_WSI_workflow/.
I think @dclunie, @maxfscher, @DanielaSchacherer, and I would all be interested in joining.
I am aware of seven current openly available implementations for the conversion of WSI data into DICOM:
This does not include commercial products.
Also of interest may be:
A key feature in any converter, IMHO, is to be able to losslessly convert (i.e., without decompressing JPEG or JPEG 2000 and recompressing) when possible, e.g., to take SVS tiles and copy them in their compressed form into DICOM frames. Several of the converters listed earlier have that feature.
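To make the lossless idea concrete, here is a minimal sketch of how already-compressed tiles could be wrapped into DICOM encapsulated Pixel Data without touching their bytes. It is not taken from any of the converters mentioned; the function name `encapsulate_tiles` is hypothetical, and it assumes one fragment per frame and a Basic Offset Table with 32-bit offsets.

```python
import struct

ITEM_TAG = b"\xfe\xff\x00\xe0"       # (FFFE,E000) Item, little endian
SEQ_DELIM_TAG = b"\xfe\xff\xdd\xe0"  # (FFFE,E0DD) Sequence Delimitation Item


def _item(payload: bytes) -> bytes:
    """Wrap a payload in an Item, padding to even length as DICOM requires."""
    if len(payload) % 2:
        payload += b"\x00"
    return ITEM_TAG + struct.pack("<I", len(payload)) + payload


def encapsulate_tiles(tiles):
    """Copy compressed tiles verbatim into encapsulated Pixel Data bytes.

    Assumption: each tile is a complete JPEG or JPEG 2000 codestream and is
    stored as a single fragment. No tile is decoded or recompressed.
    """
    offsets = []
    body = b""
    for tile in tiles:
        # Offset of this frame's Item tag, counted from the end of the offset table.
        offsets.append(len(body))
        body += _item(tile)
    # Basic Offset Table: one unsigned 32-bit offset per frame.
    offset_table = _item(struct.pack(f"<{len(offsets)}I", *offsets))
    return offset_table + body + SEQ_DELIM_TAG + struct.pack("<I", 0)
```

Because the offsets are packed as unsigned 32-bit values, this sketch fails with `struct.error` once the frame data passes 4 GiB; very large slides would need a different offset mechanism, such as the Extended Offset Table.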
David’s list might have included this work already, but I thought of this effort, which I have tested, and didn’t see the GitHub link in prior emails. In my test a few months ago, it didn’t encode the headers correctly to work with Slim, but the software effort seems well done. Maybe it can be built upon for the solution to this project. See imi-bigpicture on github.com.
Thank you sincerely for engaging in this discussion and for providing the tools!
I already tested some of the tools a while ago. While all of the tools seem to generate somewhat valid DICOM files, I've encountered challenges in integrating them into different frameworks, such as OHIF, SLIM, QuPath, and OpenSlide. Moreover, some tools have limitations with specific file formats or are not optimized for handling larger files exceeding 2GB. I haven't had the opportunity to thoroughly evaluate all the tools yet. However, I believe it would be highly beneficial to carefully examine the integration issues, identify any missing tags, and consider strengthening the specified requirements.
What are your thoughts on this?
You may be interested in my experience in IDC creating as close to standard as possible DICOM WSI from SVS in a lossless manner with as many mandatory and optional data elements and values populated as possible, using out-of-band metadata sources (e.g., for specimen identification and description); see https://github.com/ImagingDataCommons/idc-wsi-conversion (and also the results of applying my dciodvfy validation tool to those images).
I have used my own PixelMed conversion tool for IDC conversions up until now because I wanted to create dual-personality TIFF files and no other tool used to do this AFAIK, but we had the BioFormats people add this capability more recently.
That tool also handles > 2GB files (e.g., our RMS collection images are over 20GB in some cases). One of those SVS source images would be a good test case for pushing the limits of the other converters (including not only the large size but also the need for lossless conversion of what were raw (never lossy compressed) pixels).
Also, the current QuPath (0.5) does include both OpenSlide and BioFormats DICOM WSI support, since IDC funded both groups to develop those extensions, and if there are any deficiencies in reading DICOM images in either of those libraries, they should be addressed with issues reported and samples demonstrating any problems.
Another consideration is the overall layout of the DICOM WSI in the converted result, e.g., whether or not the images are TILED_FULL and omit the Per-Frame Functional Groups Sequence, and what viewers support in this regard, versus what the DICOM output of various scanner vendors looks like (e.g., from Leica, 3DHISTECH, Hamamatsu, etc.). See also the test images from the ECDP 2023 Connectathon at ftp://medical.nema.org/MEDICAL/Dicom/DataSets/WG26/WG26Connectathon2023_ECDP/.
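To illustrate why a TILED_FULL image can omit the Per-Frame Functional Groups Sequence, here is a small sketch of how a reader could derive each tile's position from the frame number alone. The function name `tile_position` is hypothetical, and the sketch assumes a single focal plane and a single optical path, with frames stored left to right and then top to bottom.

```python
import math


def tile_position(frame_number, total_cols, total_rows, tile_cols, tile_rows):
    """Return the 1-based (column, row) of a tile's top-left pixel.

    frame_number is 1-based. total_cols/total_rows correspond to
    TotalPixelMatrixColumns/TotalPixelMatrixRows, and tile_cols/tile_rows
    to the per-frame Columns/Rows.
    """
    tiles_per_row = math.ceil(total_cols / tile_cols)
    tiles_per_col = math.ceil(total_rows / tile_rows)
    if not 1 <= frame_number <= tiles_per_row * tiles_per_col:
        raise ValueError("frame_number is outside the tiled matrix")
    index = frame_number - 1
    column = (index % tiles_per_row) * tile_cols + 1
    row = (index // tiles_per_row) * tile_rows + 1
    return column, row


# A 1000 x 600 pixel matrix with 256 x 256 tiles gives a 4 x 3 grid of tiles.
print(tile_position(5, 1000, 600, 256, 256))  # first tile of the second tile row
```

A converter that writes TILED_SPARSE instead has to state these positions explicitly for every frame, which is one of the layout differences a viewer may or may not handle.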
To add to what David said, here's the direct link that selects DICOM slide microscopy images in IDC Portal: https://portal.imaging.datacommons.cancer.gov/explore/filters/?Modality_op=OR&Modality=SM. We currently have over 23 TB of DICOM SM, and all of those were created using the workflow in https://github.com/ImagingDataCommons/idc-wsi-conversion. All of the images are available for download without login or any special permissions. Let me know if you need help.
Also, here's the query that selects the top 10 DICOM SM series by size:
SELECT
SeriesInstanceUID,
ANY_VALUE(collection_id) as collection_id,
ROUND(SUM(instance_size)/POW(10,9)) AS size_GB,
any_value(concat('s3://',aws_bucket,'/',crdc_series_uuid)) as aws_url
FROM
`bigquery-public-data.idc_current.dicom_all`
WHERE
Modality = "SM"
GROUP BY
SeriesInstanceUID
ORDER BY
size_GB DESC
LIMIT
10
The largest SM series (>100GB) are those from the HTAN-HMS
collection, which contain multichannel fluorescence images.
The next query is slightly modified to consider only the RMS-Mutation-Prediction
collection, which consists of uncompressed H&E slides (the largest is ~28GB).
SELECT
SeriesInstanceUID,
ANY_VALUE(collection_id) as collection_id,
ROUND(SUM(instance_size)/POW(10,9)) AS size_GB,
any_value(concat('s3://',aws_bucket,'/',crdc_series_uuid)) as aws_url
FROM
`bigquery-public-data.idc_current.dicom_all`
WHERE
Modality = "SM" and collection_id = "rms_mutation_prediction"
GROUP BY
SeriesInstanceUID
ORDER BY
size_GB DESC
LIMIT
10
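Since the two queries differ only in the WHERE clause, they can be generated from one template. The sketch below is only an illustration: the helper name `top_sm_series_query` is hypothetical, and it only builds the SQL text. Running it still requires a BigQuery client, which is not shown here.

```python
import re


def top_sm_series_query(collection_id=None, limit=10):
    """Build the 'largest SM series' query, optionally for one collection."""
    where = 'Modality = "SM"'
    if collection_id is not None:
        # Accept only simple identifiers so the value is safe to embed in SQL.
        if not re.fullmatch(r"[A-Za-z0-9_]+", collection_id):
            raise ValueError(f"unexpected collection_id: {collection_id!r}")
        where += f' AND collection_id = "{collection_id}"'
    return f"""SELECT
  SeriesInstanceUID,
  ANY_VALUE(collection_id) AS collection_id,
  ROUND(SUM(instance_size)/POW(10,9)) AS size_GB,
  ANY_VALUE(CONCAT('s3://', aws_bucket, '/', crdc_series_uuid)) AS aws_url
FROM
  `bigquery-public-data.idc_current.dicom_all`
WHERE
  {where}
GROUP BY
  SeriesInstanceUID
ORDER BY
  size_GB DESC
LIMIT
  {int(limit)}"""


print(top_sm_series_query("rms_mutation_prediction"))
```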
This tutorial series shows how to get started with using BigQuery to search IDC data as in the queries above, how to download images, and how to perform other common operations: https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/getting_started.
Happy to help if anything is unclear!
Great to see the discussion on this. Fabian, I hope you can join in person or online at Project Week. Be sure to sign up at https://projectweek.na-mic.org/PW40_2024_GranCanaria/
Thanks all for your feedback! I will create a project page in the next few days and take your discussion into account. I am still open to feedback to outline the project ASAP.
Project Description
Problem: Despite various existing solutions for converting WSI data into DICOM, there is a distinct lack of vendor-agnostic conversion tools that produce broadly compatible DICOM files. Current solutions fall short of generating DICOM files compatible with OpenSlide (4.0.0), the OHIF and SLIM viewers, and a PACS, impeding seamless integration and compromising overall performance.
Objectives: This project aims to develop an open-source, community-maintained software solution addressing the vendor-agnostic conversion of WSI data into DICOM format. The tool must adhere to established software design patterns, ensuring ease of contribution from the community.
Idea: Our project aims to develop a vendor-agnostic WSI to DICOM conversion tool based on existing solutions. We plan to evaluate existing solutions comprehensively and build a test suite covering PACS (Orthanc), viewers (OHIF/SLIM), and Python integration (OpenSlide). The resulting DICOM-WSI should integrate with the OHIF viewer, offering a unified platform for pathology and radiology. Additionally, support for the SLIM viewer is necessary, as it supports adding annotations and visualizing analytics results (e.g., heatmaps).