Support of 3D image split in multiple DICOM files

korbinian90 commented 5 years ago

Our application is DICOM MRI data from Siemens Scanners. The DICOM files contain 1 slice each, but we have 3D,4D,5D data sets. So a data set is split in multiple files.

It would be great to read in all DICOM files from one folder and merge them to a higher dimensional image. (This can be possible saved afterwards in Nifti format, which is receiving an integration to Images.jl)

The DICOM headers are all nearly identical, only slicenumber, echonumber, timepoint etc. changes. This information could be stored with ImageMetadata.jl. (compatible with other medical standards)

Tokazama commented 5 years ago

If someone is familiar with the header/metadata for DICOM files I could help with this. I've been working on something for the NIfTI format where you only need to get the axes (preferably in a tuple of Axis from AxisArrays), element type, and a place in the IOStream to format to to an image. It should support reading to various containers (StaticArrays,Arrays,etc) or element types (bit types or probably common structs like RGB). It also should support endian mapping.

ihnorton commented 5 years ago

You are likely aware, but just in case: I highly recommend dcm2niix for conversions to nifti (and now nrrd too). There’s an enormous amount of variability and heuristics required to do slice reconstruction reliably across manufacturers and scanner versions. dcm2niix has quite a lot of that, and is very easy to build (no complex dependency chains).

Thankfully, @neurolabusc has managed to get people to share a lot of public test sets, especially recently, which we can also test against. And of course the code is open. So in the long-term this is a nice idea — but be aware that even the huge python imaging community doesn’t have a native tool approaching dcm2niix’s coverage at this point.

Tokazama commented 5 years ago

I completely agree with you @ihnorton. A thorough knowledge of differences between scanners is key for developing this package into any sort of competitor for handling DICOMs. dcm2niix also has some really helpful features for speeding things up using pigz. I haven't found a fully optimized solution to compression in Julia yet (however, it looks like things are in place for this to happen soon if the right people act).

In my experience most neuroimaging software (preprocessing or statistical analysis) work with NIfTI files now. So it's typically an unpleasant trudge through organizing your image formats (preferably from the same type of scanner) and then you just use NIfTIs. So unless you are very knowledgeable about Siemens scanners use dcm2niix.

All that being said, much of neuroimaging is quite doable in Julia if you know what you're doing.

korbinian90 commented 5 years ago

Thank you for the helpful and detailed explanation, @Tokazama and @ihnorton. I didn't know about dcm2niix and the complexity involved in parsing DICOMs from different vendors.

neurolabusc commented 5 years ago

For anyone attempting to develop a robust DICOM conversion:

As @ihnorton noted, many individuals have allowed me to publicly share their datasets. I provided them on this web page.. The datasets are provided in context throughout the page, but if you search for the term mb) you can find them.
- Likewise, the repositories dcm_qa(Siemens), dcm_qa_nih(Siemens/GE) and dcm_qa_uih(UIH) provide both DICOM images and a suggested NIfTI/BIDS solution. These are used to test each commit of dcm2niix.
- There seem to be very few public examples of images from a few of the major vendors. If anyone has access to hardware from Toshiba, Hitachi, etc. and can contribute public domain samples it would be a great help. (I also know of know public examples of Philips enhanced DWI direct from the scanner, but @BennettLandman may be able to fill this gap).
- With regards to @ihnorton's comments regarding Python, I have been very impressed by dicom2nifti. In my experience, it is pretty solid unless you are dealing with (very rare) compressed transfer syntaxes (where dicom2nifti uses the excellent gdcm tools that are tough to compile) or some very rare corner cases. In my opinion, dcm2niix currently has a bit of a first mover advantage in that it became popular first and therefore benefitted from a lot of community feedback. If you are a Python advocate, simply using dicom2nifti and reporting or patching the corner cases could erase that gap.
- I am not familiar with Julia, but a keen Julia developer may want to look at Jon Clayden's divest for the R language. He has inserted a few #ifdef USING_R conditionals into the dcm2niix C code. This allows R users to have the performance benefit of C code, while having the benefits of a high language language for using the data. Since his conditionals are inserted into the core code, divest benefits as dcm2niix evolves for new variations of DICOM. Something similar could be done in Julia. I do recognize there are benefits from having image conversion natively in a higher level language (e.g. dicm2nii for Matlab) that is easy to script. However, I though I would mention this as an option.

notZaki commented 5 years ago

If the spatial resolution and slice positions are consistent across the data, then loading it directly from DICOM shouldn't be too much of a hassle. (Assuming DICOM.jl can read it in the first place)

The function below might be useful. It's fairly crude since it relies on using the Instance Number to re-arrange the data:

function dcmLoadAllDir(dcmDir, wantedNames::Array{String,1})
    wantedInfo = []
    allPixelData = Float64[]
    # Instance number is obligatory because it's often needed to re-order images
    if !("Instance Number" in wantedNames)
        push!(wantedNames, "Instance Number")
    end
    for (i, dcmFile) in enumerate(readdir(dcmDir))
        dcmData = DICOM.dcm_parse(joinpath(dcmDir,dcmFile))
        pixelData = float(dcmData[(0x7FE0, 0x0010)])
        infoData = [DICOM.lookup(dcmData, x) for x in wantedNames]
        # There's probably a better way of initializing than this if-statement
        if i==1
            wantedInfo = infoData
            allPixelData = pixelData
        else
            wantedInfo = hcat(wantedInfo, infoData)
            allPixelData = cat(allPixelData, pixelData, dims=3)
        end
    end
    # Store the information in a DataFrame for convenience
    infoDF = DataFrame(permutedims(wantedInfo, [2, 1]))
    names!(infoDF, [Symbol("$x") for x in wantedNames])
    # Some values are interpreted as unsigned ints - convert for convenience
    problematicColumns = ["Rows", "Columns"]
    for column in problematicColumns
        if (column in wantedNames)
            infoDF[Symbol(column)] = [convert(Int,x) for x in infoDF[Symbol(column)]]
        end
    end
    # Rearrange by instance number
    allPixelData[:,:,infoDF[Symbol("Instance Number")]] = allPixelData
    sort!(infoDF, Symbol("Instance Number"))
    return (allPixelData, infoDF)
end

This should load each dicom file as a new slice, so allPixelData will be a 3D array with dimensions Row-Column-NumberOfDicomFiles which has to be reshaped into an appropriate 4D or 5D array. This step will depend on how the data was acquired.

Here's an example script for variable flip angle data which is reshaped to a 4D array representing Row-Column-Slice-FlipAngle:

using DICOM, DataFrames
dcmDir = "path/to/dicom/folder"
wantedNames =   [
                "Patient ID",
                "Study Date",
                "Series Description",
                "Rows",
                "Columns",
                "Flip Angle",
                "Repetition Time",
                "Echo Time",
                ]

(allPixelData, infoDF) = dcmLoadAllDir(dcmDir, wantedNames)
(numRows, numColumns, numSlices_and_Flips) = size(allPixelData)
numFlips = length(unique(infoDF[Symbol("Flip Angle")]))
numSlices = Int(numSlices_and_Flips / numFlips)
allPixelData = reshape(allPixelData, (numRows, numColumns, numSlices, numFlips))

The 4D array can be visualized with ImageView.jl. In this case, the left slider represents slices and right slider represents flip angles. imshowvfa

The infoDF dataframe contains information extracted from the header---specified by wantedNames---which can be used for future processing steps.

julia> infoDF
96×9 DataFrame
│ Row │ Patient ID   │ Study Date │ Series Description    │ Rows  │ Columns │ Flip Angle │ Repetition Time │ Echo Time │ Instance Number │
│     │ Any          │ Any        │ Any                   │ Int64 │ Int64   │ Any        │ Any             │ Any       │ Any             │
├─────┼──────────────┼────────────┼───────────────────────┼───────┼─────────┼────────────┼─────────────────┼───────────┼─────────────────┤
│ 1   │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 15.0       │ 5.723           │ 0.716     │ 1               │
│ 2   │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 15.0       │ 5.723           │ 0.716     │ 2               │
│ 3   │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 15.0       │ 5.723           │ 0.716     │ 3               │
⋮
│ 93  │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 10.0       │ 5.778           │ 0.7       │ 93              │
│ 94  │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 10.0       │ 5.778           │ 0.7       │ 94              │
│ 95  │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 10.0       │ 5.778           │ 0.7       │ 95              │
│ 96  │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 10.0       │ 5.778           │ 0.7       │ 96              │

JuliaHealth / DICOM.jl

Support of 3D image split in multiple DICOM files #29