JuliaHealth / DICOM.jl

Julia package for reading and writing DICOM (Digital Imaging and Communications in Medicine) files
MIT License
56 stars 21 forks source link

Support of 3D image split in multiple DICOM files #29

Open korbinian90 opened 5 years ago

korbinian90 commented 5 years ago

Our application is DICOM MRI data from Siemens Scanners. The DICOM files contain 1 slice each, but we have 3D,4D,5D data sets. So a data set is split in multiple files.

It would be great to read in all DICOM files from one folder and merge them to a higher dimensional image. (This can be possible saved afterwards in Nifti format, which is receiving an integration to Images.jl)

The DICOM headers are all nearly identical, only slicenumber, echonumber, timepoint etc. changes. This information could be stored with ImageMetadata.jl. (compatible with other medical standards)

Tokazama commented 5 years ago

If someone is familiar with the header/metadata for DICOM files I could help with this. I've been working on something for the NIfTI format where you only need to get the axes (preferably in a tuple of Axis from AxisArrays), element type, and a place in the IOStream to format to to an image. It should support reading to various containers (StaticArrays,Arrays,etc) or element types (bit types or probably common structs like RGB). It also should support endian mapping.

ihnorton commented 5 years ago

You are likely aware, but just in case: I highly recommend dcm2niix for conversions to nifti (and now nrrd too). There’s an enormous amount of variability and heuristics required to do slice reconstruction reliably across manufacturers and scanner versions. dcm2niix has quite a lot of that, and is very easy to build (no complex dependency chains).

Thankfully, @neurolabusc has managed to get people to share a lot of public test sets, especially recently, which we can also test against. And of course the code is open. So in the long-term this is a nice idea — but be aware that even the huge python imaging community doesn’t have a native tool approaching dcm2niix’s coverage at this point.

Tokazama commented 5 years ago

I completely agree with you @ihnorton. A thorough knowledge of differences between scanners is key for developing this package into any sort of competitor for handling DICOMs. dcm2niix also has some really helpful features for speeding things up using pigz. I haven't found a fully optimized solution to compression in Julia yet (however, it looks like things are in place for this to happen soon if the right people act).

In my experience most neuroimaging software (preprocessing or statistical analysis) work with NIfTI files now. So it's typically an unpleasant trudge through organizing your image formats (preferably from the same type of scanner) and then you just use NIfTIs. So unless you are very knowledgeable about Siemens scanners use dcm2niix.

All that being said, much of neuroimaging is quite doable in Julia if you know what you're doing.

korbinian90 commented 5 years ago

Thank you for the helpful and detailed explanation, @Tokazama and @ihnorton. I didn't know about dcm2niix and the complexity involved in parsing DICOMs from different vendors.

neurolabusc commented 5 years ago

For anyone attempting to develop a robust DICOM conversion:

notZaki commented 5 years ago

If the spatial resolution and slice positions are consistent across the data, then loading it directly from DICOM shouldn't be too much of a hassle. (Assuming DICOM.jl can read it in the first place)

The function below might be useful. It's fairly crude since it relies on using the Instance Number to re-arrange the data:

function dcmLoadAllDir(dcmDir, wantedNames::Array{String,1})
    wantedInfo = []
    allPixelData = Float64[]
    # Instance number is obligatory because it's often needed to re-order images
    if !("Instance Number" in wantedNames)
        push!(wantedNames, "Instance Number")
    end
    for (i, dcmFile) in enumerate(readdir(dcmDir))
        dcmData = DICOM.dcm_parse(joinpath(dcmDir,dcmFile))
        pixelData = float(dcmData[(0x7FE0, 0x0010)])
        infoData = [DICOM.lookup(dcmData, x) for x in wantedNames]
        # There's probably a better way of initializing than this if-statement
        if i==1
            wantedInfo = infoData
            allPixelData = pixelData
        else
            wantedInfo = hcat(wantedInfo, infoData)
            allPixelData = cat(allPixelData, pixelData, dims=3)
        end
    end
    # Store the information in a DataFrame for convenience
    infoDF = DataFrame(permutedims(wantedInfo, [2, 1]))
    names!(infoDF, [Symbol("$x") for x in wantedNames])
    # Some values are interpreted as unsigned ints - convert for convenience
    problematicColumns = ["Rows", "Columns"]
    for column in problematicColumns
        if (column in wantedNames)
            infoDF[Symbol(column)] = [convert(Int,x) for x in infoDF[Symbol(column)]]
        end
    end
    # Rearrange by instance number
    allPixelData[:,:,infoDF[Symbol("Instance Number")]] = allPixelData
    sort!(infoDF, Symbol("Instance Number"))
    return (allPixelData, infoDF)
end

This should load each dicom file as a new slice, so allPixelData will be a 3D array with dimensions Row-Column-NumberOfDicomFiles which has to be reshaped into an appropriate 4D or 5D array. This step will depend on how the data was acquired.

Here's an example script for variable flip angle data which is reshaped to a 4D array representing Row-Column-Slice-FlipAngle:

using DICOM, DataFrames
dcmDir = "path/to/dicom/folder"
wantedNames =   [
                "Patient ID",
                "Study Date",
                "Series Description",
                "Rows",
                "Columns",
                "Flip Angle",
                "Repetition Time",
                "Echo Time",
                ]

(allPixelData, infoDF) = dcmLoadAllDir(dcmDir, wantedNames)
(numRows, numColumns, numSlices_and_Flips) = size(allPixelData)
numFlips = length(unique(infoDF[Symbol("Flip Angle")]))
numSlices = Int(numSlices_and_Flips / numFlips)
allPixelData = reshape(allPixelData, (numRows, numColumns, numSlices, numFlips))

The 4D array can be visualized with ImageView.jl. In this case, the left slider represents slices and right slider represents flip angles. imshowvfa

The infoDF dataframe contains information extracted from the header---specified by wantedNames---which can be used for future processing steps.

julia> infoDF
96×9 DataFrame
│ Row │ Patient ID   │ Study Date │ Series Description    │ Rows  │ Columns │ Flip Angle │ Repetition Time │ Echo Time │ Instance Number │
│     │ Any          │ Any        │ Any                   │ Int64 │ Int64   │ Any        │ Any             │ Any       │ Any             │
├─────┼──────────────┼────────────┼───────────────────────┼───────┼─────────┼────────────┼─────────────────┼───────────┼─────────────────┤
│ 1   │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 15.0       │ 5.723           │ 0.716     │ 1               │
│ 2   │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 15.0       │ 5.723           │ 0.716     │ 2               │
│ 3   │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 15.0       │ 5.723           │ 0.716     │ 3               │
⋮
│ 93  │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 10.0       │ 5.778           │ 0.7       │ 93              │
│ 94  │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 10.0       │ 5.778           │ 0.7       │ 94              │
│ 95  │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 10.0       │ 5.778           │ 0.7       │ 95              │
│ 96  │ TCGA-06-0185 │ 20071116   │ 3D DCE T1 MAPPING X 5 │ 256   │ 256     │ 10.0       │ 5.778           │ 0.7       │ 96              │