JuliaHealth / DICOM.jl

Julia package for reading and writing DICOM (Digital Imaging and Communications in Medicine) files
MIT License

Old DICOM file loads using `dcm_parse(file; preamble=false)`, but the same file returns `false` when running the `isdicom` function #84

Open Dale-Black opened 2 years ago

Dale-Black commented 2 years ago

I am trying to load a directory of DICOM files using `dcmdir_parse`. Each file in the directory loads when running `dcm_parse(file; preamble=false)`, but loading the directory returns an empty array. While trying to determine the cause, I found that `isdicom(file)` returns `false` even though `dcm_parse(file; preamble=false)` works. Any idea whether this is an issue in the package or whether I am missing something on my end?


notZaki commented 2 years ago

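That is kind of the expected, albeit annoying, behaviour of `isdicom()`: it uses the preamble + prefix to identify DICOM files, because that's what the DICOM file format specifies [ref]. A file written without the preamble therefore fails the check even though its contents parse fine.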

On the package side, we could update `isdicom()` so that it also tries to read the first couple of bytes and, if they look like valid DICOM, assumes that the rest of the file is DICOM. Either that, or skip `isdicom()` entirely when `preamble=false` is used.
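To make the first option concrete, here is a minimal sketch of what such a lenient check could look like. It is not DICOM.jl's code: the name `isdicom_lenient` is hypothetical, and the fallback rule (treating a file as DICOM when its first two bytes decode, little-endian, to group `0x0002` or `0x0008`) is only an assumed heuristic that can produce false positives.

```julia
# Hypothetical helper, not part of DICOM.jl: a lenient variant of a
# preamble + prefix check.
function isdicom_lenient(path::AbstractString)
    isfile(path) || return false
    bytes = open(io -> read(io, 132), path)   # reads at most 132 bytes

    # Standard layout: 128-byte preamble followed by the 4-byte "DICM" prefix.
    if length(bytes) == 132 && String(bytes[129:132]) == "DICM"
        return true
    end

    # Assumed heuristic for files without a preamble: the file starts directly
    # with a data element, so the first two bytes are a group number.
    # Only little-endian files and groups 0x0002 / 0x0008 are accepted here.
    length(bytes) >= 8 || return false
    group = UInt16(bytes[1]) | (UInt16(bytes[2]) << 8)
    return group in (0x0002, 0x0008)
end
```

A check like this only looks at the header, so a file that passes can still fail later in `dcm_parse`.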

Dale-Black commented 2 years ago

What about modifying `find_dicom_files` so that it doesn't need `isdicom()` at all? This is a super hacky example, but it seems to work, at least for my use case:

function find_dicom_files(dir; kwargs...)
    files = joinpath.(dir, readdir(dir))
    dicom_files = String[]
    for file in files
        try
            # Keep the path only if dcm_parse succeeds with the given kwargs
            dcm_parse(file; kwargs...)
            push!(dicom_files, file)
        catch
            # Could not be parsed as DICOM (or is a directory): skip it
        end
    end
    return dicom_files
end

Dale-Black commented 2 years ago

This would work for `dcmdir_parse`, at least in my use case.


notZaki commented 2 years ago

Since that function is already calling `dcm_parse`, you could make the following change to return the parsed data (maybe change the function name to reflect this):

function find_dicom_files(dir; kwargs...)
    files = joinpath.(dir, readdir(dir))
-   dicom_files = []
+   dcms = [] 
    for file in files
        try
            dcm = dcm_parse(file; kwargs...)
-           push!(dicom_files, file)
+           push!(dcms, dcm)
        catch
            nothing
        end
    end
-   return dicom_files
+   return dcms
end

Or, if everything in the folder is a DICOM file, the following could work too:

dir = "PATH/TO/DIR"
dcms = [DICOM.dcm_parse(file; preamble=false) for file in readdir(dir; join=true)]

The try/catch is a good practical solution. However, I think it would be better for the package if `isdicom()` could recognize files with missing preambles, rather than changing `find_dicom_files` directly.
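As a rough illustration of that separation, the directory scan can stay a plain filter while all of the detection logic lives in the predicate it is given. Both names below, `find_files_matching` and `has_dicm_prefix`, are hypothetical stand-ins and not DICOM.jl functions; the predicate shown only covers the standard preamble + prefix case.

```julia
# Hypothetical stand-in for a directory scan (not DICOM.jl's code):
# returns the full paths of regular files in `dir` that `predicate` accepts.
function find_files_matching(predicate, dir::AbstractString)
    paths = readdir(dir; join=true)   # full paths, sorted by name
    return filter(p -> isfile(p) && predicate(p), paths)
end

# Example predicate: the standard "DICM" prefix after a 128-byte preamble.
function has_dicm_prefix(path::AbstractString)
    bytes = open(io -> read(io, 132), path)   # reads at most 132 bytes
    return length(bytes) == 132 && String(bytes[129:132]) == "DICM"
end
```

With this shape, `find_files_matching(has_dicm_prefix, "PATH/TO/DIR")` keeps the strict behaviour, and supporting files without a preamble would only mean passing a more permissive predicate.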