Open strixy16 opened 1 year ago
Hello, I had a similar problem on a dataset where no files were found. Upon further inspection, it seems that the condition for the recursive search with glob (in src/imgtools/utils/crawl.py, l.17) is too strict. Indeed, it only looks for files ending in ".dcm", which is not always the case for DICOM files :)
I simply changed the condition to "*" to include all files. This allowed the tool to find my patients, and it is actually what is present in the article's branch, F1000Research.
Hope this helps!
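To illustrate the difference between the two patterns, here is a minimal sketch using only the standard library. It is not the code from crawl.py; the helper `find_files` and the file names are hypothetical, chosen to mirror the suffix variations described in this thread.

```python
import tempfile
from pathlib import Path


def find_files(root, pattern):
    """Recursively collect regular files under root whose name matches pattern."""
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())


# Build a throwaway directory tree with three hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    series = Path(tmp) / "patient1" / "CT"
    series.mkdir(parents=True)
    for name in ("slice1.dcm", "slice2.DCM", "slice3"):
        (series / name).touch()

    # Strict pattern: the suffix-less file is never matched. Whether the
    # upper-case ".DCM" file matches depends on the platform's case rules.
    strict = [p.name for p in find_files(tmp, "*.dcm")]

    # Loose pattern: every regular file is returned, whatever its name.
    loose = [p.name for p in find_files(tmp, "*")]

print("*.dcm ->", strict)
print("*     ->", loose)
```

Note that "*" also returns any non-DICOM files in the tree, so the caller still has to decide what to do with them.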
Overall, this strict matching of only "*.dcm" is a problem in multiple places in the code. For example, further down the line I had the same issue with the conversion of RT Structure Set files.
Sometimes the files will end in .dcm, other times in .DCM, and other times they have no suffix at all!
I think it would be necessary to check that the files are DICOM in another way, to make this tool agnostic to the filename suffix :)
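One suffix-agnostic option is to look at the file content instead of the name. A DICOM Part 10 file starts with a 128-byte preamble followed by the four bytes "DICM", so a check can be sketched as below. This is only an illustration under that assumption, not how the tool currently works: `is_dicom` and `find_dicom_files` are hypothetical names, and older DICOM files written without the preamble would be missed.

```python
import os
import tempfile
from pathlib import Path

DICOM_PREAMBLE_LENGTH = 128  # bytes skipped before the magic marker
DICOM_MAGIC = b"DICM"


def is_dicom(path):
    """Return True if the file carries the DICOM Part 10 'DICM' marker."""
    try:
        with open(path, "rb") as f:
            f.seek(DICOM_PREAMBLE_LENGTH)
            return f.read(len(DICOM_MAGIC)) == DICOM_MAGIC
    except OSError:
        # Unreadable files are treated as non-DICOM.
        return False


def find_dicom_files(root):
    """Walk root recursively and keep files that pass the content check."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            if is_dicom(full):
                found.append(full)
    return sorted(found)


# Demonstration with fake headers; these are not valid DICOM datasets,
# they only contain the preamble and the marker.
with tempfile.TemporaryDirectory() as tmp:
    header = b"\x00" * DICOM_PREAMBLE_LENGTH + DICOM_MAGIC
    for name in ("a.dcm", "b.DCM", "c"):
        (Path(tmp) / name).write_bytes(header)
    (Path(tmp) / "notes.txt").write_bytes(b"not a DICOM file")

    names = [p.name for p in find_dicom_files(tmp)]

print(names)
```

The check reads only a few bytes per file, so it stays cheap on large directory trees. pydicom ships a comparable helper, `pydicom.misc.is_dicom`, which may be a better fit if the project already depends on pydicom.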
Running
autopipeline /Users/katyscott/Documents/SARC021/images/ /Users/katyscott/Documents/SARC021/med-imageout/ --n_jobs 1 --update --overwrite
doesn't find all of the CT and RTSTRUCT files in the images directory. My images directory contains four directories in total, two for each sample. Each sample directory contains subdirectories holding the CTs and RTSTRUCTs as DICOMs. There are three different CT scans for each sample, and RTSTRUCTs are associated with most of them.
The output of the crawl only finds one of the three sets of CT and RTSTRUCT combinations for the first sample and two of the three CTs and one RTSTRUCT set for the second sample.
When I call the crawl_one function on its own, it appears to find all of the files. So somewhere between this and the output, the files are getting lost.