dgobbi / vtk-dicom

A set of classes for using DICOM in VTK.
BSD 3-Clause "New" or "Revised" License

Terminating early when parsing with a query #176

Open dgobbi opened 5 years ago

dgobbi commented 5 years ago

When a query is given to the DICOM parser, it scans the file and skips any data elements that are not in the query. This is a valuable optimization. However, even skipping an element requires I/O and CPU resources, since the element header must be read to get the length. Further optimization could be achieved if parsing terminated completely as soon as all elements in the query were checked. The only thing currently blocking this optimization is the GetPixelDataFound() method of the parser. In order to do this check, the parser must continue until it reaches the 0x7FE0 group.
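To make the trade-off concrete, here is a minimal sketch in Python. It is not vtk-dicom code: the 8-byte header layout (group, element, length) is a simplified toy format rather than the real DICOM encoding, and `encode`, `scan`, and `stop_early` are hypothetical names used only for illustration. It relies on the fact that data elements in a data set are stored in ascending tag order, so once the scan passes the highest tag in the query, every query element has been checked.

```python
import io
import struct

# Toy element header: group (uint16), element (uint16), value length (uint32).
# Real DICOM headers also carry a VR and have several length encodings.
HEADER = struct.Struct("<HHI")


def encode(elements):
    """Serialize [((group, element), value_bytes), ...] in ascending tag order."""
    out = bytearray()
    for (group, element), value in sorted(elements):
        out += HEADER.pack(group, element, len(value)) + value
    return bytes(out)


def scan(data, query, stop_early):
    """Return (found, headers_read, pixel_data_found) for a set of query tags."""
    stream = io.BytesIO(data)
    last_wanted = max(query)  # highest tag in the query
    found = {}
    headers_read = 0
    pixel_data_found = False
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of data
        group, element, length = HEADER.unpack(header)
        headers_read += 1
        tag = (group, element)
        if stop_early and tag > last_wanted:
            break  # every query tag has been passed; nothing left to check
        if group == 0x7FE0:
            pixel_data_found = True
        if tag in query:
            found[tag] = stream.read(length)
        else:
            stream.seek(length, io.SEEK_CUR)  # skip the value, header already read
    return found, headers_read, pixel_data_found


data = encode([
    ((0x0008, 0x0020), b"20200101"),   # StudyDate
    ((0x0010, 0x0010), b"DOE^JOHN"),   # PatientName
    ((0x0020, 0x000D), b"1.2.3"),      # StudyInstanceUID
    ((0x0028, 0x0010), b"\x00\x02"),   # Rows
    ((0x7FE0, 0x0010), b"\x00" * 16),  # PixelData
])
query = {(0x0008, 0x0020), (0x0010, 0x0010)}

full = scan(data, query, stop_early=False)
early = scan(data, query, stop_early=True)
```

In this toy run both scans return the same query values, but the early scan reads fewer element headers and never reaches the 0x7FE0 group, so its pixel-data flag stays false. That is the conflict with GetPixelDataFound() described above.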

In most circumstances, this optimization would only speed things up by a few percent at most. But it could be valuable for files with particularly large metadata. Again, note that the ability to check for pixel data is lost if the parser terminates before it gets there.

Another optimization, particularly useful for the 'dicomfind' tool, would be to terminate the parser as soon as a query item does not match the data set. This can typically speed things up by 50%. However, it has caveats: not only does it invalidate the GetPixelDataFound() check, it also interferes with our current DICOM sorter, which expects to get sorting keys such as StudyDate even for data sets where the query doesn't match. So the sorter would have to be aware that this optimization was being used, so that files are rejected before the sort occurs. This is only possible when sorting at the 'image' level; it doesn't work when sorting at the 'series' or 'study' level.
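The sorter caveat can be shown with another small Python sketch, again using a simplified toy element layout rather than real DICOM encoding or any vtk-dicom API. The names `encode`, `scan`, and `stop_on_mismatch` are hypothetical, and the convention that a query value of `None` means "collect this element without testing it" is an assumption made for the example.

```python
import io
import struct

# Toy element header: group (uint16), element (uint16), value length (uint32).
HEADER = struct.Struct("<HHI")

SOP_CLASS_UID = (0x0008, 0x0016)
STUDY_DATE = (0x0008, 0x0020)


def encode(elements):
    """Serialize {(group, element): value_bytes} in ascending tag order."""
    out = bytearray()
    for (group, element), value in sorted(elements.items()):
        out += HEADER.pack(group, element, len(value)) + value
    return bytes(out)


def scan(data, query, stop_on_mismatch):
    """query maps tag -> required bytes, or None to collect without testing.

    Returns (matched, collected, headers_read).
    """
    stream = io.BytesIO(data)
    collected = {}
    matched = True
    headers_read = 0
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of data
        group, element, length = HEADER.unpack(header)
        headers_read += 1
        tag = (group, element)
        if tag not in query:
            stream.seek(length, io.SEEK_CUR)  # skip elements outside the query
            continue
        value = stream.read(length)
        collected[tag] = value
        required = query[tag]
        if required is not None and value != required:
            matched = False
            if stop_on_mismatch:
                break  # reject the file now; later elements are never read
    if matched:
        # A required element that never appeared also counts as a mismatch.
        matched = all(t in collected for t, req in query.items() if req is not None)
    return matched, collected, headers_read


data = encode({
    SOP_CLASS_UID: b"1.2.840.10008.5.1.4.1.1.2",  # CT Image Storage
    STUDY_DATE: b"20200101",
    (0x0010, 0x0010): b"DOE^JOHN",                # PatientName
    (0x7FE0, 0x0010): b"\x00" * 16,               # PixelData
})
# Require MR Image Storage, and collect StudyDate as a sorting key.
query = {SOP_CLASS_UID: b"1.2.840.10008.5.1.4.1.1.4", STUDY_DATE: None}

full = scan(data, query, stop_on_mismatch=False)
early = scan(data, query, stop_on_mismatch=True)
```

Both scans report a mismatch, but only the full scan still returns StudyDate. With early termination the key is simply absent, so any sorter consuming these results would need to discard the file before trying to sort it.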

I'm not sure if I will pursue this optimization.