Seg Array Optimization - Githubissues

CPBridge commented 1 year ago

A few improvements focused on improving the efficiency of building segmentation arrays from loaded Segmentation objects.

This was achieved through:

Re-ordering loops in the construction of the seg_frames_matrix to avoid searching through the large _frames_lut more than necessary.
Creating a specialized case for the combine_segments option, rather than implementing it as a postprocessing of the standard routine. This reduces memory consumption significantly as it is now O(1) rather than O(n) in the number of segments requested.
Being more careful with dtypes, and allowing the user to specify a dtype
Adding an option to skip checks for overlapping segments (skip_overlap_checks), which results in a further modest speed-up.

@pieper you should now find that your example from #202 runs significantly faster and uses much less memory than before when combine_segments=True (especially in the case where the full set of segments is requested).

I went a little way down the path of JITing the methods with numba, but it actually slowed things down. There may be a bit more performance improvement possibly using threading, but we are hitting on some numpy limitations

pieper commented 1 year ago

Thanks @CPBridge ! Let me know if you need me to test before you merge or i can just wait until @hackermd has had a look through it.

CPBridge commented 1 year ago

Thanks Steve. If you do have time to test on your hardware at some point during the process that would be very helpful. I think before or after @hackermd has a look would be fine, I don't expect the core numerical process to change much

CPBridge commented 1 year ago

A couple of other points to note:

It will still be slow to load the file initially, that will require a fix in pydicom.
Please try with and without the skip_overlap_checks option and see how this affects things

hackermd commented 1 year ago

Thanks @CPBridge. I will take a look.

It will still be slow to load the file initially, that will require a fix in pydicom

See https://github.com/pydicom/pydicom/pull/1734

CPBridge commented 1 year ago

Ok @hackermd @pieper I completely refactored the way the lookup table is stored and queried and the result is much more efficient and in my opinion much less confusing than current master and even the previously reviewed version on this branch (which was itself much more efficient than master).

Previously I had implemented a lookup table as a numpy array and then wrote some frankly quite hacky code to "query" the lookup table to find the information I needed. I realised at the time that I was essentially implementing a relational database in numpy and that this was not a good idea, but I didn't want to introduce a new dependency...

...but at some point since then I became aware that a basic sqlite client is in the Python standard library and that this could deal with in-memory databases. I have refactored the entire lookup operation to use the standard library sqlite to construct and store the lookup table. Using sql syntax is much more ergonomic and in certain situations much more efficient than my nasty numpy implementation. In particular small queries are now very very fast. And all with no new dependencies. I'm just sad I didn't think of this before.

We should also consider this sort of approach elsewhere, e.g. storing a LUT of elements of an SR.

Putting this in the context of reducing overall seg parsing time, this has dramatically improved the efficiency of the "pixel reconstruction step". I'm now getting times of < 1s on my macbook for the full set of frames and segments on @pieper 's segmentation from #202 if I use skip_overlap_checks. (If someone can think of a way to make those checks more efficient I would love some ideas...). I believe there are still some gains to be had in the _build_luts method in the Segmentation.from_dataset() method (and segread) to extract the information to place into the SQL database from the PerFramesFunctionalGroupSequence (actually entering the information into the SQL database is very fast). But this PR is getting big and I think that is a separate thread that we can leave for the future.

CPBridge commented 1 year ago

@hackermd thanks for your comments, I have made some updates.

I also snuck in another change in 8e95876 to change behaviour that annoyed me recently.

The code to extract a segmentation mask for a list of frames is very conservative: if you give it a frame number that is not referenced by the segmentation it will by default error out. This makes sense when thinking about SOPInstanceUIDs of single frame referenced images. But in the multiframe referenced image case, I think it reasonable to assume that if a requested frame number is not referenced in the segmentation image, but a frame with a frame number higher than the requested frame is referenced, then the requested frame exists in the referenced image and the corresponding 2D mask should be returned as zeros. So 8e95876 switches to this behaviour.

CPBridge commented 1 year ago

@hackermd I have refactored out all the database logic and I think this is ready to go

ImagingDataCommons / highdicom

Seg Array Optimization #208