imglib / imglib2-roi

Regions of interest (ROIs) and labelings for ImgLib2
Other
8 stars 8 forks source link

Improve memory usage and performance of LabelingMapping. #58

Closed maarzt closed 3 years ago

maarzt commented 3 years ago

This PR contains many ideas to improve the performance of LabelingMapping.

Changes by @tpietzsch

Changes by @maarzt

maarzt commented 3 years ago

@tpietzsch There is much room for improvement still:

  1. The importened job that InternedSet currently does is mapping a list of ints to Set. This functionallity could by moved into LabelingType, we wouldn't need the InternedSet objects anymore. The Map<SortedInts, InternedSet> internedSets could be changed to TObjectIntMap setToIndex than.
  2. SortedInts could be replaced in a mastodon-collection way. The difficult part would be to implement something functionally equivalent to HashMap<SortedInts, Integer>.

Both changes combined would remove all jave object header overhead in LabelMapping.

tpietzsch commented 3 years ago

@maarzt Nice!

As I suspected, making the object/id bimap thread-safe is a problem. The synchronization you put in is not sufficient. In particular, it is not safe to call labelToId.get(object) while a concurrent labelToId.put(object, id) is happening. It is not guaranteed that the get() will see the state either before or after the put(). It might see something else (looking at the source code of TObjectIntHashMap).

I think it might be ok to just synchronize every access to the bimap. I'll go over it again.

tpietzsch commented 3 years ago

I revised it a bit, and synchronizing every access to the bimap doesn't seem to make a difference. Neither for LabelingBenchmark nor for the use case in metaseg.