tee-ar-ex / trx-python

Python implementation of the TRX file format
https://tee-ar-ex.github.io/trx-python/
BSD 2-Clause "Simplified" License

strange behavior for `intersection` #77

Open skoudoro opened 8 months ago

skoudoro commented 8 months ago

Hi @frheault and trx team,

I was working on migrating `validate_tractogram` to DIPY (and trying to understand its logic), and it seems that the `remove_identical_streamlines` option is not working as expected.

Looking deeper, it seems that `intersection` ignores flipped streamlines. Here is an example:

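import numpy as np
# Import path assumed from the trx-python package layout
from trx.streamlines_ops import intersection, perform_streamlines_operation

arr = np.arange(90).reshape((30, 3))
stream = [np.flipud(arr), arr]
stream2 = [arr + 1, np.flipud(arr) + 1]

res, indices_uniq = perform_streamlines_operation(intersection, [stream, stream2])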

I was expecting some results, but I got `([], array([], dtype=uint32))`.

I also do not understand why I get a result when I call `res, indices_uniq = perform_streamlines_operation(intersection, [stream])`.

Why should the input be a list of tractograms and not just one tractogram? (Concatenating tractograms could be used if people want multiple tractograms in one.)

Thank you in advance for the clarification

frheault commented 8 months ago

The intersection/union/difference operations require identical streamlines (no flip) because we use a hashing approach to get N log(N) time instead of N^2. Even a 0.01 mm shift, an extra point, or a flip will break the hash matching. It is not a script to segment or identify almost-identical streamlines, so the behavior you observed is expected.
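To make the point about hashing concrete, here is a minimal sketch in plain Python. It is not the trx-python implementation: `streamline_key` and `hash_intersection` are hypothetical helpers, and the rounding precision is an arbitrary choice. It only illustrates why a key built from every point, in order, stops matching as soon as a streamline is flipped, shifted, or given an extra point.

```python
# Illustrative sketch only, NOT the trx-python code.
# `streamline_key` and `hash_intersection` are hypothetical helpers.

def streamline_key(points, precision=3):
    """Build a hashable key from every point of a streamline, in order."""
    return tuple(tuple(round(c, precision) for c in p) for p in points)


def hash_intersection(tractogram_a, tractogram_b):
    """Indices into tractogram_a of streamlines that also appear in tractogram_b.

    Each streamline is hashed once, so no pairwise comparison is needed.
    """
    keys_b = {streamline_key(s) for s in tractogram_b}
    return [i for i, s in enumerate(tractogram_a) if streamline_key(s) in keys_b]


# One streamline with 10 points, plus three near-copies of it.
line = [(float(i), float(i + 1), float(i + 2)) for i in range(0, 30, 3)]
flipped = line[::-1]                              # same points, reversed order
shifted = [(x + 0.01, y, z) for x, y, z in line]  # 0.01 shift along x
extra = line + [(99.0, 99.0, 99.0)]               # one additional point

same_idx = hash_intersection([line], [line])      # exact copy: matches
flip_idx = hash_intersection([line], [flipped])   # flipped: no match
shift_idx = hash_intersection([line], [shifted])  # shifted: no match
extra_idx = hash_intersection([line], [extra])    # extra point: no match
```

Matching flipped or nearly identical streamlines would need a distance-based comparison instead, which is a different (and slower) kind of tool.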

The classical use case is more like this: you have a tractogram and, say, a manual segmentation of it, and you want to know the indices of the segmented streamlines in the original tractogram. In this situation, the streamlines are exactly the same.

The intersection/union/difference operations are multi-input by design, so they require a list. Some of the behaviors work on a single tractogram, but the function has to support multiple tractograms (hence the list).
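As a rough illustration of the multi-input design, the sketch below folds a binary operation over a list of hashed tractograms. This is an assumption about the general pattern, not the actual trx-python code: `streamline_key`, `intersect_keys`, and `apply_operation` are hypothetical names. It also shows one plausible reason a single-element list still returns a result: with only one input there is nothing to combine, so every streamline is kept.

```python
# Illustrative sketch only, NOT the trx-python code. All names are hypothetical.
from functools import reduce


def streamline_key(points, precision=3):
    """Build a hashable key from every point of a streamline, in order."""
    return tuple(tuple(round(c, precision) for c in p) for p in points)


def intersect_keys(left, right):
    """Keep the entries of `left` whose key also appears in `right`."""
    return {key: idx for key, idx in left.items() if key in right}


def apply_operation(operation, tractograms):
    """Fold a binary operation over a list of tractograms.

    Returns indices into the concatenation of all input tractograms.
    """
    hashed = []
    offset = 0
    for tractogram in tractograms:
        hashed.append(
            {streamline_key(s): offset + i for i, s in enumerate(tractogram)}
        )
        offset += len(tractogram)
    # With a single input, reduce returns it unchanged (the operation never runs).
    kept = reduce(operation, hashed)
    return sorted(kept.values())


a = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)], [(2.0, 2.0, 2.0), (3.0, 3.0, 3.0)]]
b = [[(2.0, 2.0, 2.0), (3.0, 3.0, 3.0)], [(5.0, 5.0, 5.0), (6.0, 6.0, 6.0)]]
c = [a[1]]

single = apply_operation(intersect_keys, [a])        # one input: everything kept
pair = apply_operation(intersect_keys, [a, b])       # only the shared streamline
triple = apply_operation(intersect_keys, [a, b, c])  # shared by all three inputs
```

Taking a list keeps the same call shape for one, two, or many tractograms, at the cost of wrapping a single tractogram in a list.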