tee-ar-ex / trx-python

Python implementation of the TRX file format
https://tee-ar-ex.github.io/trx-python/
BSD 2-Clause "Simplified" License

strange behavior for `intersection` #77

Open skoudoro opened 10 months ago

skoudoro commented 10 months ago

Hi @frheault and trx team,

While migrating `validate_tractogram` to DIPY, I was trying to understand its logic, and it seems that the `remove_identical_streamlines` option is not working as expected.

Looking deeper, it seems that `intersection` ignores flipped streamlines. Here is an example:

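import numpy as np
# import path assumed from the trx-python package layout
from trx.streamlines_ops import intersection, perform_streamlines_operation

# two tractograms holding the same streamline in opposite point order, offset by 1
arr = np.arange(90).reshape((30, 3))
stream = [np.flipud(arr), arr]
stream2 = [arr + 1, np.flipud(arr) + 1]

res, indices_uniq = perform_streamlines_operation(intersection, [stream, stream2])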

I was expecting some results, but I got `([], array([], dtype=uint32))`.

I also do not understand why I do get a result when I pass a single tractogram: `res, indices_uniq = perform_streamlines_operation(intersection, [stream])`

Why should the input be a list of tractograms and not just one tractogram? (Tractograms could be concatenated beforehand if people want several of them in one.)

Thank you in advance for the clarification

frheault commented 10 months ago

The intersection/union/difference operations require exactly identical streamlines (no flip), because we use a hash-based approach to get O(N log N) time instead of O(N^2). Even a 0.01 mm shift, an extra point, or a flip will break the hash matching. It is not a script to segment or identify almost-identical streamlines, so the behavior you observed is expected.
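To illustrate why exact matches are required, here is a minimal sketch of a hash-based intersection. It is not the trx-python implementation: the names `streamline_key` and `hash_intersection` and the `precision` parameter are hypothetical, and it assumes each streamline is an (N, 3) NumPy array.

```python
import numpy as np


def streamline_key(streamline, precision=3):
    """Build a hashable key from the rounded coordinates (hypothetical helper)."""
    rounded = np.round(np.asarray(streamline, dtype=np.float32), precision)
    # The raw bytes depend on every coordinate and on the point order.
    return rounded.tobytes()


def hash_intersection(tractogram_a, tractogram_b):
    """Return indices of streamlines in tractogram_a that also occur in tractogram_b."""
    keys_b = {streamline_key(s) for s in tractogram_b}
    return [i for i, s in enumerate(tractogram_a) if streamline_key(s) in keys_b]


arr = np.arange(90, dtype=np.float32).reshape((30, 3))
original = [arr, arr + 100]

print(hash_intersection(original, [arr.copy()]))      # identical copy: [0]
print(hash_intersection(original, [np.flipud(arr)]))  # flipped: []
print(hash_intersection(original, [arr + 0.01]))      # shifted by 0.01: []
print(hash_intersection(original, [arr[:-1]]))        # one point fewer: []
```

Because the key is built from the exact point sequence, a lookup is cheap, but any flip, shift, or resampling produces a different key and therefore no match.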

The classical use case is more like this: you have a tractogram and, say, a manual segmentation of it, and you want to know the indices of the segmented streamlines in the original tractogram. In this situation, the streamlines are exactly the same.
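That use case could be sketched as follows. This is only an illustration with NumPy, not the library's code: `find_indices_in_original` is a hypothetical helper, and the random "tractogram" stands in for real data.

```python
import numpy as np


def find_indices_in_original(original, subset):
    """Map each streamline of subset to its index in original (-1 if absent)."""
    index_by_key = {}
    for i, streamline in enumerate(original):
        key = np.asarray(streamline, dtype=np.float32).tobytes()
        # Keep the first index if the original contains duplicates.
        index_by_key.setdefault(key, i)
    return [
        index_by_key.get(np.asarray(s, dtype=np.float32).tobytes(), -1)
        for s in subset
    ]


rng = np.random.default_rng(0)
# Four synthetic streamlines with different numbers of points.
tractogram = [rng.random((n, 3), dtype=np.float32) for n in (10, 20, 15, 30)]
# A "manual segmentation": exact copies of two of the original streamlines.
bundle = [tractogram[2].copy(), tractogram[0].copy()]

print(find_indices_in_original(tractogram, bundle))  # [2, 0]
```

Since the segmented streamlines are untouched copies, their keys match the originals exactly, which is the situation the hash approach is designed for.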

The intersection/union/difference operations are multi-input by design, so they require a list. Some of the behaviors also work on a single tractogram, but the function has to support multiple tractograms (hence the list).