"Identical" shapes not found in event_lib

schuenke commented 4 months ago

Describe the bug When using RF pulses with identical settings but varying flip angles, pypulseq does not recognize the shapes to be identical and adds new shape IDs for each pulse. For example in MRF sequences this leads to huge seq-files that cannot be executed at the scanner.

To Reproduce Create a simple sequence with 2 rf pulses that have the same settings, but flip angles of 0.1 and 90 for example.

Due to the calculation of the magnitude of the pulses using:

    mag = np.abs(event.signal)
    amplitude = np.max(mag)
    mag /= amplitude

we end up with different mag arrays and thus also with different compressed shapes.

As the find_or_insert method doesn't take any tolerances into account, the shape is not found and a new one is inserted

Expected behavior The shape_id should be independent from the flip angle / precision errors.

Potential fix I would suggest to accept some tolerances and simply change line 100 in the find_or_insert method from

new_data = np.asarray(new_data)

to

new_data = np.round(new_data, 12)

For me, rounding it to 12 decimal places was enough to ensure all shapes are found to be the same. This is why I would suggest to include this directly in the event_lib method and not just in register_rf_event etc.

schuenke commented 4 months ago

Update: In my opinion it actually makes sense to reduce the precision to 9 decimal places, because this seems to be the precision of the values that are written to the seq-file

btasdelen commented 4 months ago

Hi @schuenke, we faced this exact issue before. At the time, we just worked around it by using forcing them to be registered as the same shape with register_rf_event.

I think rounding to 9 decimal places is very reasonable. I would also guess how many decimal places would result in identical arrays will depend on the scales of the waveforms.

Maybe we should also do the rounding in make_rf functions to keep the waveforms consistent?

FrankZijlstra commented 4 months ago

I think rounding is a pretty reasonable solution. However, it will incur a slight performance hit for sequence with a lot of arbitrary gradients (np.round is surprisingly "slow").

An issue related to this is that rounding while writing the .seq file can end up making unique shapes/events into duplicates, that are not handled well by the event library after reading the .seq file again (it is designed to be a library of unique entries). I noticed this issue before and also marked a problem caused by it here: https://github.com/imr-framework/pypulseq/blob/20cce84c00fa8ffd637685a44746e0910b655edd/pypulseq/event_lib.py#L222-L227 (when all is handled well, the if-statement following the comment should not be necessary, data should always be in the keymap).

In general we may need to use rounding in all event_lib entries, but that would be a more significant performance hit because find_or_insert is called many times. An alternative is to implement a function that detects duplicate events after rounding, and only call that once prior to writing the .seq file.

FrankZijlstra commented 4 months ago

I implemented something of a fix directly in event_lib, using the rounding used in sequence writing (ideally, not quite the same in the current implementation). See: https://github.com/FrankZijlstra/pypulseq/tree/event_lib_rounding (note that it is not complete, some more things need to be changed during sequence reading)

For a few long sequences I tested, the performance is around 40-70% slower, but it may strongly depend on the amount of near-duplicates. Using post-hoc finding of duplicates (after rounding) in the event libraries was about 10% slower.

imr-framework / pypulseq

"Identical" shapes not found in event_lib #169