Closed: FrankZijlstra closed this 9 months ago
Thanks for these nice improvements @FrankZijlstra! I will try to review the changes as quickly as possible.
As you are becoming a pretty active developer, I would like to invite you to our pypulseq Slack channel (check https://github.com/imr-framework/pypulseq/discussions/97 for more info). It would also be great if you could check the other discussions about our planned refactoring (https://github.com/imr-framework/pypulseq/discussions/123, https://github.com/imr-framework/pypulseq/discussions/124).
Would be happy to join, but I don't seem to be able to join with the links in #97 (wrong email domain).
Maybe @sravan953 can post/send you an updated link!?
@FrankZijlstra Echoing @schuenke's sentiment, thanks for all the improvements! We'd love for you to join the PyPulseq channel. Please join the workspace here and I will add you to the channel: https://join.slack.com/t/iarg-cmrrc/shared_invite/zt-1t3s2u8lz-1EJt3ufWxYvzEab0cZyfdA
Hi @FrankZijlstra,
Is there a reason why you are using `tuple()` instead of `.tobytes()`, as in the `find_or_insert()` function:

```python
if not isinstance(new_data, np.ndarray):
    new_data = np.array(new_data)
data_string = new_data.tobytes()
```
AFAIK, directly dumping the bytes of the array should be much faster than converting it to a tuple.
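A rough micro-benchmark (not from the PR; the numbers will vary with machine and array size) illustrates the gap:

```python
import timeit

import numpy as np

arr = np.random.rand(100)

# Hash the array once per call, keyed either as a tuple or as raw bytes.
t_tuple = timeit.timeit(lambda: hash(tuple(arr)), number=100_000)
t_bytes = timeit.timeit(lambda: hash(arr.tobytes()), number=100_000)
print(f"tuple key:   {t_tuple:.2f} s")
print(f"tobytes key: {t_bytes:.2f} s")
```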
@btasdelen Interesting, you are quite right that `tobytes` is indeed faster. The reason I changed everything to `tuple`s is that the keys used in `insert`, `find_or_insert`, and `update` need to be the same, and the old code in `insert` (converting floats to strings) was really slow. I have just implemented the obvious alternative of making sure everything is a `np.array` and then using `tobytes` for the keys, and that is faster.
However, there is one more thing I noticed the other day: for all libraries except the `shape_library`, the data inserted are small, newly constructed arrays. For example, in `register_grad_event`:
```python
data = [amplitude, *shape_IDs, event.delay, event.first, event.last]
```
and
```python
data = np.array(
    [
        event.amplitude,
        event.rise_time,
        event.flat_time,
        event.fall_time,
        event.delay,
    ]
)
```
So we're constructing a `list`, converting it to a `np.array`, and then calling `tobytes` on that to insert it into the library. By constructing all of these as `tuple`s in the first place, a lot of conversions can be avoided. And given the way the libraries are used, it does not seem to matter that the inserted data is a `tuple` instead of a `np.array` (all access to library data uses straight indexing, i.e. `library.data[key][1]`).
The only problem with this is the shape library, which does insert `np.array`s, but this could easily be made a special case that does use the `tobytes` keys. Either way, I'll commit the alternative version using `tobytes` for now.
I added some more things to the PR, sorry for piling things on (but I checked with @schuenke that this was okay :) ).
I reworked the `event_lib` to allow choosing between `tuple` and `tobytes` keys (the latter only used for the `shape_library`), and made sure all calls that `insert` into the event libraries use `tuple`s whenever possible to prevent conversions/copies. As an added benefit, `tuple`s are immutable, so you can't accidentally modify a library. To be consistent, I set the `writeable` flag of all numpy data in the library to false.
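For reference, freezing an array is a one-liner, and a later write attempt raises instead of silently mutating the library:

```python
import numpy as np

shape_data = np.array([0.0, 0.5, 1.0])
shape_data.setflags(write=False)  # mark the buffer read-only

try:
    shape_data[0] = 2.0
except ValueError as err:
    print(err)  # "assignment destination is read-only"
```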
Other smaller improvements:

- Removed a few more `np.max` calls I missed before (`max(a, b)` instead of `np.max([a, b])`).
- Removed `np.array` conversions in `register_rf_event`.
- Changed `block_to_events` to prevent making new lists on every call to `set_block` (which will almost always be called with a list of events, not a block). The `hasattr(args[0], 'rf')` check is consistent with the MATLAB version of this function.
- Replaced `np.cumsum` calls that used small, newly constructed lists with a native Python `cumsum`, specialized for up to 5 values (the maximum use-case found in pypulseq); see the sketch after this list.
- Added caching in `get_block` to prevent expensive recalculations when blocks are repeatedly accessed.
- Consistent key handling in `event_lib` and `test_report`. This also fixes a bug where `event_lib.find_or_insert` would not find an existing event because a different key was used.
- Made `event_lib.find` use `next_free_ID` instead of recalculating this value every time.
- Smaller optimizations in `compress_shape`, `add_gradients`, and `test_report`.
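The actual helper in the PR may well be unrolled per arity; this loop-based sketch just shows the idea of skipping ndarray construction for a handful of scalars:

```python
def cumsum(*values: float) -> tuple:
    # Pure-Python running sum for the tiny inputs pypulseq passes
    # (up to 5 values); avoids building an ndarray just for np.cumsum.
    out = []
    total = 0.0
    for v in values:
        total += v
        out.append(total)
    return tuple(out)

print(cumsum(1.0, 2.0, 3.0))  # (1.0, 3.0, 6.0)
```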
Overall performance improvements range from modest to 3-4x faster, depending on the type of sequence and on whether additional functions like `test_report` are used.

The `event_lib.find_or_insert` bug means that the `.seq` files produced after this PR are not identical to `dev`, as duplicated events are no longer present, which changes event IDs along with it. But other than that, as far as I could tell the changes do not break anything. It would be helpful to check this with the unit tests (should the sequence files produced by MATLAB be included in the repository to make this easier?).

I tested these commits with a couple of more complicated sequences. I did not observe any regressions. I am merging before the PR gets even bigger :)