Closed: dfm closed this 1 year ago
Ahh yes, good catch with the compilation. I'll try it on my computer and see if it's faster than lal now. Thanks for the updates!
I lied in my original comment! I don't think it's faster after all :'( I had a bug in my own script! But I think we still want to make this change.
We could also add something like:
import time

from jax import vmap

func = vmap(waveform)
# Warm-up call so that compilation isn't included in the timing
func(theta_ripple)[0].block_until_ready()
start = time.time()
func(theta_ripple)[0].block_until_ready()
end = time.time()
# n is the number of waveform evaluations, as in the existing benchmark
print("Vmapped ripple waveform call takes: %.6f ms" % ((end - start) * 1000 / n))
to benchmark the vmapped version as well?
Yeah, I just tested on my computer and I still find that it's about twice as slow in the loop. Happy to add the benchmark for the vmapped version. I have a bunch of RAM on my computer, so I should be able to run it, I think.
I've added this code block to the benchmarking! On my computer it certainly helps, but it's still not quite as fast as the lal loop.
@tedwards2412 — As I thought, there were some issues with the benchmark code as written. The big one was that the JIT compilation was being included in the runtime, which we don't want. I know you thought you had already compiled it, but when you vmap, it compiles again! Also, I'm not convinced that the vmapped version of the function is the thing you want to benchmark (in fact, it just stalls out on my computer because I don't have enough RAM). Instead, a better comparison is what I've done here: directly loop over parameter values, like we do for lal.
Once I do this, I find that ripple is actually somewhat faster than lal (at least on my computer)! (Edit: I lied, it's not faster! I just had a mistake in my script. But I still think this is the right way to benchmark!) Also, we can't forget to include the block_until_ready call when benchmarking (see here for more info).
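A minimal sketch of the looped benchmark described above, with the JIT warm-up excluded from the timing. Note that `waveform` here is a toy stand-in (not ripple's actual waveform call) and the parameter values are made up:

import time

import jax
import jax.numpy as jnp


def waveform(theta):
    # Hypothetical stand-in for a single ripple waveform evaluation
    return jnp.sin(theta.sum() * jnp.arange(100.0))


func = jax.jit(waveform)
thetas = jnp.ones((10, 4))  # 10 parameter sets, 4 parameters each

# Warm up: the first call triggers JIT compilation, which we must
# exclude from the timing
func(thetas[0]).block_until_ready()

start = time.time()
for theta in thetas:  # loop over parameter sets one at a time, as for lal
    func(theta).block_until_ready()
end = time.time()
print("Looped waveform call takes: %.6f ms" % ((end - start) * 1000 / len(thetas)))

Because every `theta` in the loop has the same shape, the jitted function compiles once during the warm-up call and the loop only measures execution time.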