labscript-suite / labscript-devices

A modular and extensible plugin architecture to control experiment hardware using the 𝘭𝘒𝘣𝘴𝘀𝘳π˜ͺ𝘱𝘡 𝘴𝘢π˜ͺ𝘡𝘦.
http://labscriptsuite.org
Other
5 stars 58 forks source link

Pb programming #71

Closed mjdoris closed 3 years ago

mjdoris commented 3 years ago

While troubleshooting an intermittent bug with the Pulseblaster an observation was made about extended load times when starting up a new unique shot. When attempting to program rather large instruction sets BLACs would hang upwards of 10-15 seconds. Although using the smart cache feature prevents this from occurring for any repeated shots of the same experiment after the first run, new experimental shots with different instructions results in an extended programming time again. To decrease the loading times the for loop was removed in transition_to_buffered and replaced with the map() function instead. This has shaved off seconds worth of programming time and enabled almost immediate loading of the shot.

I haven't perceived any drawbacks or issues with this so far. Since map() simply returns a list of the results after applying the specified function to each element in the iterable I'm assuming the functionality is essentially the same as the for loop (unless I'm mistaken somewhere).

philipstarkey commented 3 years ago

I don't think this is equivalent. In some basic testing with dummy data in a Python terminal, it appears that you would need to use the transpose of the pulse_program because map expects the arguments to each be a list of the values for each iteration of map. Instead I think you have unpacked the numpy array so that it iterates the number of times equal to the number of arguments for the pulseblaster function (rather than the number of instructions).

Now, obviously this should have raised some big errors...which I'm guessing it didn't in your test. However, in Python 3, calling map by itself doesn't actually evaluate the mapping. It creates a map object that needs to be iterated over to actually execute the function...so I think that maybe your code change is actually not programming the pulseblaster at all, and is instead just causing it to use whatever instructions were in it last time! You can confirm by wrapping the map call in list in order to force it to evaluate (although my quick testing shows that this might call the pulseblaster function once too many times, with the arguments being None...not sure why that is!)

Could you do some further testing to confirm if your code is doing what you expect or if I'm right?

mjdoris commented 3 years ago

I don't think this is equivalent. In some basic testing with dummy data in a Python terminal, it appears that you would need to use the transpose of the pulse_program because map expects the arguments to each be a list of the values for each iteration of map. Instead I think you have unpacked the numpy array so that it iterates the number of times equal to the number of arguments for the pulseblaster function (rather than the number of instructions).

Now, obviously this should have raised some big errors...which I'm guessing it didn't in your test. However, in Python 3, calling map by itself doesn't actually evaluate the mapping. It creates a map object that needs to be iterated over to actually execute the function...so I think that maybe your code change is actually not programming the pulseblaster at all, and is instead just causing it to use whatever instructions were in it last time! You can confirm by wrapping the map call in list in order to force it to evaluate (although my quick testing shows that this might call the pulseblaster function once too many times, with the arguments being None...not sure why that is!)

Could you do some further testing to confirm if your code is doing what you expect or if I'm right?

Thank you!! I was wondering if there were any big differences. I will do further testing to see why I'm not noticing a difference when running labscript. I did run multiple shot scripts, but maybe they were still too similar to notice. Is there any way to make this work? I'd really like to replace that for loop. I guess I will keep fiddling with it. Thank you for looking into it!!

chrisjbillington commented 3 years ago

It would surprise me if removing a for loop created a noticeable speedup when within the loop there's a function call - function calls are generally more costly than loop iterations.

If Python is the source of the slowness, to properly get rid of the loop and function calls, we'd need to write a function in C or cython that called the spincore functions in a loop within C.

But I suspect the actual USB communication is probably the bottleneck. How large an instruction set are we talking?

mjdoris commented 3 years ago

It would surprise me if removing a for loop created a noticeable speedup when within the loop there's a function call - function calls are generally more costly than loop iterations.

If Python is the source of the slowness, to properly get rid of the loop and function calls, we'd need to write a function in C or cython that called the spincore functions in a loop within C.

But I suspect the actual USB communication is probably the bottleneck. How large an instruction set are we talking?

An excessive amount that's probably not encountered often. I'm seeing the slowdown when doing between 15k-30k instructions. I think the one I'm using in the chip lab is being maxed out to 32k instructions