labscript-suite-temp / labscript

The labscript Python library provides a translation from simple Python code to complex hardware instructions. The library is used to construct a "connection table" containing information about what hardware is being used and how it is interconnected. Devices described in this connection table can then have their outputs set by using a range of functions, including arbitrary ramps.
BSD 2-Clause "Simplified" License
0 stars 0 forks source link

Recent GatedClocks-bugfix merge breaks (at least) Novatech DDS9m timing #51

Open philipstarkey opened 5 years ago

philipstarkey commented 5 years ago

Original report (archived issue) by Lincoln Turner (Bitbucket: lincolndturner, GitHub: lincolnturner).


Marked as critical because it’s in mainline and a timing issue in hardware many of us use.

I initially reported this as labscript_devices issue #35 (labscript-suite-temp/labscript_devices#35) thinking it was Novatech DDS9m specific…

So I reverted to beff2df5ac65166069b4fbbdb517a09e882c18ac, removing the GatedClocks-bugfix merge.

Then all is well!

Shots compile. And better, they run and produce nice fluoroscence images!

And indeed @Philip Starkey 's commit 7383d93dd60258e8a4a8b1bd788092932f833eba does seem to edit code in and around this error message…

See closed issue linked above for details…

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: philipstarkey, GitHub: philipstarkey).


I don’t believe this is actually a timing bug. I think that my fix for a timing bug has identified a region of your code that was not technically safe (although no doubt worked). The issue I fixed was ensuring that a clock tick that updated a Novatech was never cut short (to ensure it is always 100us long). It’s not that the Novatech is enforcing it’s clock limit on the digital flag, but that the interaction between the clockline for the novatech and the clockline for the digital flag must be taken into account when determining the set of instructions for the pseudoclock.

In your example in the other thread, you are attempting to change a digital flag 40us after a call to update the Novatech. As the pseudoclock is common between the Novatech and the device controlling the digital gate, this necessitates that the Novatech clock pulse is only 40us long (20us high, 20us low), which is a violation of it’s specified clock rate. Now, in the case of the Novatech, because the pulse only needs to be high for 100ns, and low for 100us, the previous code worked for you even if the pulse was shortened (because it was held low for the appropriate length of time when the clockline was not is use for subsequent instructions). But that’s a very device specific example of something that happened to work, even though we were violating our own rules. Other devices are not so forgiving, and this fix was to identify (and prohibit) timing issues that were causing instructions to be missed on the G99 apparatus.

The only ways to work around this are awful to implement:

The preferred solution would be to put your Novatechs on a separate pseudoclock. But I realise that may not be feasible for everyone.

I’m obviously not going to be able to work on this any time soon, so I assume someone else will be making the changes, but my preference would be to not remove these extra checks because they serve a purpose ensuring the ‘gated clocks’ feature does not bypass the specified (symmetric) clock rate of a device.

philipstarkey commented 5 years ago

Original comment by Lincoln Turner (Bitbucket: lincolndturner, GitHub: lincolnturner).


Thanks Phil for the detailed explanation.

This only makes sense to me though if the pseudoclock is common between the novatech table-advance line and the digital gate. But the digital gate is on an NI card, which I believe is on a different pseudoclock.

I am pretty sure, in fact, that the novatech table-advance clock line is a single-purpose pseudoclock - ie the pseudoclock goes straight from a pulseblaster flag to the DDS9m and nowhere else. But I will check this in the lab tomorrow.

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: pstarkey).


The last commit of your connection table (8 months ago) shows the NI cards and novatech on separate clock lines (digital flags) but not pseudoclocks (pulseblasters). Gated clocks was the best we could do given that the pulseblasters flags could not be independently controlled with separate instructions, but it is unfortunately not equivalent to separate pseudoclocks.

philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


The two devices don't share a 'clockline', but they do share a 'pseudoclock' if they are connected to the same PulseBlaster.

I am not completely following the problem (I will read through it more carefully later), but I wonder if it's fair to say that perhaps the novatech's min update interval is simply not 100us when using symmetric clock ticks. Would it make sense to set it to ~200us so that it can survive no matter what other devices on the pseudoclock are doing? This could be overridable so that when using asymmetric clock tick (which are implemented and work fine for PulseBlasters), one could set a shorter min update interval. Or the Novatech could inspect the pulse width duration of its pseudoclock and work out its min update interval based on this (this would require all pseudoclocks to declare their pulse width duration - which could be the string 'symmetric'. Perhaps the new error checking code should only run if pulse widths are symmetric.

I'll read more closely to make sure I understand what's going on, but something like the above seems like it could be the right direction for a solution.

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: pstarkey).


I don't think it's as simple as that because you need to check whether the digital low length was achieved because the clockline was gated for one or more instructions on another clockline. Assymetric pseudoclocks pulses wouldn't actually fix Lincoln's ‘bug’, but widespread support for them throughout the labscript processing chain might allow you to not raise an error because you can tell that the clockline is held low for long enough.

And yes, technically the Novatech clocklimit should be halved for symmetric clock pulses if you go by the technical specs. @shjohnst and I have discussed it though and we were reluctant to change it given it seems to work…it will break a lot of people's code if we do change it.

philipstarkey commented 5 years ago

Original comment by Lincoln Turner (Bitbucket: lincolndturner, GitHub: lincolnturner).


Yes, wow. I had no idea our NI clock was constrained by the dopey novatech, but I guess it is. Both are on PB0.

I suppose nothing moves very fast on our NI AOs and DOs. Just coils etc.

Our analog inputs (NI) and alazartech analog inputs are off PB1, so I guess they tick faster.

I don’t have the stomach to move our NI card to a different PB and will just sit on the older rev for now.

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: pstarkey).


It’s only constrained within 100us of Novatech instructions though, so you might have fast things in your labscript file that work because they are far away (temporally) from Novatech commands.

philipstarkey commented 5 years ago

Original comment by Lincoln Turner (Bitbucket: lincolndturner, GitHub: lincolnturner).


So a low-fuss solution would be to move the novatech enable line to a flag on PB1…? Do we support PB flags as DOs or only as pseudoclocks?

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: pstarkey).


Pulseblasters support flags as DOs or clocklines. But it would have to be the NovaTech clockline that moved to PB1, not the enable line (which could stay on PB0)

philipstarkey commented 5 years ago

Original comment by Lincoln Turner (Bitbucket: lincolndturner, GitHub: lincolnturner).


But PB1 is running some things that need to be clocked fast at present… so surely the enable line should go there, and the novatech clock line should stay on PB0 where we can tolerate things being slowed down by it?

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: pstarkey).


Sorry, I thought you were talking about the table enable line, not the DDS enable line. Yes, if you only have an issue with one DDS enable line being used too close to Novatech instructions, then moving it is a good idea. DigitalOuts (including DDS enable lines) can be connected to pulseblasters by using the pulseblaster_1.direct_outputs object (replace name as appropriate) as the parent (and then probably "flag 3" as the connection in your case)

philipstarkey commented 5 years ago

Original comment by Lincoln Turner (Bitbucket: lincolndturner, GitHub: lincolnturner).


Great, thanks Phil (and Chris!) for the super support here.

I’ll try this tomorrow and if it works with on the current labscript revision I’ll upvert it to tip and test it with the gated clocks patch.

philipstarkey commented 5 years ago

Original comment by Lincoln Turner (Bitbucket: lincolndturner, GitHub: lincolnturner).


No joy.

It wasn’t just the novatech enable line that was ticking too fast, there was also a camera trigger on another NI DO. Not enough spare PB flags to move all of them as I think our PB is limited to 4 flags due to a firmware ‘upgrade’ a long time ago.

I then thought about putting the NI trigger onto PB1. But labscript didn’t like this, because WaitsMonitor (using a NI DO attached to a ctr) has to be on the master PB (PB0).

Final thought was to put novatech table clock onto PB1 and make PB1 ‘slow’. The other flags run off it could be made slow without too much pain. But the DDSes on PB1 need to be fast as they run rf pulses. And I suspect DDS ‘tick rate’ is also limited to slowest clockline?

Can’t see how to make this ‘feature’ run in spinor lab…

philipstarkey commented 4 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


The commit 7383d93dd60258e8a4a8b1bd788092932f833eba is backed out in all JQI/NIST labs, as shots reliably break with that change. Perhaps they unreliably break without the change, but unfortunately it looks like the cure is worse than the disease. We haven't stamped labscript with a release since then, so the issue isn't being experienced by everyone, but I think we shouldn't push out a release with this commit in it. How do you feel about it being reverted @philipstarkey? I don't know what the fix will be for the original issue, but I think we should back out the change even if we don't know what the alternative is.

philipstarkey commented 4 years ago

Original comment by Philip Starkey (Bitbucket: philipstarkey, GitHub: philipstarkey).


What about just changing the added exception to instead print a critical warning in the terminal so it doesn't prevent compilation?

Then we could work on a better long term solution which addresses the issue for the NovaTech (which is the only device having difficulty with this change AFAIK)

I'm reluctant to revert entirely because we were bitten by the bug this patch corrected and that bug manifests itself as missed instructions which is not something you want to hide (and is very hard to track down).

philipstarkey commented 4 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


Yeah, that sounds like a good idea! Better than removing it entirely.

You want to do it, or should I? It might be better as a print to stderr rather than using the warnings module, since warnings with the latter are suppressed after the first time they come up, and since runmanager is long-running (i.e. days) this could make the problem invisible.

philipstarkey commented 4 years ago

Original comment by Philip Starkey (Bitbucket: philipstarkey, GitHub: philipstarkey).


You had better do it I think. I’m swamped with work at the moment and am also behind with getting the BitBucket-GitHub exporter going (it’s coming along quite nicely, but still a few key things to do!)

philipstarkey commented 4 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


No worries, will do!