sinara-hw / Booster

Modular 8-channel RF power amplifier
Other
16 stars 3 forks source link

Interlock detectors may not assert properly #369

Open ryan-summers opened 4 years ago

ryan-summers commented 4 years ago

During firmware development, it was discovered that the interlock threshold detectors may not properly operate as intended.

image

Because the comparator output is connected to the clock input, an interlock threshold is only latched when the power detector measurement rises above the interlock threshold.

If the power detector measurement were to always have been above the interlock threshold, no interlock will be tripped.

CC: @jordens

hartytp commented 4 years ago

If the power detector measurement were to always have been above the interlock threshold, no interlock will be tripped.

Yes, that's a bit of a gotcha, but I don't think it prevents the interlock from working correctly (but I may be missing something?).

How would one get into that state? I'm assuming the enable sequence is:

ryan-summers commented 4 years ago

Note that it likely takes extremely specific conditions to cause this to be a fault.

A pathological example would be: Interlock DAC outputting 0.0V thresholds, channel disabled (power detector output is ~40mV) channel becomes enabled allowing RF output Power detector output increases before the interlock threshold is set Interlock threshold configured

If the output power remains higher than the interlock threshold, the interlock will not trip.

jordens commented 4 years ago

370 for two specific cases where this is a problem.

hartytp commented 4 years ago

Right. There are a few corner cases that have to be avoided in the firmware design (e.g. by setting a sensible minimum DAC value as you point out), but so long as that's done the interlock should work as expected AFACIT.

To be clear though, I'm not against looking at ways of improving the interlock hardware design in future revisions if people want to make suggestions.

jordens commented 4 years ago

Merging the duplicate #370:

Currently, the comparators are connected to the clock inputs while the on_offn signal is connected to the asynchronous active low clear input. That's not a good design. The interlocks should always (i.e. asynchronously and level sensitive) assert and block the rf switch while they should only ever be cleared on falling on_offn (i.e. negative edge sensitive). With the current design, if the interlock thresholds remain asserted when on_offn rises again (e.g. because someone connected two amplifier outputs via a splitter and there is some mismatch: note that combining two amplifier outputs is common and convenient to suppress IMD, as is the mismatch in that case) you'd never trip the reverse power interlock. It also prevents safe testing of either the interlock thresholds (i.e. disable the switches, set the thresholds very low and hope for them to be triggered). It's not a huge risk but it's clearly wrong.

Introduced in https://github.com/sinara-hw/Booster/issues/192

Suggest to make the comparators an asynchronous set, and the on_offn a negative edge sensitive clear.

Debugging credits to @ryan-summers

hartytp commented 4 years ago

repeat of comment in #370:

With the current design, if the interlock thresholds remain asserted when on_offn rises again (e.g. because someone connected two amplifier outputs via a splitter and there is some mismatch: note that combining two amplifier outputs is common and convenient to suppress IMD, as is the mismatch in that case) you'd never trip the reverse power interlock

I don't think this is a valid example since the reverse power interlock is implemented in software rather than hardware (the two comparators are the input and output forward detectors IIRC).

IIRC there is a potential pathalogical circumstance where the input detector doesn't trip, but that shouldn't be too much of an issue since the output detector will still trip.

Anyway, :+1: to improving the design in the future so long as we prototype using existing hardware first

jordens commented 4 years ago

No. As the screenshot above shows, it's the reflected power interlock (but note that this is but one of several easy points of confusion because the schematics and layout reflect the high viscosity of this design).

If it was the input power, the problem would be worse: the output power interlock threshold would only trigger if (for some reason) configured to implicitly protect the input as well. I.e. if the output amp fails it won't do that anymore.

hartytp commented 4 years ago

No. As the screenshot above shows, it's the reflected power interlock.

Aah, yes, you're right -- I forgot we'd changed that (and apparently not got round to updating the net names/other annotations, making this rather confusing since the reverse power measurement trips the input overdrive net).

Anyway, I agree that we should improve this for a future revision, but I'm also can't see any pathological case that we could hit in the setups we designed this for (we won't gang a pair of outputs together, not that it's an invalid thing to do).

jordens commented 4 years ago

IMO a user can reasonably assume that multiple channels can be combined safely and that the interlocks will protect the device. I'm uncomfortable with reducing the severity of such clear bugs to "improvement" just because the one user doesn't expect to trigger them in their application, especially if there are credible use cases where these bugs have a good chance of showing up and reduce functionality/add risk. We'd have to write down a list of caveats and that makes it unnecessarily hard to argue that this is a robust and generic device. Sure, it's not an immediate and high risk, and we'll work around it in the new firmware. But it's still wrong.

ryan-summers commented 4 years ago

IMO a user can reasonably assume that multiple channels can be combined safely and that the interlocks will protect the device. I'm uncomfortable with reducing the severity of such clear bugs to "improvement" just because the one user doesn't expect to trigger them in their application, especially if there are credible use cases where these bugs have a good chance of showing up and reduce functionality/add risk. We'd have to write down a list of caveats and that makes it unnecessarily hard to argue that this is a robust and generic device. Sure, it's not an immediate and high risk, and we'll work around it in the new firmware. But it's still wrong.

I agree that this is undoubtedly a bug, but it is a bug that we can work around in firmware, which is a common approach to hardware flaws. I think it would be prudent to slate this for fixing, but as long as the issue is known, we can work around it to provide a safe end product to the user. The only downside is that we're offloading some of the safety from the hardware to the firmware.

gkasprow commented 4 years ago

well, we can add an OR gate which will bypass the DFF.

hartytp commented 4 years ago

@gkasprow we discussed this previously. I don't think that an OR gate is a good long-term solution because the OR gate allows the interlock to trip without latching which is problematic (particularly if we don't get the timing right and the DFF never trips, but the interlock just oscillates between enabled and disabled).

For existing hardware, the current solution is acceptable IMHO. For future revisions, what's wrong with @jordens proposed solution (subject to verifying in prototype hardware)?

gkasprow commented 4 years ago

Ok. Right.

gkasprow commented 2 years ago

Let's use JK flip flop

gkasprow commented 2 years ago

Here is proposed schematic obraz I replaced the comparators inputs to invert their outputs. The JK truth table is here obraz @jordens can you look at it?

jordens commented 2 years ago

Looks good to me. @hartytp could you sign off as well?

hartytp commented 2 years ago

Seems sensible, but this isn't something I'm particularly experienced with so I don't put too much weight on my opinion.