Open cbiffle opened 3 months ago
Interestingly, these glitches do not occur after the initial pin wiggle that the driver issues at machine startup, only when it attempts to recover from a condition by generating additional wiggles.
They're the same code path, so, that's interesting.
This is another episode in the hunt for the Gimlet Disk Wumpus. See also #1821 #1822 #1823.
tl;dr: we're glitching the data lines on I2C again, and since this violates setup/hold, devices can respond to this in arbitrarily annoying ways.
Example trace:
This is just after we stop the bus reset process (which we started for no obvious reason, see #1823) and resume normal service. We are generating ~680ns negative glitches on both lines, which means that when we reconfigure the pins, we're taking them through low-state push-pull at least briefly on their way back to peripheral-controlled open-drain.
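To make the suspected mechanism concrete, here is a toy model of a single pin being handed back to the peripheral. It is only a sketch under assumptions: the `PinConfig` fields, the `Write` steps, and both write sequences are hypothetical and are not taken from the actual driver or from any real GPIO register layout. It just shows how the order of configuration writes decides whether the line is actively driven low along the way.

```rust
// Toy model only: names and semantics are hypothetical, not the real driver.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode { Input, Output, Alternate }

#[derive(Clone, Copy, Debug, PartialEq)]
enum OutputType { PushPull, OpenDrain }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Drive { Released, Low, High }

#[derive(Clone, Copy, Debug, PartialEq)]
struct PinConfig { mode: Mode, otype: OutputType, out_high: bool }

// One configuration write, applied on its own (as separate register writes would be).
#[derive(Clone, Copy, Debug)]
enum Write { SetMode(Mode), SetOutputType(OutputType), SetOutput(bool) }

// What the pin does to the line. Assumes the I2C peripheral is idle
// (releasing the line) whenever the pin is in Alternate mode.
fn drive(cfg: PinConfig) -> Drive {
    match (cfg.mode, cfg.otype, cfg.out_high) {
        (Mode::Output, _, false) => Drive::Low,
        (Mode::Output, OutputType::PushPull, true) => Drive::High,
        _ => Drive::Released,
    }
}

fn apply(mut cfg: PinConfig, w: Write) -> PinConfig {
    match w {
        Write::SetMode(m) => cfg.mode = m,
        Write::SetOutputType(t) => cfg.otype = t,
        Write::SetOutput(h) => cfg.out_high = h,
    }
    cfg
}

// Indices of the writes after which the pin is actively driving the line low.
fn low_steps(start: PinConfig, writes: &[Write]) -> Vec<usize> {
    let mut cfg = start;
    let mut low = Vec::new();
    for (i, w) in writes.iter().enumerate() {
        cfg = apply(cfg, *w);
        if drive(cfg) == Drive::Low {
            low.push(i);
        }
    }
    low
}
```

In this model, starting from an input pin with a stale low output bit, the sequence `SetMode(Output)`, `SetOutputType(OpenDrain)`, `SetMode(Alternate)` drives the line low after the first two writes, while `SetOutput(true)`, `SetOutputType(OpenDrain)`, `SetMode(Alternate)` never does. Whether the real reconfiguration path resembles either sequence is exactly the open question.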
This event is not unique; I see a couple of glitches per minute on average in otherwise working Gimlet traces. (I think they are all related to bus resets.)
These glitches are lengthy enough to bypass the I2C standard glitch filter (50ns), but short enough that they violate the I2C spec's setup and hold requirements. (They're also illegal in the protocol state machine, but, so are a lot of things.) This could potentially trigger metastability or misbehavior of devices on the I2C bus.
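As a rough illustration of why ~680ns is an awkward width, the sketch below sorts a low pulse into three bins. The 50ns filter limit is the figure quoted above; the 1300ns bound is the Fast-mode minimum SCL low period, and assuming Fast-mode is my guess, since the bus speed is not stated here. The `GlitchClass` names are made up for this example.

```rust
// Sketch only: Fast-mode timing is an assumption, class names are hypothetical.
const SPIKE_FILTER_MAX_NS: u32 = 50; // spikes up to this width should be suppressed
const FAST_MODE_T_LOW_MIN_NS: u32 = 1300; // minimum legal SCL low period (Fast-mode)

#[derive(Debug, PartialEq)]
enum GlitchClass {
    // Short enough that a compliant input filter should swallow it.
    Suppressed,
    // Gets past the filter but is shorter than any legal low period.
    Runt,
    // Long enough to look like a legitimate low period.
    FullWidth,
}

fn classify_low_pulse(width_ns: u32) -> GlitchClass {
    if width_ns <= SPIKE_FILTER_MAX_NS {
        GlitchClass::Suppressed
    } else if width_ns < FAST_MODE_T_LOW_MIN_NS {
        GlitchClass::Runt
    } else {
        GlitchClass::FullWidth
    }
}
```

Under these assumptions `classify_low_pulse(680)` lands in `Runt`: visible to every device on the bus, but not a pulse any device is required to handle sensibly.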
We had behavior like this up until February 2023, and it was resulting in bus lockups (discussed in #1126), so I would assume that this could cause bus lockups too.