oxidecomputer / hubris

A lightweight, memory-protected, message-passing kernel for deeply embedded systems.
Mozilla Public License 2.0
3.02k stars 174 forks source link

reproducible T6 failure at boot since #1625

Closed wesolows closed 8 months ago

wesolows commented 8 months ago

On gimlet B15, which is very well known as my benchtop companion of long years and has never had any hardware issues, I noticed after a recent upgrade to 37ed81c8d02fc231bdb7df4c55f459e434196820 that the T6 failed to come up at all (i.e., in mfg mode). Further investigation showed that the sequencer is holding the NIC in reset with all PGs asserted and no faults:

wesolows at monolith in ~
$ humility spi -p 2 -n 64 --write 1,0,0 --read --discard 3
humility: attached via ST-Link V3
humility: SPI master is spi2_driver
             \/  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0x00000000 | 01 de 00 00 00 d7 b4 6d 04 8c 2d 58 cd fc 00 00 | .......m..-X....
0x00000010 | 00 0d 06 02 01 00 0c 04 01 00 0f 04 ff ff 05 03 | ................
0x00000020 | 7f ff 04 ff 00 0f ff 00 7f 36 ff ff 00 ff 00 00 | .........6......
0x00000030 | 0c ff 03 00 00 00 ff ff ff ff ff ff ff          | .............

We do not appear to have a bit in the register file following the sequencer's understanding of the PWREN input controlled by AMD firmware, but I do know that the host side PCIe control register looks ok:

gimlet-sn15 # pcieadm show-cfgspace -d 80/1/1 pcie.slotctl
PCI Express Capability (0x10)
  Slot Control: 0x103a
    |--> Attention Button Pressed Reporting: disabled (0x0)
    |--> Power Fault Detected Reporting: enabled (0x2)
    |--> MRL Sensor Changed Reporting: disabled (0x0)
    |--> Presence Detect Changed Reporting: enabled (0x8)
    |--> Command Complete Interrupt: enabled (0x10)
    |--> Hot Plug Interrupt Enable: enabled (0x20)
    |--> Attention Indicator Control: reserved (0x0)
    |--> Power Indicator Control: reserved (0x0)
    |--> Power Controller Control: power on (0x0)  <<<<
    |--> Electromechanical Interlock Control: 0x0
    |--> Data Link Layer State Changed: enabled (0x1000)
    |--> Auto Slot Power Limit: enabled (0x0)
    |--> In-Band PD: enabled (0x0)

I filed an RFE against the sequencer to add such a bit which would help rule out certain causes. However, I then found that this was reproducible; two subsequent boots with the same software exhibited the same behaviour. With that, I backed down to the last SHA before a few changes whose comments refer to the gimlet-seq task, specifically to a6c04477e38f4d108832310eea58baf8fcaf3cc8 and this began working again. To guard against coincidence, I booted again with the same software and again the T6 worked.

One thing I noticed, in light of the recent changes in this area, is that the state of the gimlet-seq task changed:

WORKING
$ humility tasks
humility: attached via ST-Link V3
system time = 5978
ID TASK                       GEN PRI STATE
 0 jefe                         0   0 recv, notif: fault timer(T+22)
 1 net                          0   5 recv, notif: eth-irq(irq61) wake-timer(T+297)
 2 sys                          0   1 recv
 3 spi2_driver                  0   3 recv
 4 i2c_driver                   0   3 recv
 5 spd                          0   2 notif: i2c1-irq(irq31/irq32)
 6 packrat                      0   1 recv
 7 thermal                      0   5 recv, notif: timer(T+321)
 8 power                        0   6 recv, notif: timer(T+859)
 9 hiffy                        0   5 notif: bit31(T+217)
10 gimlet_seq                   0   4 recv, notif: timer(T+10)
11 gimlet_inspector             0   6 notif: socket
12 hash_driver                  0   2 recv
13 hf                           0   3 recv
14 update_server                0   3 recv
15 sensor                       0   4 recv, notif: timer(T+28)
16 host_sp_comms                0   7 recv, notif: jefe-state-change usart-irq(irq82) multitimer control-plane-agent
17 udpecho                      0   6 notif: socket
18 udpbroadcast                 0   6 notif: bit31(T+307)
19 control_plane_agent          0   6 recv, notif: usart-irq(irq37) socket timer
20 sprot                        0   4 recv
21 validate                     0   5 recv
22 vpd                          0   4 recv
23 user_leds                    0   2 recv, notif: timer
24 dump_agent                   0   6 recv, notif: socket
25 sbrmi                        0   4 recv
26 idle                         0   8 RUNNING

to

NON-WORKING
$ humility tasks
humility: attached via ST-Link V3
system time = 3301023
ID TASK                       GEN PRI STATE
 0 jefe                         0   0 recv, notif: fault timer(T+77)
 1 net                         54   5 recv, notif: eth-irq(irq61) wake-timer(T+183)
 2 sys                          0   1 recv
 3 spi2_driver                  0   3 recv
 4 i2c_driver                   0   3 recv
 5 spd                          0   2 notif: i2c1-irq(irq31/irq32)
 6 packrat                      0   1 recv
 7 thermal                      0   5 recv, notif: timer(T+618)
 8 power                        0   6 recv, notif: timer(T+339)
 9 hiffy                        0   5 notif: bit31(T+57)
10 gimlet_seq                   0   4 recv
11 gimlet_inspector             0   6 notif: socket
12 hash_driver                  0   2 recv
13 hf                           0   3 recv
14 update_server                0   3 recv
15 sensor                       0   4 recv, notif: timer(T+983)
16 host_sp_comms                0   7 recv, notif: jefe-state-change usart-irq(irq82) multitimer control-plane-agent
17 udpecho                      0   6 notif: socket
18 udpbroadcast                 0   6 notif: bit31(T+416)
19 control_plane_agent          0   6 recv, notif: usart-irq(irq37) socket timer
20 sprot                        0   4 recv
21 validate                     0   5 recv
22 vpd                          0   4 recv
23 user_leds                    0   2 recv, notif: timer
24 dump_agent                   0   6 recv, notif: socket
25 sbrmi                        0   4 recv
26 idle                         0   8 RUNNING

At this point I began bisecting and found that the problem appears to have been introduced in 4cb9eddab7952ffa760762bfb2953b584696712f, and that the change in gimlet-seq task state is correlated with the change in functionality. I have not as yet formulated any hypothesis as to the root cause of the problem.

wesolows commented 8 months ago

Dump attached from this state. I don't see an I2CFault entry in the ringbuf but better to provide the whole thing than guess at what I might overlook. hubris.core.2.gz

rmustacc commented 8 months ago

I briefly took a test of the current tip of hubris master (4a6684576e05040dd3b3846bf81f66bb36b6cf9b) on c71 and confirmed that t6init failed because we never had a t6mfg instance appear, which seems fairly symptomatically in line with what @wesolows observed. Because this service is tied into normal networking coming up on this gimlet for me in the lab, I can confirm that this has never really happened to me while working on it on and off for a while in my memory.

rmustacc commented 8 months ago

I confirmed that I first see this with 4cb9eddab7952ffa760762bfb2953b584696712f applied.

hawkw commented 8 months ago

Dump attached from this state. I don't see an I2CFault entry in the ringbuf but better to provide the whole thing than guess at what I might overlook. hubris.core.2.gz

@wesolows which SHA is this dump running? is it https://github.com/oxidecomputer/hubris/commit/4a6684576e05040dd3b3846bf81f66bb36b6cf9b) or https://github.com/oxidecomputer/hubris/commit/4cb9eddab7952ffa760762bfb2953b584696712f?

hawkw commented 8 months ago

Interesting, there do seem to be some I2C errors in that dump...but they're in the drv_oxide_vpd ringbuf in the host_sp_comms task:

humility: ring buffer drv_i2c_devices::max31790::__RINGBUF in thermal:
humility: ring buffer drv_i2c_devices::sbrmi::__RINGBUF in sbrmi:
humility: ring buffer drv_oxide_vpd::__RINGBUF in gimlet_seq:
humility: ring buffer drv_oxide_vpd::__RINGBUF in host_sp_comms:
 NDX LINE      GEN    COUNT PAYLOAD
   0  147        1       10 Error(ErrorOnNext(HeaderCorrupt { stored_checksum: 0xffffffff, computed_checksum: 0x6b329f69 }))
   1   57        1        1 EepromError(I2cError(NoDevice))
   2  147        1        1 Error(ErrorOnNext(User(I2cError(NoDevice))))

I'm not sure if this is at all relevant, but it stood out to me as the only obvious error in any ringbuf in that dump...

wesolows commented 8 months ago

Dump attached from this state. I don't see an I2CFault entry in the ringbuf but better to provide the whole thing than guess at what I might overlook. hubris.core.2.gz

@wesolows which SHA is this dump running? is it 4a66845) or 4cb9edd?

$ rev=$(humility -d hubris.core.2 extract git-rev 2>/dev/null); printf '%s\n' $rev
4cb9eddab7952ffa760762bfb2953b584696712f

I have a core from 37ed81c8d02fc231bdb7df4c55f459e434196820 (HEAD) also if that helps; it's attached. hubris.core.0.gz

hawkw commented 8 months ago

Hmm, looks like the same I2C errors from drv_oxide_vpd in host_sp_comms are also present in hubris.core.0.

hawkw commented 8 months ago

It looks like i2c_driver has recorded a bunch of errors in the working core that actually aren't present in the broken core.

hubris.core.0:

humility: ring buffer drv_stm32xx_i2c_server::__RINGBUF in i2c_driver:
 NDX LINE      GEN    COUNT PAYLOAD
  27  190        1        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
  28  465        1        1 Error(0x27, BusLocked)
  29  252        1        1 Reset((I2C3, PortIndex(0x0)))
  30  465        1        1 Error(0x27, BusLocked)
  31  252        1        1 Reset((I2C3, PortIndex(0x0)))
  32  465        1        1 Error(0x29, BusLocked)
  33  252        1        1 Reset((I2C3, PortIndex(0x0)))
  34  465        1        1 Error(0x10, BusLocked)
  35  252        1        1 Reset((I2C4, PortIndex(0x0)))
  36  465        1        1 Error(0x24, BusLocked)
  37  252        1        1 Reset((I2C3, PortIndex(0x0)))
  38  465        1        1 Error(0x27, BusLocked)
  39  252        1        1 Reset((I2C3, PortIndex(0x0)))
  40  465        1        1 Error(0x14, BusLocked)
  41  252        1        1 Reset((I2C4, PortIndex(0x0)))
  42  465        1        1 Error(0x20, BusLocked)
  43  252        1        1 Reset((I2C4, PortIndex(0x0)))
  44  465        1        1 Error(0x24, BusLocked)
  45  252        1        1 Reset((I2C3, PortIndex(0x0)))
  46  465        1        1 Error(0x27, BusLocked)
  47  252        1        1 Reset((I2C3, PortIndex(0x0)))
  48  465        1        1 Error(0x20, BusLocked)
  49  252        1        1 Reset((I2C4, PortIndex(0x0)))
  50  465        1        1 Error(0x4a, BusLocked)
  51  252        1        1 Reset((I2C2, PortIndex(0x1)))
  52  262        1        1 ResetMux(0x70)
  53  262        1        1 ResetMux(0x71)
  54  262        1        1 ResetMux(0x72)
  55  190        1        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
  56  465        1        1 Error(0x19, BusLocked)
  57  252        1        1 Reset((I2C3, PortIndex(0x0)))
  58  465        1        1 Error(0x24, BusLocked)
  59  252        1        1 Reset((I2C3, PortIndex(0x0)))
  60  465        1        1 Error(0x27, BusLocked)
  61  252        1        1 Reset((I2C3, PortIndex(0x0)))
  62  465        1        1 Error(0x67, BusLocked)
  63  252        1        1 Reset((I2C4, PortIndex(0x0)))
  64  465        1        1 Error(0x27, BusLocked)
  65  252        1        1 Reset((I2C3, PortIndex(0x0)))
  66  465        1        1 Error(0x29, BusLocked)
  67  252        1        1 Reset((I2C3, PortIndex(0x0)))
  68  465        1        1 Error(0x10, BusLocked)
  69  252        1        1 Reset((I2C4, PortIndex(0x0)))
  70  465        1        1 Error(0x20, BusLocked)
  71  252        1        1 Reset((I2C4, PortIndex(0x0)))
  72  465        1        1 Error(0x20, BusLocked)
  73  252        1        1 Reset((I2C4, PortIndex(0x0)))
  74  465        1        1 Error(0x49, BusLocked)
  75  252        1        1 Reset((I2C2, PortIndex(0x1)))
  76  262        1        1 ResetMux(0x70)
  77  262        1        1 ResetMux(0x71)
  78  262        1        1 ResetMux(0x72)
  79  465        1        1 Error(0x67, BusLocked)
  80  252        1        1 Reset((I2C4, PortIndex(0x0)))
  81  465        1        1 Error(0x24, BusLocked)
  82  252        1        1 Reset((I2C3, PortIndex(0x0)))
  83  465        1        1 Error(0x20, BusLocked)
  84  252        1        1 Reset((I2C4, PortIndex(0x0)))
  85  190        1        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
  86  465        1        1 Error(0x4a, BusLocked)
  87  252        1        1 Reset((I2C2, PortIndex(0x1)))
  88  262        1        1 ResetMux(0x70)
  89  262        1        1 ResetMux(0x71)
  90  262        1        1 ResetMux(0x72)
  91  392        1        1 MuxError(BusLockedMux)
  92  252        1        1 Reset((I2C2, PortIndex(0x1)))
  93  262        1        1 ResetMux(0x70)
  94  262        1        1 ResetMux(0x71)
  95  262        1        1 ResetMux(0x72)
  96  465        1        1 Error(0x1f, BusLocked)
  97  252        1        1 Reset((I2C3, PortIndex(0x0)))
  98  465        1        1 Error(0x24, BusLocked)
  99  252        1        1 Reset((I2C3, PortIndex(0x0)))
 100  190        1        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
 101  465        1        1 Error(0x67, BusLocked)
 102  252        1        1 Reset((I2C4, PortIndex(0x0)))
 103  465        1        1 Error(0x27, BusLocked)
 104  252        1        1 Reset((I2C3, PortIndex(0x0)))
 105  465        1        1 Error(0x20, BusLocked)
 106  252        1        1 Reset((I2C4, PortIndex(0x0)))
 107  465        1        1 Error(0x4a, BusLocked)
 108  252        1        1 Reset((I2C2, PortIndex(0x1)))
 109  262        1        1 ResetMux(0x70)
 110  262        1        1 ResetMux(0x71)
 111  262        1        1 ResetMux(0x72)
 112  190        1        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
 113  465        1        1 Error(0x27, BusLocked)
 114  252        1        1 Reset((I2C3, PortIndex(0x0)))
 115  465        1        1 Error(0x27, BusLocked)
 116  252        1        1 Reset((I2C3, PortIndex(0x0)))
 117  465        1        1 Error(0x48, BusLocked)
 118  252        1        1 Reset((I2C2, PortIndex(0x1)))
 119  262        1        1 ResetMux(0x70)
 120  262        1        1 ResetMux(0x71)
 121  262        1        1 ResetMux(0x72)
 122  190        1        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
 123  465        1        1 Error(0x67, BusLocked)
 124  252        1        1 Reset((I2C4, PortIndex(0x0)))
 125  465        1        1 Error(0x29, BusLocked)
 126  252        1        1 Reset((I2C3, PortIndex(0x0)))
 127  465        1        1 Error(0x20, BusLocked)
 128  252        1        1 Reset((I2C4, PortIndex(0x0)))
 129  465        1        1 Error(0x20, BusLocked)
 130  252        1        1 Reset((I2C4, PortIndex(0x0)))
 131  465        1        1 Error(0x4a, BusLocked)
 132  252        1        1 Reset((I2C2, PortIndex(0x1)))
 133  262        1        1 ResetMux(0x70)
 134  262        1        1 ResetMux(0x71)
 135  262        1        1 ResetMux(0x72)
 136  190        1        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
 137  465        1        1 Error(0x1d, BusLocked)
 138  252        1        1 Reset((I2C4, PortIndex(0x0)))
 139  465        1        1 Error(0x67, BusLocked)
 140  252        1        1 Reset((I2C4, PortIndex(0x0)))
 141  465        1        1 Error(0x24, BusLocked)
 142  252        1        1 Reset((I2C3, PortIndex(0x0)))
 143  465        1        1 Error(0x48, BusLocked)
 144  252        1        1 Reset((I2C2, PortIndex(0x1)))
 145  262        1        1 ResetMux(0x70)
 146  262        1        1 ResetMux(0x71)
 147  262        1        1 ResetMux(0x72)
 148  190        1        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
 149  465        1        1 Error(0x27, BusLocked)
 150  252        1        1 Reset((I2C3, PortIndex(0x0)))
 151  465        1        1 Error(0x29, BusLocked)
 152  252        1        1 Reset((I2C3, PortIndex(0x0)))
 153  465        1        1 Error(0x14, BusLocked)
 154  252        1        1 Reset((I2C4, PortIndex(0x0)))
 155  465        1        1 Error(0x20, BusLocked)
 156  252        1        1 Reset((I2C4, PortIndex(0x0)))
 157  465        1        1 Error(0x29, BusLocked)
 158  252        1        1 Reset((I2C3, PortIndex(0x0)))
 159  465        1        1 Error(0x10, BusLocked)
 160  252        1        1 Reset((I2C4, PortIndex(0x0)))
 161  465        1        1 Error(0x67, BusLocked)
 162  252        1        1 Reset((I2C4, PortIndex(0x0)))
 163  465        1        1 Error(0x27, BusLocked)
 164  252        1        1 Reset((I2C3, PortIndex(0x0)))
 165  465        1        1 Error(0x29, BusLocked)
 166  252        1        1 Reset((I2C3, PortIndex(0x0)))
 167  465        1        1 Error(0x14, BusLocked)
 168  252        1        1 Reset((I2C4, PortIndex(0x0)))
 169  465        1        1 Error(0x10, BusLocked)
 170  252        1        1 Reset((I2C4, PortIndex(0x0)))
 171  465        1        1 Error(0x20, BusLocked)
 172  252        1        1 Reset((I2C4, PortIndex(0x0)))
 173  465        1        1 Error(0x20, BusLocked)
   0  252        2        1 Reset((I2C4, PortIndex(0x0)))
   1  465        2        1 Error(0x48, BusLocked)
   2  252        2        1 Reset((I2C2, PortIndex(0x1)))
   3  262        2        1 ResetMux(0x70)
   4  262        2        1 ResetMux(0x71)
   5  262        2        1 ResetMux(0x72)
   6  190        2        1 MuxUnknownRecover((I2C2, PortIndex(0x1)))
   7  465        2        1 Error(0x4a, BusLocked)
   8  252        2        1 Reset((I2C4, PortIndex(0x0)))
   9  465        2        1 Error(0x49, BusLocked)
  10  252        2        1 Reset((I2C4, PortIndex(0x0)))
  11  465        2        1 Error(0x29, BusLocked)
  12  252        2        1 Reset((I2C3, PortIndex(0x0)))
  13  465        2        1 Error(0x19, BusLocked)
  14  252        2        1 Reset((I2C3, PortIndex(0x0)))
  15  465        2        1 Error(0x20, BusLocked)
  16  252        2        1 Reset((I2C4, PortIndex(0x0)))
  17  465        2        1 Error(0x24, BusLocked)
  18  252        2        1 Reset((I2C3, PortIndex(0x0)))
  19  465        2        1 Error(0x20, BusLocked)
  20  252        2        1 Reset((I2C4, PortIndex(0x0)))
  21  465        2        1 Error(0x48, BusLocked)
  22  252        2        1 Reset((I2C4, PortIndex(0x0)))
  23  465        2        1 Error(0x1f, BusLocked)
  24  252        2        1 Reset((I2C3, PortIndex(0x0)))
  25  465        2        1 Error(0x27, BusLocked)
  26  252        2        1 Reset((I2C3, PortIndex(0x0)))

hubris.core.2:

humility: ring buffer drv_stm32xx_i2c_server::__RINGBUF in i2c_driver:
 NDX LINE      GEN    COUNT PAYLOAD
   0  644        1        4 Wiggles(0x0)

I'm not totally sure what that indicates, especially given that none of the ringbuf entries that gimlet_seq should be recording when it sees an I2C error are there...

wesolows commented 8 months ago

FWIW, a core from the last working SHA; the drv_oxide_vpd ringbuf contains the same errors. I don't know for sure but I would guess it always has. This machine does have valid programmed VPD on the baseboard but the sharkfind and fan VPD do not. While I can program them, this seems like a surprising failure mode given the valid baseboard VPD:

$ humility vpd -l -r
humility: attached via ST-Link V3
ID  C P  MUX ADDR DEVICE        DESCRIPTION               LOCKED
 0  2 F  1:1 0x50 at24csw080    U.2 Sharkfin A VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 1  2 F  1:2 0x50 at24csw080    U.2 Sharkfin B VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 2  2 F  1:3 0x50 at24csw080    U.2 Sharkfin C VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 3  2 F  1:4 0x50 at24csw080    U.2 Sharkfin D VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 4  2 F  2:1 0x50 at24csw080    U.2 Sharkfin E VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 5  2 F  2:2 0x50 at24csw080    U.2 Sharkfin F VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 6  2 F  2:3 0x50 at24csw080    U.2 Sharkfin G VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 7  2 F  2:4 0x50 at24csw080    U.2 Sharkfin H VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 8  2 F  3:1 0x50 at24csw080    U.2 Sharkfin I VPD        unlocked
 |
 +--> <VPD appears to be unprogrammed>

 9  2 F  3:2 0x50 at24csw080    U.2 Sharkfin J VPD        <NotPresent>
 |
 +--> <failed to read at offset 0>

10  2 F  3:4 0x50 at24csw080    Gimlet VPD                unlocked
 |
 +--> [("FRU0",[("BARC",["0XV1:9130000019:000:EVT18220015"]),("MAC0",[[168,64,37,1,1,6,8,0,8]])])]

11  2 B  1:3 0x50 at24csw080    Fan VPD                   unlocked
 |
 +--> <VPD appears to be unprogrammed>

hubris.core.3.gz

hawkw commented 8 months ago

It looks like the gimlet_seq task in hubris.core.2 doesn't seem to be in the "time to die" loop it goes into when it's seen too many failures; it's in idol_runtime::dispatch:

     Running `target/debug/humility -d /home/eliza/Downloads/hubris.core.2 tasks --stack gimlet_seq`
humility: attached to dump
system time = 965080
ID TASK                       GEN PRI STATE
10 gimlet_seq                   0   4 recv
   |
   +--->  0x2403c430 0x0806c92e userlib::sys_recv_stub
          0x2403c640 0x08069962 userlib::sys_recv
          0x2403c640 0x08069962 idol_runtime::dispatch
          0x2403c640 0x08069972 main

It seems like I've broken something that isn't actually due to the change to die permanently on errors rather than being restarted (the core change in https://github.com/oxidecomputer/hubris/commit/4cb9eddab7952ffa760762bfb2953b584696712f)...but it's not yet obvious to me what.

wesolows commented 8 months ago

It looks like i2c_driver has recorded a bunch of errors in the working core that actually aren't present in the broken core.

hubris.core.0:

Both .2 and .0 are broken; .3 is good. Here are the revisions I get from each of them:

$ for f in 0 2 3; do rev=$(humility -d hubris.core.$f extract git-rev 2>/dev/null); printf '%s\n' $rev; done
37ed81c8d02fc231bdb7df4c55f459e434196820
4cb9eddab7952ffa760762bfb2953b584696712f
a6c04477e38f4d108832310eea58baf8fcaf3cc8
hawkw commented 8 months ago

Oh, my bad, I hadn't realized those were both broken. Thanks for clearing that up.

hawkw commented 8 months ago

Yesterday, @wesolows determined that the root cause of this issue was an accidental change from idol_runtime::dispatch_n to idol_runtime::dispatch introduced in resulting in the sequencer task ignoring its timer notifications, and thus not polling the FPGA after it reaches A0, so the T6 never gets powered on. PR #1627 fixes this (changing back to dispatch_n), and I've verified that after applying this patch, the T6 once again comes up as expected.