betrusted-io / xous-core

The Xous microkernel
Apache License 2.0

system instability with network traffic during suspend/resume operation #467

Open bunnie opened 9 months ago

bunnie commented 9 months ago

Setup:

Result:

There are likely two issues at play here, but more investigation is needed.

  1. The lack of EC response is probably related to a COM packet being interrupted by suspend/resume. This shouldn't be possible, because COM should request S/R deferral until an atomic operation is finished, but we're likely missing a fence around some un-suspendable operation (maybe a split tx/rx pair).
  2. The RTC failure is probably because a write was issued to update the RTC with a new time offset, but a suspend call then interrupted it. This would cause spurious data to be written to the RTC as the system powers down while the RTC is still unlocked for writing.
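
To make the "missing fence" idea in item 1 concrete, here is a minimal sketch of a deferral guard that spans both halves of a split tx/rx pair. It is not the actual Xous suspend/resume API: `SuspendGate`, `DeferGuard`, and `com_exchange` are hypothetical names, and the tx/rx work is a stand-in. It only illustrates holding one guard across the whole exchange instead of fencing each half separately.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical suspend gate: counts in-flight operations that must not be
/// interrupted by a suspend. Not the real Xous susres interface.
#[derive(Default)]
pub struct SuspendGate {
    in_flight: AtomicUsize,
}

impl SuspendGate {
    pub fn new() -> Arc<Self> {
        Arc::new(Self::default())
    }

    /// Start an un-suspendable operation. Suspend stays deferred until the
    /// returned guard is dropped.
    pub fn defer(self: &Arc<Self>) -> DeferGuard {
        self.in_flight.fetch_add(1, Ordering::SeqCst);
        DeferGuard { gate: Arc::clone(self) }
    }

    /// The suspend path would check this before powering down. A real
    /// implementation would also have to refuse new deferrals once the
    /// suspend has started; that part is omitted here.
    pub fn can_suspend(&self) -> bool {
        self.in_flight.load(Ordering::SeqCst) == 0
    }
}

/// RAII guard: releases the deferral when it goes out of scope.
pub struct DeferGuard {
    gate: Arc<SuspendGate>,
}

impl Drop for DeferGuard {
    fn drop(&mut self) {
        self.gate.in_flight.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Hypothetical split tx/rx COM exchange. One guard covers both halves, so a
/// suspend request arriving between tx and rx is refused. Returns the reply
/// and whether suspend would have been allowed mid-exchange.
pub fn com_exchange(gate: &Arc<SuspendGate>, tx: u16) -> (u16, bool) {
    let _guard = gate.defer();
    let sent = tx; // stand-in for the tx half
    let suspend_allowed_mid_exchange = gate.can_suspend();
    let rx = sent.wrapping_add(1); // stand-in for the rx half
    (rx, suspend_allowed_mid_exchange)
}
```

If the tx and rx halves each took and released their own guard, the counter would drop to zero between them, which is the window suspected above.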

This bug was found during release testing for 0.9.15, but the release will not be delayed on account of it.

bunnie commented 9 months ago

There also seem to be possible conflicts if the codec and RTC try to use the I2C bus together. There is a lock on the I2C bus, but it might not be sufficiently broad: it only covers the smallest transaction, and perhaps the bus should be locked for the entire duration of a logical operation (e.g., codec init should hold the bus until its whole sequence is done).
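
A rough sketch of the broader locking idea, assuming nothing about the real Xous I2C server: `I2cBus`, `codec_init`, `rtc_set_offset`, the device addresses, and the register values below are all placeholders. The point is only that each logical operation takes the lock once and holds it for its whole sequence, rather than locking per transaction.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical bus model that records (device, register, value) writes in order.
#[derive(Default)]
pub struct I2cBus {
    pub log: Vec<(u8, u8, u8)>,
}

impl I2cBus {
    pub fn write(&mut self, dev: u8, reg: u8, val: u8) {
        self.log.push((dev, reg, val));
    }
}

// Placeholder device addresses, for illustration only.
pub const CODEC: u8 = 0x01;
pub const RTC: u8 = 0x02;

/// Codec init as one logical operation: the lock is taken once and held
/// until the whole register sequence has been written.
pub fn codec_init(bus: &Mutex<I2cBus>) {
    let mut guard = bus.lock().unwrap();
    let sequence: [(u8, u8); 3] = [(0x00, 0x01), (0x01, 0x80), (0x02, 0x3f)];
    for (reg, val) in sequence {
        guard.write(CODEC, reg, val);
    }
}

/// RTC offset update, also held under a single lock for both bytes.
pub fn rtc_set_offset(bus: &Mutex<I2cBus>, offset: [u8; 2]) {
    let mut guard = bus.lock().unwrap();
    guard.write(RTC, 0x10, offset[0]);
    guard.write(RTC, 0x11, offset[1]);
}

/// Run both operations from separate threads and return the write order.
pub fn run_concurrently() -> Vec<(u8, u8, u8)> {
    let bus = Arc::new(Mutex::new(I2cBus::default()));
    let b1 = Arc::clone(&bus);
    let b2 = Arc::clone(&bus);
    let t1 = thread::spawn(move || codec_init(&b1));
    let t2 = thread::spawn(move || rtc_set_offset(&b2, [0x12, 0x34]));
    t1.join().unwrap();
    t2.join().unwrap();
    let log = bus.lock().unwrap().log.clone();
    log
}
```

Whichever thread wins the lock, the codec writes stay contiguous in the log. With a per-transaction lock, the RTC writes could land in the middle of the codec sequence.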