nightscout / Trio


Critical Pod Fault 049 (0x31) Error When Resuming Insulin Delivery #244

Closed Sjoerd-Bo3 closed 3 weeks ago

Sjoerd-Bo3 commented 1 month ago

Describe the bug

The user (cj30x) encountered a Critical Pod Fault 049 (0x31) error, indicating an incorrect pod state for command or error during insulin command setup. This fault occurred when the user resumed insulin after suspending it due to low blood sugar and walking activities.

Attach an Issue Report

log_prev.txt log.txt

To Reproduce

Steps to reproduce the behavior:

  1. Suspend insulin delivery on the Trio app.
  2. Engage in physical activity or allow some time to pass.
  3. Resume insulin delivery on the Trio app.
  4. Observe if the Critical Pod Fault 049 (0x31) error occurs.

Expected behavior

The insulin delivery should resume without triggering a pod fault, allowing for continued use without needing to deactivate the pod.

Screenshots

image

Setup Information (please complete the following information):

Smartphone:

Pump:

CGM:

Trio Version:

Technical Details

The issue seems to be caused by the pumpManager code sending a command when the pod cannot accept it. This may indicate that the app had incorrect information about the pod’s suspension state.

Additional context

The user was experiencing frequent lows and had suspended insulin while walking with children at the seaside. The fault occurred when insulin was resumed after the suspension period. The user provided log files for further investigation.

Severity Level

Medium – This issue causes the pod to enter a fault state that requires replacement. This interrupts insulin delivery and uses up extra pods.

itsmojo commented 1 month ago

Summary - there were multiple issues which, taken together, resulted in the 0x31 pod fault:


Below is a detailed analysis of the log_prev.txt data using OmniBLEParser, with added comments describing the pumpManager issues whose misbehaviors eventually resulted in the 0x31 pod fault. Some commands/responses have been omitted for clarity.

// Not shown - There was a 42 minute suspend starting at 18:53:54 followed by 7 resume/suspend commands being
// issued over 21 seconds from 19:35:46 to 19:36:07. User reported that the Resume key didn’t appear to be working
// and so she was pressing it multiple times. This apparent UI freeze / app lockup will hopefully be improved in
// the future by other potential app improvements (e.g., converting much of Loop to use async/await patterns)
// and is not addressed in this pod comms analysis.

// Suspend command
COMMAND: 2024-05-27 19:36:06 Message(1790414e seq:15 [CancelDeliveryCommand(nonce:494e532e, deliveryType:All, beepType:noBeepCancel)])
RESPONSE: 2024-05-27 19:36:07 Message(1790414e seq:00 [StatusResponse(deliveryStatus:Suspended, progressStatus:Normal, timeActive:8h4m, reservoirLevel:50+, insulinDelivered:35.60, bolusNotDelivered:0.00, lastProgrammingMessageSeqNum:15, alerts:No alerts)])

// Resume command that got no response; the missing response was mishandled, which started all the problems
COMMAND: 2024-05-27 19:36:07 Message(1790414e seq:01 [SetInsulinScheduleCommand(nonce:494e532e, basalSchedule(currentSegment: 39, secondsRemaining: 1433, pulsesRemaining: 14, table: BasalDeliveryTable([InsulinTableEntry(segments:8, pulses:11, alternateSegmentPulse:false), InsulinTableEntry(segments:5, pulses:12, alternateSegmentPulse:true), InsulinTableEntry(segments:16, pulses:17, alternateSegmentPulse:false), InsulinTableEntry(segments:4, pulses:17, alternateSegmentPulse:false), InsulinTableEntry(segments:1, pulses:18, alternateSegmentPulse:false), InsulinTableEntry(segments:12, pulses:17, alternateSegmentPulse:true), InsulinTableEntry(segments:2, pulses:12, alternateSegmentPulse:true)]))), OmniBLEParser.BasalScheduleExtraCommand(blockType: OmniBLEParser.MessageBlockType.basalScheduleExtra, acknowledgementBeep: false, completionBeep: false, programReminderInterval: 0.0, currentEntryIndex: 3, remainingPulses: 119.0, delayUntilNextTenthOfPulse: 32.85714, rateEntries: [RateEntry(rate:1.1, duration:4h), RateEntry(rate:1.25, duration:2h30m), RateEntry(rate:1.7, duration:10h), RateEntry(rate:1.75, duration:6h30m), RateEntry(rate:1.25, duration:1h)])])
COMMAND: 2024-05-27 19:36:15 Message(1790414e seq:02 [GetStatusCommand(normal)])
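The schedule fields in the resume command above can be sanity-checked with a little arithmetic. The sketch below assumes half-hour basal segments counted from midnight and a pulse size of 0.05 U. Both assumptions are consistent with the logged values but are not taken from the OmniBLE source, and the function names are illustrative only.

```python
from datetime import time

SEGMENT_SECONDS = 30 * 60   # assumption: 48 half-hour segments per day
PULSE_HUNDREDTHS = 5        # assumption: one pulse is 0.05 U


def segment_position(t: time) -> tuple[int, int]:
    """Return (current segment index, seconds remaining in that segment)."""
    seconds_of_day = t.hour * 3600 + t.minute * 60 + t.second
    segment, elapsed = divmod(seconds_of_day, SEGMENT_SECONDS)
    return segment, SEGMENT_SECONDS - elapsed


def pulses_per_segment(rate_u_per_hr: float) -> tuple[int, bool]:
    """Return (whole pulses per half-hour segment, alternateSegmentPulse)."""
    # Work in hundredths of a unit to avoid floating-point rounding surprises.
    pulses_per_hour = round(rate_u_per_hr * 100) // PULSE_HUNDREDTHS
    # An odd hourly pulse count cannot be split evenly over two segments,
    # so every other segment gets one extra pulse.
    return pulses_per_hour // 2, pulses_per_hour % 2 == 1


print(segment_position(time(19, 36, 7)))  # (39, 1433)
print(pulses_per_segment(1.1))            # (11, False)
print(pulses_per_segment(1.75))           # (17, True)
```

Under these assumptions, a command built at 19:36:07 lands in segment 39 with 1433 seconds remaining, which matches the currentSegment and secondsRemaining fields in the logged command.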

// This response indicates in two different ways that the basal command at 19:36:07 was not seen by the pod
// (the deliveryStatus of Suspended and the lastProgrammingMessageSeqNum of 15 indicating that the last
// command seen by the pod was the suspend command at 19:36:06 & so the resume at 19:36:07 failed)
RESPONSE: 2024-05-27 19:36:16 Message(1790414e seq:03 [StatusResponse(deliveryStatus:Suspended, progressStatus:Normal, timeActive:8h4m, reservoirLevel:50+, insulinDelivered:35.60, bolusNotDelivered:0.00, lastProgrammingMessageSeqNum:15, alerts:No alerts)])

// Issue #1 - from an examination of PodCommSession.setBasalSchedule() it appears in this case the
// current code will leave the podState with an uncertain UnfinalizedDose for a resume instead of using
// the unacknowledgedCommand code used for boluses, temp basals, and cancels to resolve whether
// the command succeeded or not. This causes the podState isSuspended var to incorrectly indicate
// the pod is not suspended when in fact it still is.
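One way to resolve an unacknowledged resume, in the spirit of Issue #1, is to compare the sequence number of the command that got no response with the lastProgrammingMessageSeqNum reported in the next status response. The following is a minimal sketch of that check against the log format shown above. The regular expression and the function names are assumptions for illustration and do not reproduce the OmniBLE/OmniKit implementation.

```python
import re

# Pull the two fields of interest out of a StatusResponse log line.
STATUS_RE = re.compile(
    r"deliveryStatus:(?P<status>\w+).*?lastProgrammingMessageSeqNum:(?P<seq>\d+)"
)


def parse_status(line: str) -> tuple[str, int]:
    """Return (deliveryStatus, lastProgrammingMessageSeqNum) from a log line."""
    match = STATUS_RE.search(line)
    if match is None:
        raise ValueError("not a StatusResponse line")
    return match.group("status"), int(match.group("seq"))


def command_was_received(sent_seq: int, status_line: str) -> bool:
    """True if the pod reports sent_seq as its last programming message."""
    _, last_seq = parse_status(status_line)
    return last_seq == sent_seq


response = (
    "RESPONSE: 2024-05-27 19:36:16 Message(1790414e seq:03 [StatusResponse("
    "deliveryStatus:Suspended, progressStatus:Normal, timeActive:8h4m, "
    "reservoirLevel:50+, insulinDelivered:35.60, bolusNotDelivered:0.00, "
    "lastProgrammingMessageSeqNum:15, alerts:No alerts)])"
)

# The resume went out as seq:01, but the pod still reports seq 15 (the suspend).
print(parse_status(response))             # ('Suspended', 15)
print(command_was_received(1, response))  # False
```

With a check like this, the app could conclude that the resume never reached the pod and keep treating the pod as suspended instead of leaving the dose uncertain.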

// The user did a 2.65U correction bolus with the rising BG’s after a long suspend.
// The enactBolus should fail to issue the bolus with the suspended pod,
// but because of issue #1, the podState incorrectly indicates that the pod
// is not suspended and so this bolus is allowed to run.
COMMAND: 2024-05-27 19:36:39 Message(1790414e seq:04 [SetInsulinScheduleCommand(nonce:494e532e, bolus(units: 2.65, timeBetweenPulses: 2.0, table: OmniBLEParser.BolusDeliveryTable(entries: [InsulinTableEntry(segments:1, pulses:53, alternateSegmentPulse:false)]))), BolusExtraCommand(units:2.65, timeBetweenPulses:2.0, extendedUnits:0.0, extendedDuration:0.0, acknowledgementBeep:false, completionBeep:false, programReminderInterval:0.0)])

// In this response, the deliveryStatus of “Priming” indicates an active bolus with no basal
// (a suspended pod) which should only happen during priming, but it isn't fatal in itself.
RESPONSE: 2024-05-27 19:36:39 Message(1790414e seq:05 [StatusResponse(deliveryStatus:Priming, progressStatus:Normal, timeActive:8h5m, reservoirLevel:50+, insulinDelivered:35.60, bolusNotDelivered:2.65, lastProgrammingMessageSeqNum:4, alerts:No alerts)])

// Additional verification that the pod is bolusing with no basal which should normally happen only during priming
COMMAND: 2024-05-27 19:36:40 Message(1790414e seq:06 [GetStatusCommand(normal)])
RESPONSE: 2024-05-27 19:36:40 Message(1790414e seq:07 [StatusResponse(deliveryStatus:Priming, progressStatus:Normal, timeActive:8h5m, reservoirLevel:50+, insulinDelivered:35.60, bolusNotDelivered:2.65, lastProgrammingMessageSeqNum:4, alerts:No alerts)])

// With 2.4U left of the 2.65U bolus, another 0.5U bolus is attempted.
// The 0.5U bolus with a pod that is reporting it is currently bolusing and suspended
// should have been prevented by 3 different tests that all didn’t handle this condition correctly.
// Since the 0.5U bolus is allowed with the 2.65U bolus still in progress, a 0x31 pod fault results.
// Issue #2 - lastDeliveryStatusReceived variable should be used for additional verification for insulin delivery situations
// Issue #3 - DeliveryStatus.suspended isn’t true for a suspended pod with an on-going bolus
// Issue #4 - DeliveryStatus.bolusing isn’t true for an on-going bolus on a suspended pod
// Additionally DeliveryStatus doesn’t handle all values that could possibly be returned by a pod
COMMAND: 2024-05-27 19:36:50 Message(1790414e seq:10 [SetInsulinScheduleCommand(nonce:494e532e, bolus(units: 0.5, timeBetweenPulses: 2.0, table: OmniBLEParser.BolusDeliveryTable(entries: [InsulinTableEntry(segments:1, pulses:10, alternateSegmentPulse:false)]))), BolusExtraCommand(units:0.5, timeBetweenPulses:2.0, extendedUnits:0.0, extendedDuration:0.0, acknowledgementBeep:false, completionBeep:false, programReminderInterval:0.0)])
RESPONSE: 2024-05-27 19:36:51 Message(1790414e seq:11 [PodInfoResponse(podInfoResponse, detailedStatus, ## DetailedStatus
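The bolus figures quoted in the comments above can be cross-checked from the command fields. This sketch assumes a pulse size of 0.05 U (consistent with 2.65 U = 53 pulses and 0.5 U = 10 pulses in the log) and that one pulse is delivered every timeBetweenPulses seconds counted from the command time. Actual pod timing may differ slightly, and the helper names are illustrative.

```python
from datetime import datetime

PULSE_HUNDREDTHS = 5  # assumption: one pulse is 0.05 U


def bolus_pulses(units: float) -> int:
    """Number of pulses needed for a bolus of the given size."""
    return round(units * 100) // PULSE_HUNDREDTHS


def remaining_units(units: float, seconds_between_pulses: float,
                    started: datetime, now: datetime) -> float:
    """Estimate how much of a bolus is still undelivered at time `now`."""
    total = bolus_pulses(units)
    elapsed = (now - started).total_seconds()
    delivered = min(total, int(elapsed // seconds_between_pulses))
    return (total - delivered) * PULSE_HUNDREDTHS / 100


first_bolus = datetime(2024, 5, 27, 19, 36, 39)
second_bolus = datetime(2024, 5, 27, 19, 36, 50)

print(bolus_pulses(2.65))                                     # 53
print(bolus_pulses(0.5))                                      # 10
print(remaining_units(2.65, 2.0, first_bolus, second_bolus))  # 2.4
```

Eleven seconds after the first bolus command, about five pulses (0.25 U) would have been delivered, which agrees with the 2.4U remaining when the second bolus was attempted.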

As it turns out, if any one of the 4 issues had been handled correctly, there would have been no 0x31 pod fault in this case. Issues #3 & #4 are ultimately what resulted in the 0x31 pod fault when two boluses were attempted after a failed resume command that was mishandled. The changes for issues #3 & #4 are very obvious and extremely safe, the change for issue #2 is relatively trivial and very low risk (it’s just an additional verification step that can handle inconsistent delivery states), and the fix for issue #1 corrects the underlying problem that started this chain of failures. I already have code written to fix all of these issues, but I need to do some testing before submitting the PR. These issues exist for both OmniBLE and OmniKit in all current implementations. Issues #3 & #4 have been there since the very beginning, and issues #1 & #2 have probably existed since at least Loop 3.x. I don't know how all these deficiencies have slipped through the cracks for so long.
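Issues #2 through #4 amount to treating "suspended" and "bolusing" as independent questions about the pod's delivery status, and checking the pod's last reported status before starting a bolus. A minimal sketch of that idea follows, assuming the status can be modelled as separate basal and bolus flags. The `Delivery` flag names and values and the `can_start_bolus` helper are hypothetical and are not the actual DeliveryStatus definition in OmniBLE or OmniKit.

```python
from enum import Flag


class Delivery(Flag):
    """Hypothetical model: each kind of delivery is an independent flag."""
    NONE = 0
    BASAL = 1
    TEMP_BASAL = 2
    BOLUS = 4


def is_suspended(status: Delivery) -> bool:
    """Suspended means no basal of any kind, even if a bolus is running."""
    return not (status & (Delivery.BASAL | Delivery.TEMP_BASAL))


def is_bolusing(status: Delivery) -> bool:
    """Bolusing is true whether or not basal delivery is active."""
    return bool(status & Delivery.BOLUS)


def can_start_bolus(app_thinks_suspended: bool, last_status: Delivery) -> bool:
    """Refuse a bolus if the app state or the pod's last status says no."""
    return not (
        app_thinks_suspended
        or is_suspended(last_status)
        or is_bolusing(last_status)
    )


# The state the log reports as "Priming": a bolus running with no basal.
bolus_without_basal = Delivery.BOLUS

print(is_suspended(bolus_without_basal))            # True
print(is_bolusing(bolus_without_basal))             # True
print(can_start_bolus(False, bolus_without_basal))  # False
print(can_start_bolus(False, Delivery.BASAL))       # True
```

In this model the second bolus would be rejected even though the app's own state wrongly says the pod is running, because the pod's last reported status disagrees.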

marionbarker commented 4 weeks ago

Overview

This issue needs to be addressed with three PRs:

Parsed Information

I updated my Omnipod parser to work with the log and log_prev files from Trio as well as Loop. The csv file for the case that eventually led to the 0x31 fault and prompted the modifications proposed by OmniBLE PR 123 and OmniKit PR 35 is attached below.

This is the second pod in the log_prev.txt file. The csv file:

Details

This analysis matches what @itsmojo reported above, just in a slightly different format.

The first Basal Suspended returned by the pod is at 2024-05-27 18:53:55

There is a known bug in Trio where additional button presses get accepted even if the UI appears to freeze. (That issue is specific to Trio.)

The attempt to resume delivery at 2024-05-27 19:36:07 failed - no 0x1d message was received and the next 0x1d indicated the Basal was Suspended (the .priming state which normally never happens after pod setup completes). Because there was a missing unacknowledged command handler for this and because some of the logic to handle this unexpected pod state (no basal schedule loaded) was missing, the result is as @itsmojo said above.
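A failed command like the 19:36:07 resume shows up in a parsed log as a COMMAND entry followed directly by another COMMAND. A sketch of scanning for that pattern follows, assuming one entry per line with the COMMAND:/RESPONSE: prefixes seen in the log excerpts above. The function name is illustrative, and the sample lines are abbreviated from that log.

```python
def unanswered_commands(lines):
    """Return COMMAND lines that were followed by another COMMAND
    before any RESPONSE arrived."""
    pending = None
    unanswered = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("COMMAND:"):
            if pending is not None:
                unanswered.append(pending)
            pending = line
        elif line.startswith("RESPONSE:"):
            pending = None
    # A command still pending at the end of the log is not reported,
    # since its response may simply fall outside the excerpt.
    return unanswered


log = [
    "COMMAND: 2024-05-27 19:36:06 Message(1790414e seq:15 [CancelDeliveryCommand(...)])",
    "RESPONSE: 2024-05-27 19:36:07 Message(1790414e seq:00 [StatusResponse(...)])",
    "COMMAND: 2024-05-27 19:36:07 Message(1790414e seq:01 [SetInsulinScheduleCommand(...)])",
    "COMMAND: 2024-05-27 19:36:15 Message(1790414e seq:02 [GetStatusCommand(normal)])",
    "RESPONSE: 2024-05-27 19:36:16 Message(1790414e seq:03 [StatusResponse(...)])",
]

for line in unanswered_commands(log):
    print(line[:28])  # COMMAND: 2024-05-27 19:36:07
```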

No basal program was loaded, but Trio thought the pod was no longer suspended. (Loop would have made the same error):

bjornoleh commented 4 weeks ago

Please create a separate issue for the UI issue that allows multiple commands to be sent when the app hangs.

I would assume the same strategy as was used in https://github.com/nightscout/Trio/pull/248 could be used to address that issue. Perhaps there are other places where this is needed too?