mddub / urchin-cgm

A graph of your CGM data on a Pebble watch.
MIT License
56 stars, 46 forks

Losing BT connectivity...but maybe not? #22

Open Spazholio opened 8 years ago

Spazholio commented 8 years ago

So my watchface is currently showing that it has no BT connectivity, and hasn't for the last 13 minutes. However, in that time I've received several email and FB notifications, telling me that there is BT connectivity. Are notifications sent differently than how the Urchin communicates with the phone?

Spazholio commented 8 years ago

Incidentally, it's happening every 5 mins. And when it does, while the other notifications come in, my Urchin doesn't update. It's giving me anxiety, yo. =)

mddub commented 8 years ago

Thanks for the report. I suspect this is the same connection issue mentioned in #21. Are you on iPhone or Android?

The "no BT" icon really means "the phone has not responded to a data request in the last N minutes". I've experienced this intermittently as well, but only when showing certain data.
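To make that distinction concrete, here is a minimal sketch of how such an indicator could be derived from the time of the last response rather than from the Bluetooth link itself. The function name, the returned fields, and the 10-minute threshold are all hypothetical and are not taken from Urchin's source.

```javascript
// Hypothetical sketch: the indicator depends only on how long ago the phone
// last answered a data request, not on whether Bluetooth is connected.
// STALE_AFTER_MINUTES is an assumed threshold, chosen for illustration.
const STALE_AFTER_MINUTES = 10;

function connectionStatus(nowMs, lastResponseMs, staleAfterMinutes = STALE_AFTER_MINUTES) {
  // Whole minutes elapsed since the phone last responded.
  const minutesSinceResponse = Math.floor((nowMs - lastResponseMs) / 60000);
  return {
    minutesSinceResponse,
    showNoBtIcon: minutesSinceResponse >= staleAfterMinutes,
  };
}
```

Under this kind of logic, notifications can keep arriving over Bluetooth while the icon is showing, because the icon only says that data requests are going unanswered.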

My best hypothesis is that the JS interpreter in the iOS Pebble app locks up when processing a lot of data (a memory leak?), so it stops responding to requests. Here are some things I've observed:

Here's my first attempt at a fix: TODO: increase max app message size, change map/filter magic to for loops, reduce data cached

Can you give that a try? And if you see a suspicious "no BT" indicator, would you mind dumping logs from the Pebble app and emailing them to me?

@StephenBrown2: does this description line up with the connection trouble you mentioned in #21? @jasoncalabrese: have you ever seen "no BT" while showing OpenAPS status + basals on Android?

Thanks in advance for the help!

mddub commented 8 years ago

Heh, oops, fat-fingered the enter button there. I'm behind the Great Firewall with flaky VPN, so can't verify my fix, nor push to GitHub apparently. I'll be back on Monday and will put up the PR then. Meanwhile would appreciate help diagnosing if you see it again.

jasoncalabrese commented 8 years ago

I don't have many connection issues with Android, but my wife uses iOS and almost always has the no BT icon. She also leaves the phone in 1 spot at home so it goes out of range more than mine. I know the old NS watchfaces had similar issues, but they might try harder to recover.

Spazholio commented 8 years ago

Right after I posted this, I kinda sorta got fed up and deleted everything Pebble from my phone, and re-set it up from scratch. That's improved things quite a bit. Last night I noticed it happened once, but all I did was open the Pebble app, then open the Urchin settings and hit save. That worked well.

But if it happens often, I'll definitely dump logs and get 'em to you.

Pogman commented 8 years ago

Happens for me on Android when the phone is in range but uploader is not updating NS site.

tanja3981 commented 8 years ago

I have the same problem on my Android 6.0.1 phone, too. I think it has only been happening since one of the recent Pebble updates; at least, I hadn't noticed it before.

Is there anything I can provide to find the cause of this?

Spazholio commented 8 years ago

@tanja3981 Have you tried setting it up from scratch? After the last Pebble update, I did that and it fixed those issues right up.

tanja3981 commented 8 years ago

@spazholio Unfortunately, I did, as I got a new phone a week ago. However, the problem persists. :-(

StephenBrown2 commented 8 years ago

@mddub Yes, it sounds like it could be what you hypothesize... a memory leak makes sense, since it would be cleared by relaunching the Pebble app.

mddub commented 8 years ago

For those still experiencing this issue (which includes me 1-2x/week): Thanks for your patience. In #27, I've added a bunch of small fixes and a big improvement in how loading/error states are displayed. Even if it doesn't fix the issue, it will certainly give more feedback on what's going wrong.

Give the latest release a try and let me know if you see an improvement, or what kind of error messages you see on the watch.

@tanja3981: How exactly does the error manifest for you? Which data are you showing in the status bar? Are you showing bolus/basal history in the graph?

tanja3981 commented 8 years ago

@mddub I'm using the 'none' status bar and showing neither bolus nor basal history. Whenever I get the error, I see a struck-through Bluetooth icon. I just downloaded the new version. The next time it occurs, I can take a photo of the error on the watch. Shall I collect anything else, too?

Pogman commented 8 years ago

I only used the latest version briefly today, and it can still lose connection (good to see it trying, though, cheers!). What may be happening is that I go out of range of the phone for a fair while, and after a time Pebble on the phone kills the phone handler part of Urchin. Then, when I finally get back in range, it just goes into a trying-to-connect/timeout loop, but nothing is listening, and the only way to get it going again is to hop in and out of menus on the watch.

tanja3981 commented 8 years ago

After two days of watching, I can confirm @Pogman's theory: the Bluetooth connection is lost when the watch leaves the range of the phone (even for a very short period of time) and does not reconnect without intervention. I did not have any connection problems as long as I stayed within the phone's range.

mddub commented 8 years ago

Thanks for your help with this info, @Pogman and @tanja3981. I just pushed another attempt at a fix: #29.

mddub commented 8 years ago

Please give the latest release a try. If it doesn't work, this would be super helpful:

Would also love to know what the watch is showing in the top-left of the graph when it can't connect. Is it always No BT?

jasoncalabrese commented 8 years ago

Just installed your latest build on my watch and will update our other pebbles soon.

tanja3981 commented 8 years ago

Thanks a lot for your effort! I just installed the new version and will let you know what I experience!

tanja3981 commented 8 years ago

Sent you that support mail as suggested, I hope it helps!

Pogman commented 8 years ago

No BT icon --> Refresh icon --> "timed out" --> loop to no BT with minutes since last connect.

Sent you logs.

jasoncalabrese commented 8 years ago

A huge improvement on my wife's iPhone. Since she leaves the phone in one place while at home, it goes in and out of range a lot, but so far the watchface has had current data every time we've checked.

mddub commented 8 years ago

Thank you so much for these logs @tanja3981 and @Pogman. Super helpful. They both show something interesting which I may need to report to Pebble, since it occurs at a level my app can't fix.

For now, though, there is one last thing I'd like to try. Please install the latest release. To be extra sure you get the latest, rather than your browser's cached version, download directly from here:

https://github.com/mddub/urchin-cgm/raw/d47d82/release/urchin-cgm.pbw

Let me know what you see with this version. If the problem persists and you can send me one more set of logs when it happens, I've got solid data to send to Pebble so they can help debug the issue.

(Details: After Bluetooth successfully reconnects, the watch and phone exchange several messages. One message sent by the watch is to endpoint 49, which is reserved for telling the phone app about "app launch" events. But both your logs show the phone receiving this message and reporting that this endpoint is not registered. Some digging around suggests that this endpoint was deprecated between SDK 2 and 3, but since my last change was to upgrade from SDK 3.7 to 3.14, the watchface shouldn't be sending it. Or if it should, maybe the Pebble Android app shouldn't be dropping it. My own logs show the Pebble iOS app handling it without error. I'm not certain this is the cause of the issue - the problem may be somewhere else, which #31 will clarify.)

mddub commented 8 years ago

@jasoncalabrese that's great to hear! Thanks for the confirmation for iOS.

Pogman commented 8 years ago

Cool and thanks for sharing your findings (I'm new to Pebble and Android and would like to mess with dev on both at some point).

Installed the version here and will keep an eye on it.

Pogman commented 8 years ago

Sent logs from latest version

beached commented 7 years ago

It isn't just Urchin, on the Pebble Classic at least. All the CGM watch faces seem to have BT connectivity issues on the latest Pebble OS. I downgraded a few days back to 2.9.1 and have not had a single issue since. Had to go back a few patches in Urchin, but it does work.

tanja3981 commented 7 years ago

I don't have this problem with other faces. Spark and xDrip face are working fine for me. However, if possible, I'd prefer urchin as it's much nicer.

Thasgolas commented 7 years ago

forgive my newbishness.... How do I get/use the latest urchin-cgm release? Any "easy" way?

Pogman commented 7 years ago

@Thasgolas In the Pebble app on your phone go to settings and enable developer mode. You can then install Urchin via a web browser on your phone using the .pbw file a few posts up in this thread.

mddub commented 7 years ago

Can folks still experiencing this issue share the error message which appears briefly once per minute in the top-left of the graph, and under what circumstances it happens (for example, only when the watch goes out of range of the phone)?

Turns out most of my theories have been wrong (though upgrading to the latest SDK does seem to have fixed a lot of connection issues on iPhone). Based on the response from Pebble, there is an issue with the current firmware, which can bite you when Bluetooth disconnects and reconnects, or if your Bluetooth connection is flaky:

https://forums.pebble.com/t/how-to-recover-from-app-msg-busy-after-bluetooth-reconnects/22948 https://forums.pebble.com/t/app-msg-send-timeout-should-happen-but-never-does/22524

It seems that when there is Bluetooth flakiness, the latest Pebble firmware can cause the AppMessage system to get stuck in an unrecoverable state. For context: the Urchin watchface sends a message to the phone whenever it wants new data, but in this case, those messages can't be sent (the watch gets stuck waiting for a (n)ack from the phone). I'm not surprised to hear the xDrip watchface doesn't have this issue, since the xDrip Android app sends data straight to the watch whenever it has a new reading. (Likewise, pancreabble on an OpenAPS rig should work fine.) It's also consistent with reports that closing and reopening the app fixes it.

I could code around the problem, but any fix I introduce would just be a hack. The problem (which manifests as repeated Begin failed, Code 64) is in the firmware, at a level not exposed to application developers.
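As a rough illustration of that stuck state, here is a toy model of an outbox that allows one in-flight message at a time. It is a sketch under assumptions, not the firmware's actual logic: the ToyOutbox class and its methods are hypothetical. Code 64 appears to correspond to APP_MSG_BUSY (1 << 6) in the Pebble C SDK, which fits the title of the first forum thread above.

```javascript
// Toy model only: one message may be in flight at a time.
const APP_MSG_OK = 0;
const APP_MSG_BUSY = 64; // assumed to match "Code 64" in the logs (1 << 6)

class ToyOutbox {
  constructor() {
    this.pending = false;
  }

  // Returns a result code, loosely mirroring app_message_outbox_begin.
  begin() {
    if (this.pending) return APP_MSG_BUSY;
    this.pending = true;
    return APP_MSG_OK;
  }

  // Would be called on an ack, a nack, or a send timeout.
  settle() {
    this.pending = false;
  }
}

const outbox = new ToyOutbox();
const results = [outbox.begin()]; // first data request goes out

// Bluetooth drops here. No ack or nack arrives and the timeout never fires,
// so settle() is never called and every later attempt reports "busy".
results.push(outbox.begin(), outbox.begin());
```

In this model the only way out is for something to call settle(), which is the part that lives below the level an app can reach.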

One hack would be to add a setTimeout to the JavaScript side to ensure that a minimum frequency of message exchange occurs... but this will use more battery on the phone, won't result in a great user experience (the watch will still show it's unable to send a message), and is fragile because PebbleKit JS can be stopped/restarted at any time (which is why the Pebble-recommended best practice is to drive the updates from the watch).
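For reference, the setTimeout idea could be sketched roughly as follows. This is a hypothetical watchdog, not code from Urchin: createWatchdog, push, and noteRequest are made-up names. In PebbleKit JS, push would presumably wrap Pebble.sendAppMessage and noteRequest would be called from the 'appmessage' listener; both are injected here so the sketch runs outside the Pebble app.

```javascript
// Hypothetical phone-side watchdog: if the watch has not requested data
// within maxQuietMs, push data to it unprompted.
function createWatchdog(push, maxQuietMs, timers) {
  // Default to the real timer functions; tests can inject fakes.
  const t = timers || {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: (handle) => clearTimeout(handle),
  };
  let handle = null;

  // (Re)start the countdown, cancelling any countdown already running.
  function arm() {
    if (handle !== null) t.clear(handle);
    handle = t.set(() => {
      push(); // the watch has been quiet too long: send unprompted
      arm();  // keep pushing at the minimum frequency
    }, maxQuietMs);
  }

  return {
    start: arm,
    noteRequest: arm, // call whenever a request arrives from the watch
    stop() {
      if (handle !== null) t.clear(handle);
      handle = null;
    },
  };
}
```

The fragility is visible in the sketch: the timer lives in the JS runtime, so if PebbleKit JS is stopped, the countdown disappears with it and nothing re-arms it until the runtime starts again.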

(Another idea was to call app_message_open again in the hope that it would reset the outgoing message state, but that didn't work.)

beached commented 7 years ago

Is there a known working version of the FW?

mddub commented 7 years ago

On my iPhone, FW 3.14 is stable as long as the phone stays within range. Seems to be a 33%-ish chance that it will end up in Begin failed, Code 64 state if the phone goes out of range. It's possible that the problem was introduced between 3.13 and 3.14, but I haven't tried to confirm that.

What's your experience been?