Tympan / Tympan_Library

Arduino/Teensy Library for Tympan Open Source Hearing Aid
MIT License
116 stars 31 forks source link

BLE too slow for transferring large App GUIs #53

Open chipaudette opened 2 years ago

chipaudette commented 2 years ago

Having switched to BLE, larger App GUIs take a long time to transfer.

The primary user issue is that the App says that it is connected (yay!) but the BLE transfer of the GUI might take another 20+ seconds. During that time, the user sees a blank GUI and has been provided no information to let them know...'just wait, it's coming!'. For a satisfying user experience, we should speed this up.

I propose two solutions:

1) Alter the Tympan Remote App to provide feedback to the user that the GUI is still being transferred. This is with @kxm-creare who is tracking it seperately. 2) Explore methods of speeding up the BLE transfer. This will be with @chipaudette and @ijenkins-creare

This issue will track our work to speed the BLE transfer.

chipaudette commented 2 years ago

To establish a baseline, let's use the App GUI from 05-FullSystems/WDRC_8BandFIR_full. As of last week, the GUI JSON string is 5346 characters. It took just about exactly 20 seconds to transfer. Ugh.

As mentioned earlier, one character over an 8-N-1 serial link actually consumes 9 bits, so the total transfer is 48,114 bits. By taking 20 seconds, the average data rate is 2405 bps, which is only 25% of what I would have hoped, based on the 9600bps max speed to the BT module.

This suggests that, besides just raw baudrate to the BT module, that there is some latency or inefficiency that could be removed to improve the throughput.

chipaudette commented 2 years ago

Possible Routes to improving speed:

1) Increase the baud rate of the serial link from the processor to the BC127. It defaults to only 9600 bps. 2) Reduce the characters needed for the GUI, but shortening the keywords used over-and-over to define the different elements of the GUI 3) Looking for latencies or timeouts or other inefficiencies in the communication between the processor and BC127 4) Changing parameters of the BLE protocol (MTU?)

chipaudette commented 2 years ago

I looked at solution approach (2) by reducing the characters used for the most common keywords.

In the big JSON string, we write the same keywords over and over again: ‘name’, 'id', ‘cmd’, ‘width’, ’label’, and 'buttons'. While that makes this very human readable, we could alter the app to also allow for shorter versions like ‘n’, 'i', ’c’, ’w’, ’l’, and 'b'.

On a big GUI example, I grabbed the JSON string and did a search and replace using the single-character substitutions above. The JSON character count dropped from 5358 to 4219, a reduction of 21%.

While nice, that doesn't seem big enough all by itself. So, let's skip solution approach (2) for now.

chipaudette commented 2 years ago

Pursuing solution approach (1) [faster baud rate], @ijenkins-creare sent me this message:

I pushed a new branch feature/baudrate. This includes a new example ChangeBC127Baudrate that illustrates how to change the baudrate on the bc127. Some complicating factors that I ran into:

Chip, I confirmed that your WDRC_8BandFIR_full example worked on my RevE v.7 firmware device. I also confirmed the 18-20 seconds it takes to transfer that terrible JSON string.

With the techniques described above, I updated the baudrate to 115200 and re-ran your WDRC_8BandFIR_full example and got a consistent 7-8 seconds for transfer time. 2.5X seems reasonable, no?

chipaudette commented 2 years ago

@ijenkins-creare , I agree that this seems worth it. Let me give it a try and then let me ponder it a bit.

chipaudette commented 2 years ago

@ijenkins-creare , based on your second bullet above ("The general process is...") it sounds like we cannot simply build in the baud rate shift into the BLE.begin()?? The issue is that, as you describe, it sometimes doesn't work?

Does adding a could delay(500) maybe make the "sometimes doesn't work" go away?

chipaudette commented 2 years ago

I tested the baud rate change. Being inexperienced, there were a lot of shenanigans. I did experience the excellent speedup in transfer, which is great. I then had a lot of trouble shifting the baud rate back.

Part of my problem is that, to do the factory reset, you need to have the right baud rate for whatever is the current speed setting. So, if you expect a speed that it is not operating at, you can't reset it. This means that you can't force a complete reset. This is bad. This means that remote debugging will be very challenging.

Can we do a factory reset from the GPIO pins that we're attached to?

chipaudette commented 2 years ago

@ijenkins-creare , I moved your example sketch to the Bluetooth examples.

Also, I found that I did not need to change the USB serial speed. On my windows machine, I was able to arbitrarily change the BT serial speed without having to change the USB serial speed. This greatly simplifies operation. This didn't work on your machine? Windows or Linux?

Also also, in your code, it looks like you never set the pointers USB_Serial or BT_Serial. As a result, the system bombed (repeatedly) whenever it'd get to the loop() function and tried to dereference these pointers. I replaced these two pointers simply with Serial and Serial1. Now it avoids the endless cycle of rebooting.

chipaudette commented 2 years ago

@ijenkins-creare To better handle the case of not-knowing the starting BT speed, I modified the logic to try both speeds. It starts at 9600 and, by using BLE.status(), I can tell if we're using the correct baud rate.

The check is simple: if status() returns an intelligible reply, we must be using the right baud rate. If there is no intelligible reply, we must be using the wrong baud rate, so I go ahead and automatically try the other baud rate.

Having it automatically sniff the starting baudrate in this way makes me think that we can do dynamic changing of the baudrate....that we could bake this into the BLE.begin() routine (or something) if we wanted to.

The new logic might look something like:

1) User calls BLE.begin() (or BLE.setupBLE(), which calls BLE.begin()) 2) ble.begin() uses the sniffing method in the example to find the current baud rate 3) Knowing the correct baudrate, we can now call for the factory-settings restore (like we do know) 4) After the factory restore, we increasethe baud rate of the BC127 from 9600 to 115200, per your example 5) We close and open Serial1 at 115200, like in the example 6) We continue adjusting the other BC127 settings, like we do in the existing begin() and setupBLE() methods

No where do we change the speed of the USB serial ("Serial"), which is probably key.

Thoughts?

chipaudette commented 2 years ago

@ijenkins-creare , I implemented the plan described above. I pushed it to the feature branch.

Now, you shouldn't have to tell it what the current baudrate is...it'll find it. Then, it'll change it to the fast rate. I tested it on my end and it seems to work fine. I tested it with the stock (unmodified) version of 05-FullSystems/WDRC_8BandFIR_full.ino.

Can you test on your machine for RevE and RevD?

It's this second bullet that revealed a lot of issues for me. I think that I took care of them.

If this code ends up not working, I'm hoping we can see where it breaks via the debug println() in the Serial Monitor.

But, if this code DOES work for you, I'll remove all the extra debug println() and then we can talk about the implications of moving it to Develop and then on to Main. I can also add it to Haley's pile for her more-extensive testing.

ijenkins-creare commented 2 years ago

Can confirm. RevE (v7) and RevD (v5) worked... with some slight caveats:

chipaudette commented 2 years ago

@ijenkins-creare , to the ChangeBC127Baudrate example, I added an attempt to force a restore of the module using the hardware pins. If this works, we won't have to use trial-and-error to find current baudrate. Instead, we just force it to restore via the hardware pins and then from the known condition of 9600.

I made the change to the example. Can you try it on your hardware? The test would be:

What do you find on RevD and RevE? (When testing RevD, don't forget to change the pin numbers as well as the firmware Rev)

ijenkins-creare commented 2 years ago

I cannot, for the life of me, get the new sketch to work on my Rev E or Rev D.

On RevE, my device was already at 115200. With my serial monitor open, I see the initial checks for 9600, the failure, the switch to 115200, and then when it gets to the HW reset... I never see any BC127 output for the reset. And afterwards, I cannot interact with the device... no "STATUS" or anything.

Similar story with RevD.

I'll give it another go tomorrow. Hoping for better luck. Maybe it was just a bad bluetooth day.

chipaudette commented 2 years ago

At he end of the day, I pushed a very re-worked version. I think that I got the hardware reset to work on both E and D. So, I dumped the sniff method and went for the hardware reset.

Besides the underlying library files, I updated:

I'm about to release to Haley, but as you had BT issues yesterday, your hardware is an even better test. Do the new sketches work?!? I can't wait to hear!

chipaudette commented 2 years ago

@hgeithner-creare , as you can see, there's been a lot of work to try to speed the transfer of large GUIs. The only thing that has been effective is changing the speed of the communication between the Teensy processor and the Bluetooth module (the so-called "baud rate").

There is a history of this being both hard to change (inconsistent whether the change actually takes hold) and that it persists across reboots. Both of these facts mean that it has traditionally been something that is really hard to debug if something goes wrong. As a result, I've been very nervous about rolling it out to our users.

Now, though, I think that I've gotten a routine that forces the BC127 to always reset itself to the factory baud rate through hardware features that it has (raising and lower voltages on certain pins so that the hardware resets without the need for any software commands from the Teensy). As a result, I think that I can always get it back to a known-good state from where we can command the higher baud rate. It seems to work fine on both my RevD and RevE.

This is feature/baudrate. Please pick an example from the AppTutorial and then also test FullSystems/WDRC_8BandFIR_full. Please test on both RevD and RevE (though RevE is much more important, if you have limited time). Please test while watching the SerialMonitor. Things to test:

1) Does is compile and upload 2) Watching the SerialMonitor, do you see it say that it is doing a hardware reset and then changing to 115200 and then are you seeing he BC127 version info ("blah blah blah...Melody V7.3...blah blah blah" 3) Can you connect from your phone

Then, most important:

4) When the sketch works, cycle the tympan power, and test the same sketch again!

It's this second test that is most critical because the Tympan will already be in 115200 mode (as opposed to the traditional 9600 mode) and the hardware reset is critical to getting the system to work! It's this second test that is likely to break things!

Chip

@eyuan-creare , you might want to read this comment to Haley. It's an intro to the change that is likely about to happen to the library. It should only affect BLE / App-enabled examples. It should not affect any of the Basic sketches. If the BT speed changes were to fail, the audio processing should all continue to work fine...only the App interaction will not work.

ijenkins-creare commented 2 years ago

@chipaudette , I can confirm the latest ChangeBC127Baudrate works great on both RevE(v7) and RevD(v5). I had no issue with either device switching and interacting at the new rate(s).

As for the WDRC_8BandFIR_full, the RevE worked as expected. The RevD worked with one caveat: When I reset the Tympan after a successful connection/interaction with the App, the App kept the device in the paired list (for some amount of re-scans); however, would no longer connect. After maybe 2 or 3 "Scans", the device was removed from the Paired list; however, I couldn't get the device to appear in the list after scanning... I'm attaching a screenshot below. The log shows the app scanning and finding the Tympan, but it never appears in the Paired list and you can't connect to it. I needed to close the app and restart it to Scan and connect again. Maybe this is just me? Maybe @kxm-creare has an idea? This is the latest version, 1.3.0, of the app.

Screenshot_20210827-202842_Tympan Remote

After the dance, the app and Tympan worked as expected with the desired speed increase.

kxm-creare commented 2 years ago

I don't see this behavior when I try, although I'm not quite clear on exactly the steps you took. However, it sounds like the view is not representing the state of the model; I can throw a digest command in there to force it to update the view and see if that helps at all. Let me try to release a 1.3.1.

chipaudette commented 2 years ago

I just moved the faster baud rate stuff to Develop. Now, any example sketch should test the Tympan's ability to hardware reset its BC127 and to increase the baudrate up to 115200. It'll do this on every sketch that invokes BLE.

Every sketch should print a bit of debug to the SerialMonitor to ensure that this happens OK. It should also print the current BC127 firmware info to the Serial Monitor (actually, I think that the Main branch should do this version reporting, too).

hgeithner-creare commented 2 years ago

Hardware: RevE OS: iOS App: v1.3.1 Branch: develop

Hardware: RevE OS: Android App: v1.3.1 Branch: develop

chipaudette commented 2 years ago

@kxm-creare , what do you make of Haley's latest report...that she was fine (enough) on Android but that it failed on iOS?

I talked with Haley and she said that she watched the Tympan's serial monitor and saw that it was doing things like transmitting the JSON string and the updated button values. So, it seems like the iOS app just isn't happy with us.

Can you try one of the examples (develop branch) that Haley said didn't work? Does it work for you? If it works for you, but not for Haley, can you work with Haley to figure it out?

(I fear that it only fails because we've increased the baudrate between the Teensy and the BC127. I don't see how the iOS device would see any influence of this, but that's my fear...that the faster rate induces the BC127 to change how it communicates with the phone and that the iOS device is unhappy about that.)

chipaudette commented 2 years ago

I just got a whole lot of not connecting / can't see it in the rescan nastiness. Like my old bad BT days. Very frustrating. I tried two different Tympan RevE and get different experiences of sadness. This is with V1.3.1, Tympan RevE, confirmed V7.3 HD firmware.

I've unpaired and re-paired. Sometimes it shows up in the Tympan app, but then fails to connect. The Tympan serial monitor doesn't even show that a connection has opened (to say nothing of the JSON). So frustrating! Grr!!

I'm rebooting the phone now...

chipaudette commented 2 years ago

Now, it will show up in the scanned list, but when I go to connect to it, the Tympan serial monitor shows a OPEN followed within 1-2 seconds by a CLOSE.

BLE: lookForOpenOrClosedMsg: received OPEN_OK for BLE Link 14
BLE: lookForOpenOrClosedMsg: received CLOSE_OK for BLE Link 14

The colors of the BT's LEDs are consistent with the open/close behavior described by the sequence of the OPEN and CLOSE messages.

This was all with BasicGain_wApp. I tried again with TrebleBoost_wApp and get the same behavior. :(

I tried with the unit unplugged from USB. Same result. :(

I tried again with the unit commanded to stay at 9600 baud instead of jumping up to 115200. Same result. :(

And, finally, I reverted back to the Main (not Develop) branch so that It was for-sure not changing baud rate. Same result. :(

kxm-creare commented 2 years ago

@wea @hgeithner-creare I did some testing this morning with my Rev D Tympan. Here are my results:

Hardware: RevD OS: iOS App: v1.3.1 Branch: develop (5-Sep-2021, 11:36 https://github.com/Tympan/Tympan_Library/commit/07a2d71aa6e235283b2bae5ca0b58364aca6a424)

@hgeithner-creare can I borrow your RevE today so I can do more troubleshooting?

hgeithner-creare commented 2 years ago

@hgeithner-creare can I borrow your RevE today so I can do more troubleshooting?

Yes, it's on my desk.

kxm-creare commented 2 years ago

I did some testing with a RevE today. I don't have trouble with Android, but I do have trouble with iOS. The iOS version of the app seems to think some of the packets are arriving out of order. I've never seen this before, and it isn't happening on the RevD. @ijenkins-creare is there anything that changed that may be making this happen from the Tympan side? Maybe it is the v7 firmware? I'll do some digging on my end, but any insights from the device hardware side would be helpful.

ijenkins-creare commented 2 years ago

What changed is the baudrate change? Maybe it's sending, but I'm not sure that actually affects what goes out the BLE wire?

hgeithner-creare commented 2 years ago

Hardware: RevE OS: iOS App: v1.3.1 Branch: develop

Hardware: RevE OS: iOS App: v1.3.1 Branch: main

TrebleBoost_wComp_wApp: Starting setup()... BLE: begin: restore() returned error -1 BLE: begin: set BT_STATE_CONFIG returned error -1 BLE: begin: writeConfig() returned error -1 BLE: begin: reset() returned error -1 BLE: setupBLE: ble did not begin correctly. error = -1 : -1 = TIMEOUT ERROR : 0 = GENERIC MODULE ERROR

I could not connect to the Tympan at all even though it said BLE: updateAvertising: activating BLE advertising in the serial monitor. It seems the baud rate could be causing problems for iOS.

chipaudette commented 2 years ago

@hgeithner-creare, Thanks for the testing!

Based on your error messages when you ran 57600, it looks like it the 57600 didn't really function correctly within the Tympan...nothing to do with the phone or the App.

For the trial at 57600, there are many ways of trying it, most of which won't work. The only way that would work would be to 1) switch to Develop, 2) go into the source code for the Tympan Library and open BLE.cpp, 3) near the top of the file, change faster_baudrate to 57600, 4) recompile TrebleBoost_wComp_wApp and run.

If you tried some other method, like using ChangeBaudrate, that won't work.

*** Begin unnecessary digression ChangeBaudrate is really only useful for getting a Tympan back into a state that works with the Main branch after you've been using the Develop branch. I mean, ChangeBaudrate will change the baudrate, but any code in Main only knows how to work in 9600 so changing away from 9600 won't make a working system. Furthermore, if you were to try an code in Develop, anything in Develop will blow away whatever ChangeBaudrate does and it will set its own speed.

Or, to say this all again:

So, if you want to try a speed other than 9600, you have to use Develop.

A side effect of Develop, however, is that whatever baudrate you use will end up being retained by the device. It's this memory of the last baud rate that prevents you from going back to Main. If you were to leave it in any state other than 9600, any example from Main will fail. That's why you need to switch the hardware back to 9600 before switching back to Main. *** End unnecessary digression

If you're willing to retry the 57600 case using the method in the numbered bullets above, that would be appreciated. For your own sake, don't forget to change BLE.cpp faster_baudrate back to 115200 when you're done!

hgeithner-creare commented 2 years ago

@chipaudette I retested using your method above. The same issue (packets not being received in the correct order and getting stuck in "Receiving...") occurred when using a baud rate of 57600. :-(