Xylopyrographer / STAC

A Roland Smart Tally Atom Client

Timeout Tracking 1.10 base #44

Closed mqshaw closed 2 years ago

mqshaw commented 2 years ago

@Xylopyrographer

I tested the code last night at church and the team told me that the device behaved as expected. I have also added some descriptions in the documentation files, with a TODO - talking about the addition of the polling loop to the embedded web server.

Xylopyrographer commented 2 years ago

Good day Mark. Do hope all is well. Thanks for taking the time to integrate this into the master.

Looking to better understand what condition(s) need to be met to throw up the Big Purple X (BPX) on the display.

Is it right to state the overall intent as:

if we have been attempting to get the tally status from the Smart Tally Server (STS), and it is now longer than two seconds since the last good reply, throw up the BPX.

or

if the last eight consecutive attempts to retrieve the tally status from the STS have failed within a period of two seconds, throw up the BPX.

where one attempt is:

send a GET request to the STS
if no reply within X ms, 
    send another GET request
    repeat the GET, wait, try thing up to 10 times
    if still no reply from the STS, abort
otherwise,
    return the tally status

Or something altogether different?
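As a rough illustration of the "one attempt" sequence above, here is a minimal C++ sketch. The names `Reply`, `GetRequest` and `attemptTallyRead` are hypothetical and not taken from the STAC source; the GET itself is passed in as a callback so the retry logic can be read on its own.

```cpp
#include <functional>
#include <string>

// Hypothetical result of one GET request to the STS.
struct Reply {
    bool ok;           // true if a reply arrived before the timeout
    std::string body;  // reply text, empty when ok is false
};

// Hypothetical callback: sends one GET and waits up to timeoutMs for a reply.
using GetRequest = std::function<Reply(unsigned timeoutMs)>;

// One "attempt": send a GET and, if no reply arrives in time, resend,
// up to maxTries requests in total. Returns the first good reply,
// or a Reply with ok == false if every request timed out.
Reply attemptTallyRead(const GetRequest& sendGet,
                       unsigned timeoutMs,
                       int maxTries = 10) {
    for (int i = 0; i < maxTries; ++i) {
        Reply reply = sendGet(timeoutMs);
        if (reply.ok) {
            return reply;  // got the tally status
        }
    }
    return Reply{false, ""};  // still no reply: abort this attempt
}
```

On the device, the callback would wrap whatever WiFi client call actually sends the request; that part is deliberately left out here.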

Interesting looking over the code now. STAC was my first ever attempt at writing in C++. I was learning that and the whole Arduino thing at the same time as figuring out the hardware. I might do things a bit differently now 😄.

Xylopyrographer commented 2 years ago

@mqshaw Howdy. Never sure if this system notifies you without an explicit "@" reference, so just making sure. Sent you a note a couple days ago on the filtering. Hope you're having a good one.

mqshaw commented 2 years ago

Hi Rob,

Yes, I got the note; I just have not had time yet to respond to it.

The overall intent is two-fold.

  1. Keep track of connection issues and try to resolve them better.
  2. If the client is unable to connect, it will retry for up to 1 second, at 200 ms intervals.
    • If a connection cannot be made in 1 second, the BPX is thrown.

There are 2 checks that are happening.

  1. First in checkTallyState() lines 320-380

Specifically looking for failed WiFi connections, and rather than putting up a BPX right away, retry ( up to ST_ATTEMPTS times )

If the connection fails ST_ATTEMPTS times, we then set the tracking bit field and show the BPX.

  2. Secondly, in loop(), line 1561

To stop the BPX from switching on and off, the tracking is implemented.

It checks to make sure that the BPX is only shown if the last 1-through-8 bits of the tracking field have been set.

Does this help?

Mark


beamholder commented 2 years ago

Thanks Mark.

Useful indeed 👍. It'll be next week before I have a chance to get back around to this.

Cheers, Rob


Xylopyrographer commented 2 years ago

Good evening Mark:

Had a chance to dig into things. Commenting here on your last reply.

On Mar 24, 2022, at 9:00 PM, Mark Shaw @.***> wrote:

The overall intent is two-fold.

  1. Keep track of connection issues, try to resolve better
  2. If the connection is unable to connect, it will try again up to 1 second, in 200ms intervals.
    • If a connection cannot be made in 1 second, the BPX is thrown.

Makes sense.

There are 2 checks that are happening.

  1. First in checkTallyState() lines 320-380

Specifically looking for failed WiFi connections, and rather than putting up a BPX right away, retry ( up to ST_ATTEMPTS times )

If the connection fails ST_ATTEMPTS times, we then set the tracking bit field and show the BPX.

At this point in the code, the checks are being done to determine if we are connected to the ST server and to attempt to make that connection if we are not. While the inability to connect to the STS could be due to loss of WiFi we don't know if we have lost WiFi until we test for it again back in loop(). So at this point we assume good WiFi and try as you say up to ST_ATTEMPTS times to connect. If we can't, agree that this condition needs immediate reporting to the user (throw up the BPX).

  2. Secondly, in loop(), line 1561

To stop the BPX from switching on and off, the tracking is implemented.

It checks to make sure that the BPX is only shown if the last 1-through-8 bits of the tracking field have been set.

Got it. 👍


Agree that error handling to date is a bit weak. So, came up with a new version that does a bunch of things. Attaching for your review & comment a beta version:

Version: 1.11_ß5
    - Integrates 'BPX' filtering.
        - additions to getTallyStatus():
          - .tStatus now returns error conditions as well as STS replies.
        - rewrite of the loop() tally display code to better facilitate error handling.
    - Adds user configurable polling interval via the web config page.
      - adds polling interval as an NVS item.
      - increments the NVS NOM_PREFS_VERSION.
    - Corrects improper use of the Preferences library.
    - No longer uses the M5 Atom libraries:
      - uses JC_Button library for the display button;
      - uses I2C_MPU6886 library for the IMU;
      - uses FastLED directly for the display;
      - directly initializes Serial.    
    - Replaces "M5.dis.X" functions with bespoke display drawing routines.
    - No longer reformats the NVS when doing a factory reset.
    - Modified the layout of the startup data dump.
      - displays the polling interval.
      - items above the "-----" line are hard coded.
      - items above the "=-=-=" line are web config items.
      - items below the "=-=-=" line are run-time configurable.
    - Adds macros for setting the tally state on the GROVE connector.
      - have yet to verify this didn't break this for this build.

To compile, you'll need to use the library manager to install the JC_Button & I2C_MPU6886 libraries.

You'll have FastLED already on your system.

If you jump into the getTallyStatus function, you'll notice I added human readable forms to various responses and conditions that can arise during the whole communications sequence with the STS. Downstream code is still keyed off the status flags though.

This routine only handles the comms and queries to the STS, returning what it gets and leaving all decisions about what to do with that to the code in loop().
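A minimal sketch of that split, where the comms routine only classifies what came back and leaves every display decision to the caller. The enum, struct and function names are hypothetical and not from the STAC source, and the reply strings `onair`, `selected` and `unselected` are assumed here rather than taken from this thread.

```cpp
#include <string>

// Hypothetical status flags: normal STS replies plus error conditions.
enum class TallyState { OnAir, Selected, Unselected, NoReply, JunkReply };

// Status flag for downstream code, plus a human-readable form for logging.
struct TallyResult {
    TallyState state;
    const char* text;
};

// Classify the outcome of one query. gotReply is false when the STS did
// not answer in time; body is the raw reply text otherwise.
TallyResult classifyReply(bool gotReply, const std::string& body) {
    if (!gotReply)            return {TallyState::NoReply, "no reply from STS"};
    if (body == "onair")      return {TallyState::OnAir, "on air"};
    if (body == "selected")   return {TallyState::Selected, "selected"};
    if (body == "unselected") return {TallyState::Unselected, "unselected"};
    return {TallyState::JunkReply, "unrecognized reply"};
}
```

The caller in loop() would then switch on `state` only, with `text` used purely for the serial monitor.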

Most of the tally status handling code in loop() was redone. The "top" part deals with what we would expect when everything is operating as it should and the "bottom" part is now an error handler.

All the debugging code is still there, hopefully that is useful for you.

The BPX code from your lines 320-380 is in there. For debugging and discussion I changed the colour of the X to orange to distinguish for now a "no STS available" error from a "real" BPX error.

I recoded the implementation of the "bit shifter" as I understood it. It still requires 8 consecutive "no reply" states before it is triggered and displays the BPX.
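For readers following along, one way such a "bit shifter" can work is sketched below. This is an illustration under assumptions, not the code in the attached beta: the class name and the 1 = "no reply" bit convention are hypothetical.

```cpp
#include <cstdint>

// Hypothetical filter: every poll shifts one bit into an 8-bit history,
// where 1 means "no reply" and 0 means a good reply was received.
class NoReplyFilter {
public:
    // Record one poll result. Returns true only when all of the last
    // 8 polls were "no reply", i.e. the error should now be displayed.
    bool update(bool noReply) {
        history_ = static_cast<std::uint8_t>((history_ << 1) | (noReply ? 1u : 0u));
        return history_ == 0xFF;
    }

private:
    std::uint8_t history_ = 0;  // oldest result in bit 7, newest in bit 0
};
```

A single good reply clears the low bit, so the display cannot flicker back to the error state until another 8 consecutive failures have been seen.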

We should have a discussion on the rate at which errors are reported and how aggressive we should be to recover. For example, if we can't connect to the STS, do we keep hitting it as hard as we can or wait to the next polling interval? Same question if we receive a garbage reply from the STS. Keep hitting the STS hard or wait until the next polling cycle? For example, if you look at line 1776, I've commented out the "hit'em fast" option in favour of waiting for the next polling interval (also made debugging a lot easier as it slowed down the flood to the monitor).
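The two options can be put side by side in a small sketch. The `RetryPolicy` enum and `nextRequestTime` function are hypothetical names used only to frame the discussion; they are not in the attached code.

```cpp
#include <cstdint>

// Hypothetical policy switch for what to do after a failed request.
enum class RetryPolicy {
    Immediate,        // "hit'em fast": retry right away
    NextPollInterval  // wait for the next regular polling cycle
};

// Returns the time, in ms, at which the next request should be sent.
// After a good reply the next request is always one polling interval away.
std::uint32_t nextRequestTime(std::uint32_t nowMs,
                              std::uint32_t pollIntervalMs,
                              bool lastRequestFailed,
                              RetryPolicy policy) {
    if (lastRequestFailed && policy == RetryPolicy::Immediate) {
        return nowMs;
    }
    return nowMs + pollIntervalMs;
}
```

With the 300 ms polling interval used for testing, the second policy caps the request rate at roughly three per second even while the STS is down.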

I'm sure there is a more "GitHub-ish" way to do this but I've attached this version to this message.

For testing, I set the polling interval to 300ms and used the attached python STS emulator. It injects garbage replies and also goes to sleep at intervals to test those conditions. Haven't migrated it to python 3 and the 2to3 utility barfs at it.

To test the "no STS server available" condition, I fire up the STAC and the emulator and then CTRL-C the emulator to stop it, check the STAC response & then restart the emulator.

It's a bit of a rewrite for sure. Looking very stable though, with much improved error response.

I'm open to a Zoom call if you think it would be easier.

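Bit of a side bar: I also wrote an API document and a Tutorial for the Preferences library. It's now part of the official arduino-esp32 project. Check it out: Preferences API (https://docs.espressif.com/projects/arduino-esp32/en/latest/api/preferences.html) & Preferences Tutorial (https://docs.espressif.com/projects/arduino-esp32/en/latest/tutorials/preferences.html)

Attachments: STAC_1.11_B5.ino.zip (https://github.com/Xylopyrographer/STAC/files/8385537/STAC_1.11_B5.ino.zip), rolandstserverQ2_junk_noReply.py.zip (https://github.com/Xylopyrographer/STAC/files/8385538/rolandstserverQ2_junk_noReply.py.zip)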

Xylopyrographer commented 2 years ago

@mqshaw Keep meaning to ask: what is your IDE setup? VSC with ???.

Xylopyrographer commented 2 years ago

@mqshaw Made a few more additions & fixed a couple of bugs. Details in the top of the file. Each STAC now uses its ID as its hostname so each will show up as a unique device on the WiFi router table.

STAC_1.11_RJL.ino.zip

mqshaw commented 2 years ago

Hi Rob

Thanks for the information. I will have to dig into it in a week or so. The next 2 weeks are crazy for me.

We are competing in a robotics competition this weekend and next, and have made it to be the top ranked team. FRC team 2122 if you are interested in seeing what we do

https://youtu.be/hFlMA5pP2DA

Regarding my development environment, I use the Arduino IDE for basic stuff, but I like VS Code with the Arduino extensions, which lets me debug and trace through the code more easily. For my other (robotics) stuff I am using Visual Studio 19.

Mark



Xylopyrographer commented 2 years ago

Mark:

That is incredible. And a lot of fun. Great project. Congrats and good luck in the upcoming competition!

Cheers, Rob

Xylopyrographer commented 2 years ago

@mqshaw Good day Mark. I've been poking about and am attaching the latest beta for you to take a look at when you've time. Changes are listed in the main .ino file. The biggest change is that I've split the single sketch file into a bunch of others, all living in the STACLib subdirectory. Things were getting a bit too unwieldy for a single file.

Also did a few things such that the sketch now compiles and runs using either arduino-esp32 core version 1.0.6 or the latest core version 2.0.3-rc1. With core 2.0.2 and later, no longer need to use the Preferences library files from earlier versions of STAC as those changes have been rolled into the master branch.

If you want to see all the nitty gritty debug stuff logged to the serial monitor, set the DB_MODE #define at the top of the sketch to 1 & the Core Debug Level to "Verbose".

Anyway, no rush. Hope you're having fun with the robotics competitions!

STAC_1.11B9.zip

mqshaw commented 2 years ago

@Xylopyrographer
Hello Rob, The competition went great, we were the second ranked team in our division (75 robots), and unfortunately got eliminated in the semi-finals, so did not make it to the final elimination bracket, but it was an excellent experience for the kids.

OK, so I am trying to figure out where your code is for me to take a look at. Is it on a branch somewhere? I would like to see this pull request closed, so if you can point me at the branch with the integrated changes I will close this PR.

Thanks Mark

beamholder commented 2 years ago

Hi Mark.

You'll get 'em next year 😀. Like you said, great experience for the kids.

I attached the code as a zip file to this thread. Latest one being a day or two ago. Should be there on one of the messages? Not used this feature on GitHub before so let me know if I need to use another method.

Cheers, Rob


Xylopyrographer commented 2 years ago

@mqshaw Good day Mark. Checking to see if you've had a chance to try out the latest version from the zip file above?

mqshaw commented 2 years ago

@mqshaw Good day Mark. Checking to see if you've had a chance to try out the latest version from the zip file above?

Hi Rob, I have looked at the code, but am not sure if it includes all the pieces that I need. Maybe we can set up a time to do a Zoom call and discuss it together.

Would you be up for that?

Mark

beamholder commented 2 years ago

Mark:

For sure. Zoom works. I'm pretty open schedule-wise. Let me know what works for you.

Cheers, Rob


mqshaw commented 2 years ago

Hi @Xylopyrographer,

This tool is weird sometimes. I sent you a reply last weekend (using email), but it's not showing up in the logs. So maybe you never received it ...

Sorry for the delayed response. I do a lot of work with Asia, so my days start later. Mornings work well for me. Are you free Thursday (16th) this week in the morning?

Mark

mqshaw commented 2 years ago

@mqshaw Keep meaning to ask: what is your IDE setup? VSC with ???.

@Xylopyrographer I have also got the new version of the code compiling and running using VS Code. You need to put the following profile in the same directory as your code; it configures VS Code to use the Arduino board configuration. You will want to update the path to the Arduino libraries on your system.

vscode_config.zip

Also, remember I told you a while ago that I was thinking of how to mount the STAC on my wireless cameras. Here is a 3D printed mount that my son made just for STAC!

[Image: 20220611_100150_sm]

Xylopyrographer commented 2 years ago

@mqshaw Hey Mark. Sorry, I didn't get the note about the 9th. Hope this comes through.

Next week I'm pretty open. Appointment Monday from 2-3:00, Wednesday from noon to 3:00 and Thursday evening is booked. All clear otherwise. Let me know what works for you.

Nice job on the custom STAC mount! It's great when we get to engage our kids as well.

All the best.

Xylopyrographer commented 2 years ago

@mqshaw

Good day Mark. Re-read your last note. Yes, morning of Thursday 16 June works for me. Shoot me a DM with the time. I can plug into pretty much any meeting software. Let me know your preference. 👍

mqshaw commented 1 year ago

Hi Rob,

Sorry for the delayed response.

I do a lot of work with Asia, so my days start later. Mornings work well for me. Are you free Thursday (9th) this week in the morning?

Mark
