WouterJD / FortiusANT

FortiusANT enables a pre-smart Tacx trainer (usb- or ANT-connected) to communicate with TrainerRoad, Rouvy or Zwift through ANT or Bluetooth LE.
GNU General Public License v3.0
146 stars 78 forks source link

ANT+ messages lost - refactoring of ANT-loop required? #341

Closed WouterJD closed 1 year ago

WouterJD commented 2 years ago

At times it seeems that ANT-messages are not received, causing HRM-connection being lost or other unexplainable inconveniences. The suggestion is done to refactor the ANT-loop, this issue is meant to

See also Steering - packet drops

WouterJD commented 2 years ago

Let's check the main loop that starts here

This loop operates as follows:

while always
  read from Tacx Trainer

  update display, TCX, JSON

  calculate PedalStrokeAnalysis (PSA)

  every quarter second:
    prepare messages for HRM, PWR, SCS, FE-C
    send/receive messages to/from blue tooth devices

  the following occurs only if there are messages to be sent, so once per quarter second as well
    send/receive ANT+ messages
    handle all received ANT+ messages

  delay the loop, so that it will cycle at the required speed

The loop can operate on two speeds:

WouterJD commented 2 years ago

The ANT+ reading is done every 250ms into a 1000 character buffer (for details click here). One read-operation can provide multiple packets, FortiusAnt expects only entire ANT+packets to be delivered, otherwise error-messages are given:

Since these messages never (or perhaps more honestly: rarely) appear:

WouterJD commented 2 years ago

The question now is whether packets are lost because FortiusAnt reads data from the USB-driver as described. Also: would a sparate thread that is dedicated to read data from the USB-driver resolve the issue?

Currently, I think that we should be aware that we read data from the USB-device driver that in it's turn reads data from the USB-dongle. I would expect the USB-device driver to have it's own process, handling the USB-dongle and provides data to FortiusAnt when we do the read-command.

In line with that thought - but I do not know device driver internals, let alone I ever created such - I would not expect a separate program-thread to help in resolving the lost packet issue.

WouterJD commented 2 years ago

Issues that have been observed:

I will do a test with a USB-hub, bringing the dongles closer to the bike (and myself and HRM) to check whether this affects the HRM-issue. Perhaps this also gives usefull information for Black track steering and ANT-based trainers.

So far my input on the ANT+ messages lost issues for now :-)

switchabl commented 2 years ago

@WouterJD Thank you for collecting all the info in one place!

Currently, I think that we should be aware that we read data from the USB-device driver that in it's turn reads data from the USB-dongle. I would expect the USB-device driver to have it's own process, handling the USB-dongle and provides data to FortiusAnt when we do the read-command.

In line with that thought - but I do not know device driver internals, let alone I ever created such - I would not expect a separate program-thread to help in resolving the lost packet issue.

That sounds like a reasonable assumption, but unfortunately libusb is a lot more low-level than you might expect. The USB stack is not constantly receiving data from the device and buffering it for you (like a network socket). What happens when you call read is roughly this:

Put another way, when you call read matters (so does the timeout). This is not between your program and the driver. It directly affects the device. If you don't call it often enough, the internal buffer of your device may overflow. If you cancel the transfer (i.e. a timeout occurs), there may still be data in flight. With the libusb-win32 implementation that data will simply be lost.

WouterJD commented 2 years ago

@switchabl thanks for clarification. Input for further investigation.

switchabl commented 2 years ago

What I have in mind is something very roughly like this:

class clsAntDongle:
   def __init__():
      ...
      self._Messages = Queue()
      self._ReadThread = threading.Thread(target=_ReadLoop)
      self._ReadThread.start()

   def Read():
      return self._Messages.get()

   def Write(messages):
      # write directly to USB as before
      # but remove any read-related code
      for msg in messages:
         self.devAntDongle.write(0x01, msg)

   def _ReadLoop():
      while True:
         data = self.devAntDongle.read(0x81, 4096, 0) # no timeout (or some large value)
         ...
         # reassemble messages from data
         ...
         for msg in messages:
            self._Messages.put(msg)

This is of course missing a lot of detail and error handling. But it is not that complicated and it would not really change interface much. I will implement something over the weekend.

WouterJD commented 2 years ago

I though of something like this, open question from my side is: are you allowed to write() when the read() is still active?

Note: the thread should be initiated in getDongle() when the dongle is found, but as said thats part of the details.

WouterJD commented 2 years ago

Shall I pick this up and see where I get; then you can focus on the steering.

WouterJD commented 2 years ago

PS. Is multi-threading enough or is multi-processing required?

For my reference: https://docs.python.org/3/library/threading.html https://docs.python.org/3/library/multiprocessing.html

switchabl commented 2 years ago

Threading is fine, the thread will sleep 99% of the time anyway (waiting for data). Also, multi-processing is for people who enjoy pain. -.-

If you don't mind, I was actually hoping to take a stab at this myself. This is the main issue blocking steering and I want to see if it really solves my problem (I expect it will). I also have a very good idea what to do already. I don't think it will take me long. As you can maybe guess from the discussion, I have been there before. I have spent a lot of time reading libusb and driver/kernel code and figuring out how it actually works in detail at one point. There is no reason for you to repeat that ordeal.

Edited: HRMquestion was offtopic. To keep this discussion clean.

WouterJD commented 2 years ago

Welcome to implement; I would think you/me would subclass the clsAntDongle and make it multithreaded.

I would expect

Anxiously waiting for your code😉

WouterJD commented 2 years ago

Hi @switchabl Good morning.

reassemble messages from data

In _ReadLoop(), I would put the raw data in the queue and interpret the contents outside the thread, to do as less as possible. --> In function Read() the buffer can be got from the queue replacing trv = self.__ReadAndRetry(timeout) --> _ReadLoop() could use the ReadAndRetry() function, which has a minimum of overhead in normal operation.

Also, I would reduce logging to a minimum and define a separate flag in debug.py ANTlow = 0x100 (requiring to extend All to 0xffff) for the same reason to keep overhead to a minimum.

WouterJD commented 2 years ago

I will do a test with a USB-hub, bringing the dongles closer to the bike

I have done some testing;

When walking away from the dongles, communication is lost at 1.5 - 2m (heartrate no longer updated). The distance chest - laptop (A) is definitely in the critical area.

PS 1. While doing this test, #342 was raised and resolved; will be published later. PS 2. ExplorAnt.py is used to inspect received data from paired devices; ExplorAnt -s is used to simulate devices.

Two conclusions:

WouterJD commented 2 years ago

Using ExplorAnt I tried to generate many messages, shortening the 250ms loop. This was not successful, so flooding buffers was not a trivial job to verify full reception. My purpose was to see that the 900byte buffer would fill up.

switchabl commented 2 years ago

1.5m range seems rather short for ANT. But I guess the HRM may use a low transmit power to conserve the battery.

I am not sure you can actually exceed 900 bytes with a normal ANTUSB2/m (but easy to be safe and use 4096 bytes buffer like the Dynastream library). Overflowing the internal buffer may happen more easily though. The ANTUSB-m has a configurable "event buffer" with a maximum of 724 bytes but you have to turn that on explicitly. With the older models the assumption is always that you read continuously. I would assume that they can buffer at least 64 bytes (USB packet size) but maybe not more than that.

I do not think "overhead" in the read thread is an issue at all. Minimizing the time between transfers matters a lot if you need to maximize throughput (e.g. USB camera with high bandwidth). But with the ANT dongle, we are talking about kb/s. The extra thread is not really about performance, it is about avoid bad read patterns (with short timeouts) that cause race conditions. So my priority is clean, readable code, not on saving a ms here or there.

switchabl commented 2 years ago

@WouterJD Hope you are well and enjoyed the holidays. I've been rather busy before Christmas and then been away for a couple of days. I am still on it though and will finish it when I can.

WouterJD commented 2 years ago

Hi @switchabl best wishes for you too. I know you were busy (no Christmas holidays?) so now we can move forward again. Many people dust off their tacxes, find FortiusAnt and react on github. So all alive and kicking👍

WouterJD commented 2 years ago

Since there is no communication here, I have marked for future development.

WouterJD commented 1 year ago

ANTloop improved in current branch and working well. Will be published soon.