Hurricos / realtek-poe


realtek-poe: Implement a retry mechanism for failed commands #15

Closed mrnuke closed 2 years ago

mrnuke commented 2 years ago

The MCU doesn't like receiving a flood of commands. If it receives a large number of commands in rapid succession, it may refuse to process them. The problem is intermittent, but can be identified by receiving:

 fd ff ff ff ff ff ff ff ff ff ff f3
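As an illustration only, detecting this reply could look like the sketch below. The names `POE_FRAME_LEN`, `mcu_busy_frame` and `is_mcu_busy_reply` are hypothetical and not taken from the realtek-poe source; the only assumption is that the reply is exactly the 12 bytes quoted above. Incidentally, the low byte of 0xfd + 10 × 0xff = 0xaf3 is 0xf3, so the last byte is consistent with a simple additive checksum.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; the length is taken from the dump above. */
#define POE_FRAME_LEN 12

/* The refusal reply quoted above, byte for byte. */
static const uint8_t mcu_busy_frame[POE_FRAME_LEN] = {
	0xfd, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xf3,
};

/* Return true if buf holds exactly the refusal reply. */
static bool is_mcu_busy_reply(const uint8_t *buf, size_t len)
{
	return len == POE_FRAME_LEN &&
	       memcmp(buf, mcu_busy_frame, POE_FRAME_LEN) == 0;
}
```

Matching the whole frame rather than only the leading `fd` keeps the check strict about what counts as a refusal.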

Testing shows that the MCU sends this packet several times, even if it receives no further commands. There is no "I am ready" packet, which makes this situation more cumbersome to get out of.

A simple mechanism is to wait a certain amount of time for the MCU to recover, and then start sending commands again. Keep the current command on the queue, and only resend it after this timeout expires. The hope is that this gives the MCU enough time to recuperate.

The current implementation reuses the sequence number when sending retries. That could make it harder to match a reply to its request. In practice, however, this has not been shown to be a problem.

Implement a five-retry policy with a 100 ms timeout after each failure. This waits about half a second (5 × 100 ms) before giving up on the current command and sending the next command in the queue.
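The policy can be sketched as a small decision function. This is not the code from this PR: `struct cmd_state`, `on_mcu_error` and the enum values are hypothetical names, and the sketch assumes the caller arms a timer for `RETRY_DELAY_MS` whenever it is told to retry. Only the numbers (5 retries, 100 ms) and the unchanged sequence number come from the description above.

```c
#include <stdint.h>

#define MAX_RETRIES    5    /* from the description above */
#define RETRY_DELAY_MS 100  /* pause before each resend */

/* What the caller should do after the MCU refuses a command. */
enum retry_action {
	RETRY_AFTER_DELAY,  /* keep the command queued, resend after the delay */
	DROP_AND_ADVANCE,   /* give up and send the next queued command */
};

/* Hypothetical per-command bookkeeping. */
struct cmd_state {
	uint8_t seq;           /* sequence number, reused on every retry */
	unsigned int retries;  /* resends already scheduled for this command */
};

/* Decide what to do when the MCU replies with the refusal packet. */
static enum retry_action on_mcu_error(struct cmd_state *cmd)
{
	if (cmd->retries < MAX_RETRIES) {
		cmd->retries++;  /* seq is deliberately left unchanged */
		return RETRY_AFTER_DELAY;
	}
	cmd->retries = 0;        /* reset the counter for the next command */
	return DROP_AND_ADVANCE;
}

/* Worst-case time spent waiting on one command before it is dropped. */
static unsigned int worst_case_wait_ms(void)
{
	return MAX_RETRIES * RETRY_DELAY_MS;
}
```

With these constants, a command that keeps failing is resent five times and dropped on the sixth refusal, after 500 ms of waiting in total.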

mrnuke commented 2 years ago

If this *cough* fix *cough* works, I'm expecting this to address #9 and #13, as well as #10 for the most part. I've also noticed a de-synchronized packet issue in #10, which is not addressed here.

Hurricos commented 2 years ago

OK, I like this! After testing, I noticed that I was overall only getting a small handful of MCU error messages, and I had to go read the source to find out.

I really think the 100 ms pause is a great idea: if the MCU is going to complain, we shouldn't be hammering on it. I like how this is implemented.

Rebasing and merging!