EdgeTX / edgetx

EdgeTX is the cutting edge open source firmware for your R/C radio
https://edgetx.org
GNU General Public License v2.0

SBUS trainer lags after a while (SBUS-inverted signal from ER-6 onto TX16S) #5646

Open dirteat opened 3 weeks ago

dirteat commented 3 weeks ago

Is there an existing issue for this problem?

What part of EdgeTX is the focus of this bug?

Transmitter firmware

Current Behavior

I have set up trainer mode to connect two TX16S transmitters. For this I am using:

When I switch on trainer mode on the TX16S_Teacher, I receive the input of the TX16S_Learner, at first without issues (Chan or Stick). But after a while (tens of seconds, maybe a minute or two), the input starts to lag, and it lags badly (would be catastrophic while flying).

I've tried two different ER-6 receivers, same symptoms. What seems to avoid the catastrophic lag is setting the Packet Rate to 333Hz (full) on both the TX16S_Learner and the ER-6!?? But in that situation, the control surfaces on the plane, as commanded by the TX16S_Learner through the trainer TX16S_Teacher, do not move really smoothly; they seem to jump in little steps.

Let me know if you want me to do other tests. But, for the time being, I am not going to teach with the setup :)

NB: All these guys are running exactly the same firmware and software versions, namely: EdgeTX 2.10.5, ExpressLRS 3.5.1

Expected Behavior

I would expect the input to remain steadily smooth at all times!

Steps To Reproduce

See description above.

Version

2.10.5

Transmitter

RadioMaster TX16S / TX16SMK2

Operating System (OS)

No response

OS Version

No response

Anything else?

I love you guys!

3djc commented 2 weeks ago

Cannot reproduce. I have it running for over an hour, still silky smooth.

dirteat commented 2 weeks ago

Damned. Same hardware, same software, same receiver?

3djc commented 2 weeks ago

Roughly yes, I'm just on ELRS 3.5.0. Where is your lag occurring? How is the output widget doing? Is it jerky there too?

dirteat commented 2 weeks ago

Yes, the widget output, then showing the input from TX_learner, is visibly laggy as well!

There is only one thing I have not tested twice, the cable connecting the ER-6 to the TX_teacher (I've checked with a spare ER-6, but I used the same cable to connect it to AUX1). I'll test that and report...

mha1 commented 2 weeks ago

Try these settings:

philmoz commented 2 weeks ago

I tried it as well and could not reproduce the issue (TX16S for master, T20V2 for slave).

Can you post all of the settings for the ER6:

55Laurent commented 2 weeks ago

For your info, perhaps not linked to the problem above, but I also got a strange lag on pupil inputs. The setup is the following: BetaFPV Nano receiver in inverted SBUS mode connected to AUX1 of the master TX. Pupil TX is an ELRS RM Boxer. Master TX (with external Ranger ELRS module) to model with ELRS ER6 receiver. All set to 100Hz full, 16/2 channels. One password for the master/model link plus a model match number, and another password for the pupil/master link with model match set to off. Dynamic power for master/model and, not 100% sure, but 10mW for the pupil TX; BetaFPV Nano telemetry at 10mW. In this case, at start-up, there is a period of time (about 30-60s) with huge lags, then after a while no more lag. If I add a model match number for the pupil/master link, there is no more lag.
Perhaps more an ELRS issue than an EdgeTX one, I think.

3djc commented 2 weeks ago

Possibly unrelated too, but turning off telemetry on the SBUS-inverted receiver is likely a good thing to do in general, especially if it is fitted inside the case or close to the radio main board.

55Laurent commented 2 weeks ago

OK, thanks, I'll try. But with model match set, there is absolutely no issue even with both TXs in close contact...

3djc commented 2 weeks ago

That is more for interactions between the 'trainee' receiver and the master radio in very close proximity

55Laurent commented 2 weeks ago

Understood, this is why I set the telemetry power to the minimum value, and I found really no issue, whatever the positions of the two TXs (I installed the buddy receiver inside the master TX16S). But I'll try with buddy telemetry off as you said. Then the only issue I get is when there is no model match number: the binding between pupil TX and buddy receiver is OK, but there are very big lags and it takes almost 1 min for them to disappear. All firmware is up to date.

Gerold68 commented 2 weeks ago

I opened an issue 2 years ago for this strange behavior. Seems it was never solved completely.

3djc commented 2 weeks ago

Understood @55Laurent, that was just a general comment. I can't seem to replicate, but as weird as it sounds, it does indeed sound like ELRS given what you say.

dirteat commented 1 week ago

Thank you all for your input and feedback.

I have changed the cable connecting the ER6 to the TX_Teacher, and I still observe the lag setting in after a while. This time, I went on playing with the controls even though they were lagging, and the lag disappears after another while, only to come back later on.

Importantly, this happens only at 100Hz (full) between TX_Learner and ER6; at 333Hz (full), this is not happening. I have also tried various power outputs for the TX_Trainer and TX_Learner, as suggested, and the power output does not change anything (final setting 10mW).

Also, this "seems" correlated to a loss of signal as, indeed, I have telemetry on between TX_Trainer and ER-6, precisely to monitor the radiolink. When "telemetry loss" pops out, then, very likely the lagging starts a few seconds afterwards ( not 100% of the time though).

NB: I have made a video showing the lag, not sure if I can post it there, but I'll try in another comment.

NB2: I don't have model match on (I am quite new to EdgeTX and I have not tried yet), just two passwords (one for the link TX_Learner <-> ER6, another one for TX_Teacher <-> plane).

NB3: Most likely unrelated, but, in case it rings a bell, let me say that without passwords for bindings, something really nasty occurred. When the bind between TX_Learner and ER6 disappeared from time to time (signal loss), I did observe an incorrect rebind, namely, between the ER6 and TX_Teacher (no typo)! This creates a positive feedback loop and all controls were increasing to saturation.

dirteat commented 1 week ago

OK, the video is too big (60MB), so here is a link to my personal website:

sbus_lagtest.webm

The video starts while the lag is already in progress, and I toy with the sticks for a while to show it. Then I decide to power the ER-6 off and on; the bind between the TX_Learner and ER6 is auto-magically restored, but the lag is still there, and finally disappears after ten more seconds or so (I insist that the lag can also disappear without resetting the ER-6).

3djc commented 1 week ago

I have checked the signal getting out of ER6 when trainer radio is set to 100Hz or 333Hz. There is a small period difference (10ms vs 7ms), but nothing that could disrupt ETX.

There is a question on how on earth you can have "telemetry lost" between trainee radio and the ER-6. That should never happen at that range.

Also, the fact that the ER-6 rebinds to the master TX makes absolutely no sense. At this point in time, if the problem isn't an issue with your settings, I think it points toward an ELRS issue, not ETX.

dirteat commented 1 week ago
There is a question on how on earth you can have "telemetry lost" between trainee radio and the ER-6. That should never happen at that range

It is actually something I observed even without doing this setup, between my TX and the plane, when both are very close... In flight, this never happens. Anyway, yes, that points more to ELRS than EdgeTX indeed. I should probably report the issue there :-/

    Internal / External RF settings on the slave radio
    ELRS config lua script settings on the slave radio

Please, let me know where to find them (name, directory...). Thanks.

mha1 commented 1 week ago

@3djc I was able to reproduce this and checked the ELRS side. SBUS data delivered at AUX1 looks good and smooth while the problem can be observed looking at a jerky lagging channel monitor on the master radio. See my comment at the twin ELRS issue.

To me it now looks like a sampling problem.

mha1 commented 1 week ago

@3djc to support the timing theory, some side notes. ELRS attempts to output SBUS frames at a 9ms period. 50Hz and 100Hz packet rates won't be able to achieve this, as fresh channel data can only be delivered every 20ms (50Hz) or 10ms (100Hz). Selecting the 100Hz packet rate will yield a 10ms period. This is exactly the mixer task period +- some minor jitter. For testing I fixed the SBUS period at 10ms. EdgeTX responded to this with random and sometimes longer periods of "trainer lost" while I was still observing SBUS frames arriving at AUX1. My suspicion is that if the SBUS frame rate matches the mixer task period very accurately, the SBUS data acquisition fails. The incoming SBUS jitter makes it look random. Fixing the period at 10ms made it worse by matching the mixer period more often. I can imagine a situation where every time the mixer task tries to fetch SBUS data, an incoming SBUS frame is just being processed, meaning it is lost for this cycle. Double buffering should solve this. This would also explain why slave radio packet rates >100Hz, e.g. 333Hz, won't show the problem, but using 333Hz doesn't fix the problem, it just masks it. Hope that helps.
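The timing theory above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not EdgeTX code. It assumes a hypothetical single-buffer receiver that gets nothing (and discards the partial frame) whenever the 10ms mixer poll lands while a frame is still arriving, and it assumes a 3ms frame time (25 bytes at 100000 baud, 8E2). The function name and the fixed 1ms phase offset are made up for the illustration.

```python
def longest_starved_run(frame_period_us, poll_period_us=10_000,
                        frame_len_us=3_000, phase_us=1_000,
                        polls=6_000, double_buffered=False):
    """Longest run of consecutive polls that saw no fresh SBUS frame.

    Hypothetical model: frame k starts at k * frame_period_us and takes
    frame_len_us to arrive. A single-buffer receiver polled mid-frame
    returns nothing and loses that frame; a double-buffered one always
    hands out the last complete frame.
    """
    last_seen = -1   # newest frame index the consumer has already used
    aborted = -1     # last frame destroyed by a mid-reception poll
    run = longest = 0
    for j in range(polls):
        t = j * poll_period_us + phase_us
        k = t // frame_period_us                  # frame slot containing t
        in_flight = t - k * frame_period_us < frame_len_us
        if double_buffered:
            newest = k - 1 if in_flight else k    # last complete frame
        elif in_flight:
            aborted, newest = k, -1               # partial frame discarded
        else:
            newest = -1 if k == aborted else k
        if newest > last_seen:
            last_seen, run = newest, 0            # fresh data this cycle
        else:
            run += 1
            longest = max(longest, run)
    return longest


if __name__ == "__main__":
    for period in (10_000, 10_010, 7_000):
        print(period,
              longest_starved_run(period),
              longest_starved_run(period, double_buffered=True))
```

In this model, an SBUS period that exactly matches the 10ms poll starves the consumer for the whole run, a slightly drifting period (10.01ms) produces long starved stretches that come and go, and the 7ms period measured at 333Hz never starves two polls in a row. With double buffering the longest gap stays at one or two polls in all three cases.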

gagarinlg commented 1 week ago

Maybe we should do what FrSky did and start a clean sheet implementation of EdgeTX for Version 4

mha1 commented 1 week ago

Maybe we should do what FrSky did and start a clean sheet implementation of EdgeTX for Version 4

Quite the radical fix for this issue 😜

3djc commented 1 week ago

Indeed a bit radical! It is currently difficult for me to work on this, but I want to try to understand whether the failure is in the capture itself or in the decoding of the SBUS stream. Out of curiosity, what baud rate is your internal module set to?

mha1 commented 1 week ago

both slave and master radios are on 921k (my go to setting)

3djc commented 5 days ago

@mha1 could you test with this binary (beware, it is based on main, not 2.10). I don't seem to be able to reproduce with it

firmware.bin.zip

mha1 commented 5 days ago

@mha1 could you test with this binary (beware, it is based on main, not 2.10). I don't seem to be able to reproduce with it

firmware.bin.zip

Raphael's change? Looking good. I tested various packet rate combinations on the slave and master radio (even slave radio at 50Hz and F1000) and I also tested this with a fixed 10ms SBUS rate vs 100Hz packet rate on the master radio (SBUS period = Mixer period), which previously resulted in a lot of trainer lost messages. All looking good and I have the impression it's more lively and smoother than it ever was. Would be nice if @dirteat could confirm this in his environment.

3djc commented 5 days ago

Yes, it is smoother because we read as much as is available (on serial idle)
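As a rough illustration of "read as much as available", the sketch below drains a receive buffer on idle, decodes every complete frame it finds and keeps only the newest one. It is not the actual change in the test binary. It assumes the classic SBUS layout (25 bytes, start byte 0x0F, 16 channels packed as 11-bit little-endian values, one flags byte, end byte 0x00), and the names `drain_on_idle`, `decode_sbus` and `encode_sbus` are hypothetical.

```python
SBUS_FRAME_LEN = 25
SBUS_START, SBUS_END = 0x0F, 0x00   # classic SBUS framing (assumed)


def decode_sbus(frame):
    """Unpack the 16 channels (11 bits each) from one 25-byte frame."""
    packed = int.from_bytes(frame[1:23], "little")
    return [(packed >> (11 * i)) & 0x7FF for i in range(16)]


def encode_sbus(channels, flags=0):
    """Build a frame from 16 channel values (helper for the example)."""
    packed = 0
    for i, value in enumerate(channels):
        packed |= (value & 0x7FF) << (11 * i)
    return (bytes([SBUS_START]) + packed.to_bytes(22, "little")
            + bytes([flags, SBUS_END]))


def drain_on_idle(buffer):
    """Consume every complete frame in buffer.

    Returns (channels of the newest frame or None, leftover bytes).
    """
    latest = None
    pos = 0
    while len(buffer) - pos >= SBUS_FRAME_LEN:
        frame = buffer[pos:pos + SBUS_FRAME_LEN]
        if frame[0] == SBUS_START and frame[-1] == SBUS_END:
            latest = decode_sbus(frame)
            pos += SBUS_FRAME_LEN
        else:
            pos += 1                 # not aligned: skip one byte and resync
    return latest, buffer[pos:]


if __name__ == "__main__":
    old = encode_sbus([992] * 16)
    new = encode_sbus(list(range(100, 116)))
    # One stray byte, two full frames, then the start of a third frame.
    latest, rest = drain_on_idle(b"\x55" + old + new + new[:10])
    print(latest, len(rest))
```

Because the consumer always gets the newest complete frame and the partial tail is kept for the next call, a poll that lands mid-frame no longer costs a whole cycle.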

mha1 commented 5 days ago

Does it fit 2.10 too?