JMRI / JMRI

JMRI model railroad digital command & control software
https://www.jmri.org

Z21 long command delays #6400

Closed Johan-Dahlberg closed 5 years ago

Johan-Dahlberg commented 5 years ago

Hi!

I experience long delays when using Z21 for loco control. When Preferences->Defaults are set to Z21LocoNet, only about 10 commands can be sent to the Z21 per second. When Preferences->Defaults are set to Z21 or Z21XpressNet, only a single command can be sent to the Z21 per second.

When moving a loco throttle from 0 to max and back to 0, the commands queue up creating huge delays. This can be observed with the Z21 traffic monitor. (Far more than a minute for the loco to come to a full stop again...)

I don't experience this with an old Lenz system...

Does anybody have any idea on what may be wrong? /Johan

Edit: Tried several versions up to and including JMRI 4.14

pabender commented 5 years ago

@Johan-Dahlberg I see delays when I first acquire a locomotive, but once I have control, I don’t see significant delays. I see similar delays when I ask for a locomotive using a handheld throttle on the XpressNet or LocoNet.

Here are some questions that might help us get to the bottom of this:

What firmware version is loaded in your Z21? How is your network configured? Is your PC connected via WiFi or a wired Ethernet connection?

If we can’t figure this out, I have some ideas for improving throughput, but they will take a little work on my part.

pabender commented 5 years ago

Also, do you see similar behavior from the Z21 apps?

pabender commented 5 years ago

One other thought, as I sit here playing with my Z21... Do you have the "track slider in real time" option set on the throttle? If you do, turn that off; it generates a lot of extra traffic.

Johan-Dahlberg commented 5 years ago

Thanks for your response!

pabender commented 5 years ago

The track slider in real time is a choice on the speed control pane. You need to right-click on (or near) the speed slider to get to it.

There is a new firmware version, 1.33, that was released last week, though that isn’t going to be the problem.

Johan-Dahlberg commented 5 years ago

Thanks, I found the option, which was not checked. Indeed when clicking on the slider, dragging and releasing only a single speed setting message is generated - and thus only one 1-second delay. But this doesn't solve the throughput problem I expect we will end up with on our club layout.

For testing I used the mouse wheel for moving the slider, which generates a lot of messages to be sent. If I remember correctly, this is also the case when changing speed using the smartphone volume button, which is my preferred control method.

I will try firmware 1.33 once it reaches Google Play.

I don't really need an immediate quick fix. I just want to be sure the Z21 will be usable with JMRI within two-three months from now.

pabender commented 5 years ago

On Jan 8, 2019, at 3:34 PM, Johan-Dahlberg notifications@github.com wrote:

> For testing I used the mouse wheel for moving the slider, which generates a lot of messages to be sent. If I remember correctly, this is also the case when changing speed using the smartphone volume button, which is my preferred control method.

Ok, that explains it. The mouse wheel is basically doing the same thing the track slider in real time option does when you drag the mouse, i.e., it sends a speed change request for each movement.

I am surprised this wasn’t also an issue with a Lenz system. The Z21 throttle code really is the XpressNet throttle code (though it listens for a couple of additional replies a Lenz command station doesn’t generate).

> I don't really need an immediate quick fix. I just want to be sure the Z21 will be usable with JMRI within two-three months from now.

Ok. This is really a different problem than I was expecting. It will take some research to figure out how to solve it.

Johan-Dahlberg commented 5 years ago

I can send you Wireshark dump files if that could give any clue.

pabender commented 5 years ago

@Johan-Dahlberg the underlying issue is that we are looking at a message response system which uses UDP/IP as a transport mechanism.

I was able to reproduce the issue using the track slider in real-time option.

There are really only three things we have any control over that can cause delays:

  1. Sending more messages than we must (i.e., if the user has caused 3 speed change messages to be queued, we only really need to send the last one).

  2. Outbound processing delays, i.e., how long it takes between when the user clicks a button and when we send the message out on the wire.

  3. How much delay there is inbound.

All that really boils down to making the code smarter about how it processes the messages.
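To make the first point concrete, here is a minimal sketch of coalescing queued speed changes so that only the newest request per locomotive is ever sent. This is not JMRI code: the class `CoalescingSpeedQueue` and its methods are hypothetical names chosen for illustration, and a real throttle would carry direction and function state as well.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not part of JMRI): keeps at most one pending
 * speed request per loco address, always the most recent one.
 */
class CoalescingSpeedQueue {
    // Insertion-ordered map: loco address -> latest requested speed step.
    // Re-putting an existing key keeps its original position in the order.
    private final Map<Integer, Integer> pending = new LinkedHashMap<>();

    /** Queue a speed change; an unsent request for the same loco is overwritten. */
    synchronized void offer(int locoAddress, int speedStep) {
        pending.put(locoAddress, speedStep);
    }

    /** Remove and return the oldest pending entry as {address, speed}, or null if empty. */
    synchronized int[] poll() {
        Iterator<Map.Entry<Integer, Integer>> it = pending.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<Integer, Integer> entry = it.next();
        int[] result = {entry.getKey(), entry.getValue()};
        it.remove();
        return result;
    }

    /** Number of locos that currently have an unsent speed request. */
    synchronized int size() {
        return pending.size();
    }
}
```

With this kind of queue, rolling the mouse wheel through three speed steps for one loco before the transmitter gets its turn leaves a single entry holding the last step, so one message goes out instead of three.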

Johan-Dahlberg commented 5 years ago

Sorry for the long delay...

I updated and tested the Z21 today. Nothing in the delay changed. I also tried JMRI 4.15.2, but I could not notice any improvement, as I think you already know.

Regarding your thoughts on delays:

  1. Yes of course sending fewer messages can help. This is of crucial importance for a large layout. However, in my test setup there is only a single loco and a single throttle to control it. That is, there should be enough bandwidth to send every message.

  2. and 3. There really should not be any inherent delay in the outbound nor in the inbound processing anywhere close to the 1-second delay I am experiencing.

This leads me to think that some timeout occurs in the message response system.

Maybe JMRI sends message M1 to the Z21, which immediately responds with response R1. JMRI does not recognize R1 properly and keeps waiting for the "correct" response. After one second JMRI gives up waiting for the response and instead sends message M2 and the procedure repeats.
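The suspected behaviour can be sketched as a wait loop that discards any reply it does not recognize and only returns when the timeout runs out. This is a hypothetical illustration of the hypothesis, not the actual JMRI traffic controller; `ReplyWaiter`, `awaitReply`, and the string replies are made-up names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

/** Hypothetical sketch (not JMRI code) of a sender blocking on an expected reply. */
class ReplyWaiter {
    /**
     * Wait until a reply accepted by isExpected arrives, or until
     * timeoutMillis has elapsed. Unrecognized replies are dropped.
     * Returns true on a match, false on timeout or interruption.
     */
    static boolean awaitReply(BlockingQueue<String> replies,
                              Predicate<String> isExpected,
                              long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (true) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false; // gave up waiting
                }
                String reply = replies.poll(remaining, TimeUnit.NANOSECONDS);
                if (reply == null) {
                    return false; // nothing else arrived before the deadline
                }
                if (isExpected.test(reply)) {
                    return true; // the next message may be sent now
                }
                // Unrecognized reply: ignore it and keep waiting.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

If R1 arrives immediately but the predicate rejects it, every message costs the full timeout, which with a 1000 ms timeout would match the observed rate of about one command per second.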

/Johan

pabender commented 5 years ago

@Johan-Dahlberg I have a fix in #6566 that takes care of this.

The issue has to do with a difference in how most XpressNet command stations react to throttle commands vs how the Z21 reacts to throttle commands.

Not all of the standard XpressNet throttle response messages include an address, so you can't have multiple messages outstanding at a single time.

The Z21 uses the same XpressNet commands for speed and direction as every other XpressNet command station, but it responds with a Z21 specific reply that does include an address.

It turns out the message queue was causing much of the delay involved. I relaxed the release conditions from the queue, which removes most of the 1-second delay between throttle commands (it's down to a few milliseconds).
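To illustrate why an address in the reply matters, here is a rough sketch of tracking in-flight throttle commands per loco address. It is not the code from #6566, and `OutstandingByAddress` and its methods are hypothetical names; it only shows that once a reply names the loco it belongs to, several commands can be outstanding at once instead of strictly one.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch (not JMRI code): in-flight throttle commands keyed by loco address. */
class OutstandingByAddress {
    // Addresses that have a command sent but not yet answered.
    private final Set<Integer> awaiting = new HashSet<>();

    /** Record a sent command; returns false if this loco already has one in flight. */
    synchronized boolean markSent(int locoAddress) {
        return awaiting.add(locoAddress);
    }

    /** A reply that carries an address settles exactly that loco's command. */
    synchronized boolean onAddressedReply(int locoAddress) {
        return awaiting.remove(locoAddress);
    }

    /** Number of commands still waiting for a reply. */
    synchronized int inFlight() {
        return awaiting.size();
    }
}
```

A reply without an address cannot be matched this way, which is why the standard XpressNet case has to wait for each reply before sending the next command.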

Johan-Dahlberg commented 5 years ago

I tried it out from the sources just now - and it works!!! Thanks a lot for locating and fixing the bug!

Finally I can start planning how to continue the signal system on the club layout...

/Johan