JMRI Turnout behavior may be highlighting OpenLCB Java bug

bobjacobsen commented 7 years ago

The following is from a JMRIusers discussion on JMRI's behavior when setting something (e.g. the Producer/Consumer pair that implements a JMRI Turnout or Sensor) to the same state as it already has.

If a Logix or LRoute requests a specific LCC Turnout state no message is sent on the LCC network if the Turnout Table already indicates that state. By contrast Loconet Turnouts will unconditionally send the requested Turnout message. This behavior of the LCC Turnout Table makes authoritative initialization impossible and subsequent Routes may have errors. Using jmri 47.2 and 1.8.0 java.

The usual structure of the *Turnout classes has the forwardCommandChangeToLayout(..) routine take the new commanded state (e.g. THROWN or CLOSED) and handle communications with the on-layout hardware. For the LocoNet, it’s here:

https://github.com/JMRI/JMRI/blob/master/java/src/jmri/jmrix/loconet/LnTurnout.java#L112

and always calls sendOpcSwReqMessage(..); (see line 115) which actually formats and sends the LocoNet messages (see line 143).

For OpenLCB, the corresponding JMRI code is here:

https://github.com/JMRI/JMRI/blob/master/java/src/jmri/jmrix/openlcb/OlcbTurnout.java#L118

It’s structured a bit differently than the LocoNet version, but it doesn’t look at the prior state before invoking underlying code in the org.openlcb library via a setFromOwner(..) call on a org.openlcb.implementations.VersionedValueListener object. That call is always made, so if it’s sending sometimes and not others, I think that decision must be made in the org.openlcb code.

This looks like a problem with the OpenLCB library, not JMRI.

balazsracz commented 7 years ago

Is there a situation in which that line should not send an event to the bus? In other words, is there de-duplication happening inside JMRI somewhere in a layer above this? Or does JMRI maybe not care about duplication at all, and just send messages under all circumstances, even if the last state is already matching the new commanded state?

bobjacobsen commented 7 years ago

Note: I haven't recreated the original problem request with real hardware. I no longer have any.

Is there a situation in which that line should not send an event to the bus? In other words, is there de-duplication happening inside JMRI somewhere in a layer above this?

Shouldn't be. I haven't confirmed that, bugs happen, but there shouldn't be.

Or does JMRI maybe not care about duplication at all, and just send messages under all circumstances, even if the last state is already matching the new commanded state?

If there is de-duplication happening, that's system-specific. Each system connection package does the final implementation for itself. Most (including jmri.openlcb) use the standard abstract base classes, which don't do any de-duplication.

You can check this by creating a test turnout, call it "test" for a user name, then execute the python script turnouts.getTurnout("test").setState(CLOSED) multiple times. With most systems, that'll send multiple times. With OpenLCB (tested in simulator mode) it calls forwardCommandChangeToLayout(..) properly every time you run it, but messages are only emitted when there's a change of state.

OpenLCB (used to be) based on a distributed state model, where events serve as transition notifications across a network of equals. As part of that, JMRI's Sensors and Turnouts listen for OpenLCB messages that'd drive state transitions, and send OpenLCB messages when requested.

verifywriteprogrammerfacade-state-diagram

balazsracz commented 7 years ago

I wrote code for de-duplication inside the OpenLCB library. That would prevent events being emitted and callbacks being invoked for the loop edges on the quoted diagram. This means I need to implement a feature that allows disabling the de-duplication code. That sounds doable.

JMRI would want to permanently disable de-duplication?

bobjacobsen commented 7 years ago

What was the reason for the de-duplication?

Haven't gone back to check, but it seems to me that repetitive transitions (e.g. production of "duplicate" Produced Event messages) was a feature of the OpenLCB distributed-state model.

IIRC, there was even an example of the need around night and day vs all-on and all-off lighting controls, though perhaps I've misremembered something.

JMRI would want to permanently disable de-duplication?

"JMRI" as such, e.g. the JMRI code, doesn't care. It doesn't require anything in this area. It's just trying to implement the semantics of each system that it connects to. The place to sort out the semantics for an OpenLCB turnout might be wherever the discussions were that resulted in David Parks raising the original question on JMRIusers.

balazsracz commented 7 years ago

There was no change in the OpenLCB library. JMRI turnouts and sensors have not previously used the library at all [1]; that's why there were so many standards compliancy issues with them [1]. Switching to the library meant that the "network state" of the turnout was also managed by the library, and part of this state management is a thread-safe transactional state object supporting callbacks. This state object has always deduplicated updates.

I understand if JMRI wants to not do that and will add a new feature for it.

I'm not sure that the advanced examples of PC model such as night-dawn-day et al are well suited to be represented in JMRI with turnout objects. One problem is that there is no UI for a turnout to emit the same request as it had previously, but turnout UIs (e.g. panel, turnouts table) cycle through the two possible states. So an end user would be forced to toggle back and forth (which, incidentally, is perfectly well supported today). Much rather a single-event sensor object should be used which sets the sensor state back to inactive after producing the event. This is fully supported by the library and current implementation (as well as covered by tests).

[1] https://github.com/JMRI/JMRI/commit/0cca9543a5655381c2ebf134408da5ba8edf2c71

bobjacobsen commented 7 years ago

Thanks in advance for working on this.

As to the rest, let's just leave it that I disagree on the various assertions about JMRI's capabilities and how it's used. There's more there than you describe.

balazsracz commented 7 years ago

@bobjacobsen does this issue also impact sensor objects in JMRI? They share the codepath in the OpenLCB library, wondering if I need to change them both.

balazsracz commented 7 years ago

This is fixed in JMRI head now.

openlcb / OpenLCB_Java

JMRI Turnout behavior may be highlighting OpenLCB Java bug #80