dnp3 / opendnp3

DNP3 (IEEE-1815) protocol stack. Modern C++ with bindings for .NET and Java.
https://dnp3.github.io
Apache License 2.0

Unsolicited enabled and multiple outstations on single connection leads to occasional deadlocks on startup. #410

Closed (txjmb closed 2 years ago)

txjmb commented 3 years ago

I have an outstation application that presents multiple outstation instances on a single connection. Fairly frequently, the app locks up after enabling the second outstation (of many). On a whim, I changed the "allowUnsolicited" flag in the stack config to false, and the problem hasn't happened since. One thing that is clear in the logs is that when the master connects after the first outstation is enabled, the server logs a series of "Frame w/ unknown route" warnings (the master is polling for all of the outstations, and not all of them have been enabled at that point), so there may be a deadlock in the server/outstation handshake in this scenario.

Let me know if you have any additional questions. I am using OpenDNP3 through a Python wrapper, but I can try to make a self-contained repro if it is necessary.
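To make the "unknown route" scenario concrete, here is a minimal Python sketch of a channel that routes incoming frames by link-layer destination address. It is an illustrative model only, not opendnp3's actual router: the `Channel` class, its methods, and the name `outstation_2` are hypothetical, and keying the route on the destination address alone is an assumption. The addresses (master 10, outstations 1 and 11) are taken from the log below.

```python
# Illustrative model only: NOT opendnp3's router implementation.
# Assumption: routes are keyed by the outstation's local link address.
class Channel:
    def __init__(self):
        self.enabled = {}    # local link address -> outstation name
        self.warnings = []   # collected warning messages

    def enable(self, name, local_addr):
        """Register an outstation so frames addressed to it can be routed."""
        self.enabled[local_addr] = name

    def on_frame(self, source, dest):
        """Return the outstation that should receive the frame, or None."""
        name = self.enabled.get(dest)
        if name is None:
            self.warnings.append(
                f"Frame w/ unknown route, source: {source}, dest {dest}"
            )
        return name


channel = Channel()
channel.enable("outstation_1", 1)

# The master (address 10) polls addresses 11 and 1, but only 1 is enabled.
routed = [channel.on_frame(10, dest) for dest in (11, 1)]
```

In this model the frame addressed to 11 is dropped with a warning, while the frame addressed to 1 reaches `outstation_1`. Once a second outstation is enabled at address 11, the same frame would be routed instead of warned about.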

channel state change: OPEN
ms(1608763953606) --AL->  outstation_1 - F0 82 80 00
ms(1608763953606) --AL->  outstation_1 - FIR: 1 FIN: 1 CON: 1 UNS: 1 SEQ: 0 FUNC: UNSOLICITED_RESPONSE IIN: [0x80, 0x00]
ms(1608763953606) --TL->  outstation_1 - FIR: 1 FIN: 1 SEQ: 0 LEN: 4
ms(1608763953606) --LL->  outstation_1 - Function: PRI_UNCONFIRMED_USER_DATA Dest: 10 Source: 1 Length: 5
ms(1608763953606) --LL->  outstation_1 - 05 64 0A 44 0A 00 01 00 3F 05
ms(1608763953606) --LL->  outstation_1 - C0 F0 82 80 00 6B 7D
ms(1608763953607) <-LL--  server - Function: PRI_UNCONFIRMED_USER_DATA Dest: 11 Source: 10 Length: 20
ms(1608763953607) <-LL--  server - 05 64 14 C4 0B 00 0A 00 58 F0
ms(1608763953607) <-LL--  server - C0 CE 01 3C 02 06 3C 03 06 3C 04 06 3C 01 06 53 6D
ms(1608763953607) WARN    server - Frame w/ unknown route, source: 10, dest 11
ms(1608763953607) <-LL--  server - Function: PRI_UNCONFIRMED_USER_DATA Dest: 1 Source: 10 Length: 8
ms(1608763953607) <-LL--  server - 05 64 08 C4 01 00 0A 00 AD 62
ms(1608763953607) <-LL--  server - C0 D0 00 1B 49
ms(1608763953607) <-TL--  outstation_1 - FIR: 1 FIN: 1 SEQ: 0 LEN: 2
ms(1608763953607) <-AL--  outstation_1 - D0 00
ms(1608763953607) <-AL--  outstation_1 - FIR: 1 FIN: 1 CON: 0 UNS: 1 SEQ: 0 FUNC: CONFIRM
2020-12-23 16:52:35,437 __main__        DEBUG   Enabling the outstation outstation_1. Traffic will now start to flow.

After the second outstation is enabled, the deadlock occurs.
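For readers following the hex dumps in the log above, the sketch below decodes the 10-byte DNP3 link-layer header and the application control byte. The function names are hypothetical helpers written for this illustration, not part of opendnp3 or the Python wrapper, and the header CRC is not validated here.

```python
def parse_link_header(frame_hex):
    """Decode the 10-byte DNP3 link header (CRC bytes are not checked)."""
    raw = bytes.fromhex(frame_hex)
    if len(raw) < 10 or raw[0:2] != b"\x05\x64":
        raise ValueError("not a DNP3 link header")
    control = raw[3]
    return {
        "length": raw[2],                    # length field as sent on the wire
        "dir": bool(control & 0x80),         # set when the master is sending
        "prm": bool(control & 0x40),         # frame from the primary station
        "function": control & 0x0F,          # 4 = unconfirmed user data when PRM is set
        "dest": int.from_bytes(raw[4:6], "little"),
        "source": int.from_bytes(raw[6:8], "little"),
    }


def parse_app_control(byte):
    """Decode the application-layer control byte."""
    return {
        "fir": bool(byte & 0x80),
        "fin": bool(byte & 0x40),
        "con": bool(byte & 0x20),   # confirmation requested
        "uns": bool(byte & 0x10),   # unsolicited sequence
        "seq": byte & 0x0F,
    }


# Frames copied from the log above.
poll = parse_link_header("05 64 14 C4 0B 00 0A 00 58 F0")
unsol = parse_link_header("05 64 0A 44 0A 00 01 00 3F 05")
unsol_ctrl = parse_app_control(0xF0)    # first byte of "F0 82 80 00"
confirm_ctrl = parse_app_control(0xD0)  # first byte of "D0 00"
```

Decoded this way, the first frame has destination 11 and source 10, which is the frame the server rejects with "Frame w/ unknown route". The control byte 0xF0 has FIR, FIN, CON, and UNS all set with sequence 0, matching the UNSOLICITED_RESPONSE line, and 0xD0 is the same with CON cleared, matching the master's CONFIRM.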

jadamcrain commented 3 years ago

We would need a sample in C++ or one of the language bindings we officially support. The Python wrapper is unaffiliated.

txjmb commented 3 years ago

Makes sense. I'll work on creating a repro in Java or C++.

jadamcrain commented 3 years ago

Just as an FYI, in case DNP3 + Python is critical to a long-term project or product:

https://stepfunc.io/blog/opendnp3-retrospective/

Our new technology stack is Rust -> C ABI -> language bindings.

Language bindings have been a massive pain point for us with OpenDNP3, and we're addressing this going forward with a universal binding generator:

https://stepfunc.io/blog/bindings/

We currently support C headers, .NET Core via PInvoke, and Java via JNI. Python has been the next most commonly requested language binding. Writing a Python backend generator would give us Python support for all our libraries.