mdavidsaver / pvxs

PVA protocol client/server library and utilities.
https://mdavidsaver.github.io/pvxs/

Problem when monitoring NT PV with value field as "any" #38

Closed joaopaulosm closed 1 year ago

joaopaulosm commented 1 year ago

I have a test program (IOC) that monitors a set of NT PVs which are made of many structures and a "value" field that is set to "any" (the PVs were created using QSRV info tags). I noticed that when moving from PVXS 1.0.1 to 1.1.3 the monitor event crashed, yielding the following stack trace in the IOC console:

 LabS-ICS:SC-IOC-414:TstGrp Connected to 172.30.4.228:42204
2023-03-28T23:33:36.766608666 CRIT pvxs.tcp.io Server Error while processing cmd 0x0d: No such field
Dumping a stack trace of thread 'PVXCTCP':
[    0x7f844f2ba84b]: /opt/epics-vanilla/base-7.0.7/lib/linux-x86_64/libCom.so.3.22.0(epicsStackTrace+0x4b)
[    0x7f844f5fdf8a]: /opt/epics-vanilla/pvxs-1.1.3/lib/linux-x86_64/libpvxs.so.1.1(_ZN4pvxs6detailL12_log_vprintfEjPKcP13__va_list_tag+0x84)
[    0x7f844f5fe034]: /opt/epics-vanilla/pvxs-1.1.3/lib/linux-x86_64/libpvxs.so.1.1(_ZN4pvxs6detail11_log_printfEjPKcz+0xa3)
[    0x7f844f67fcdb]: /opt/epics-vanilla/pvxs-1.1.3/lib/linux-x86_64/libpvxs.so.1.1(_ZN4pvxs4impl8ConnBase7bevReadEv+0xb11)
[    0x7f844f67fef7]: /opt/epics-vanilla/pvxs-1.1.3/lib/linux-x86_64/libpvxs.so.1.1(_ZN4pvxs4impl8ConnBase8bevReadSEP11buffereventPv+0x51)
[    0x7f844f05ef7c]: /lib64/libevent_core-2.0.so.5(_bufferevent_decref_and_unlock+0x23c)
[    0x7f844f055495]: /lib64/libevent_core-2.0.so.5(event_base_loop+0x865)
[    0x7f844f65894e]: /opt/epics-vanilla/pvxs-1.1.3/lib/linux-x86_64/libpvxs.so.1.1(_ZN4pvxs4impl6evbase3Pvt3runEv+0x354)
[    0x7f844f2af759]: /opt/epics-vanilla/base-7.0.7/lib/linux-x86_64/libCom.so.3.22.0(epicsThreadCallEntryPoint+0x69)
[    0x7f844f2b45ea]: /opt/epics-vanilla/base-7.0.7/lib/linux-x86_64/libCom.so.3.22.0(start_routine+0xda)
[    0x7f844e345ea5]: /lib64/libpthread.so.0(start_thread+0xc5)
[    0x7f844e85bb0d]: /lib64/libc.so.6(clone+0x6d)
LabS-ICS:SC-IOC-414:TstGrp Disconnected

I tested again with PVXS 1.0.1 and everything was OK. Then I started testing with pvxmonitor from PVXS >= 1.0.1 and managed to reproduce the same problem.

**To Reproduce**

Steps to reproduce the behavior:

  1. Create a simple IOC with a "group" PV, something like the database below:

    
    record(calc, "$(P)$(R)$(NAME)A") {
        field(CALC, "A+1")
        field(INPA, "$(P)$(R)$(NAME)A")
        field(SCAN, ".5 second")

        info(Q:group, {
            "$(P)$(R)$(NAME)Grp": {
                "A": {+type:"any", +channel:"VAL"}
            }
        })
    }

    record(calc, "$(P)$(R)$(NAME)B") {
        field(CALC, "A+1")
        field(INPA, "$(P)$(R)$(NAME)B")
        field(SCAN, "1 second")

        info(Q:group, {
            "$(P)$(R)$(NAME)Grp": {
                "B": {+type:"any", +channel:"VAL"}
            }
        })
    }

  2. Try to monitor the group PV using `pvxmonitor` from PVXS 1.0.1, then from 1.1.0 onwards.
  3. With PVXS 1.0.1, the monitor should print:

    FOO:BAR:PVNAME Connected to 192.168.0.1:PORTNUM
    FOO:BAR:PVNAME
    struct
    record struct
    record._options struct
    record._options.queueSize uint32_t = 0
    record._options.atomic bool = false
    A any
    A-> double = [some number]
    B any
    B-> double = [some number]

With PVXS 1.1.0, the monitor breaks with the following error:

    CRIT pvxs.tcp.io Server Error while processing cmd 0x0d: No such field

The same happens for PVXS > 1.1.0.

**Information (please complete the following):**
 - PVXS Version or Git commit ID: tested with all releases from 1.0.1 through 1.1.3. The problem first appears in 1.1.0.
 - EPICS Base Version: 7.0.7
 - libevent Version: libevent 2.0.21-stable
 - EPICS_HOST_ARCH: linux-x86_64
 - Host OS: CentOS 7
 - Compiler version: gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)

**Additional context**
If, in the given example of a group PV, we change the info tag of the group member from `{+type:"any", +channel:"VAL"}` to `{+channel:"VAL"}`, the problem disappears.
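
For reference, a sketch of that workaround applied to record A from the reproduction database. Only the member mapping inside the info tag changes; everything else is copied from the example above. My understanding, which I have not verified against the QSRV sources, is that a member without `+type` falls back to the default mapping (`"scalar"`), so the group PV structure will not be identical to the `"any"` version.

```
record(calc, "$(P)$(R)$(NAME)A") {
    field(CALC, "A+1")
    field(INPA, "$(P)$(R)$(NAME)A")
    field(SCAN, ".5 second")

    info(Q:group, {
        "$(P)$(R)$(NAME)Grp": {
            # +type:"any" removed; the member now uses the default mapping
            "A": {+channel:"VAL"}
        }
    })
}
```

The same change would apply to the "B" member of record B.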

Michael, any guidance on how I can be helpful is appreciated.

mdavidsaver commented 1 year ago

> Michael, any guidance on how I can be helpful is appreciated.

As it happens, I may have fixed this last week with 92fb0a4afa5054b6cc3e8eb395facaa337d1c454. Are you able to test with the master branch?

joaopaulosm commented 1 year ago

I was hoping to go to sleep and wake up with the fix ready, but this time you surpassed all expectations by fixing it BEFORE the issue was reported =D

Jokes aside, yes, I did test with the master branch and it fixed the issue completely. Thanks again, Michael!

mdavidsaver commented 1 year ago

Ok :) I do wish I had noticed this issue a few hours before making the 1.1.3 release rather than a few hours afterwards. A 1.1.4 with this fix will likely happen within the next few days.

mdavidsaver commented 1 year ago

Included in 1.1.4.

juanfem commented 1 year ago

I was going to report this issue now and found out that I just missed the last update... Thanks!