driplineorg / dripline-python

python implementation of project8/dripline
Apache License 2.0
3 stars 6 forks source link

self.sync_children in EthernetSCPIService returns empty list #134

Closed raphaelcervantes closed 3 years ago

raphaelcervantes commented 3 years ago

I edited the EthernetSCPIService so that it prints out it's endpoints at the end of it's __init__ method.

        self._reconnect()
        logger.info('printing endpoints')
        logger.info(self.sync_children)

But self.sync_children just returns an empty list. You can see the output toward the end of the pasted logs.

➜  orpheus_site_atlas git:(feature/muxer_temp_calibration) ✗ kubectl logs -f na-dripline-python-deployment-5487ff9756-6crbt
2021-05-26 21:34:53 [ PROG] i/application.cc(134): Final configuration:

{
    dripline : 
    {
        alerts-exchange : alerts
        broker : rabbitmq-app.default.svc.cluster.local
        heartbeat-interval-s : 60
        heartbeat-routing-key : heartbeat
        loop-timeout-ms : 1000
        max-payload-size : 10000
        message-wait-ms : 1000
        requests-exchange : requests
    }

    runtime-config : 
    {
        auth_file : /etc/rabbitmq-secret/authentications.json
        cmd_at_reconnect : 
        [

            SYST:ERRor?
            *OPC?
        ]

        command_terminator : 

        endpoints : 
        [

            {
                base_str : :SENSe:FREQuency:STARt
                calibration : {}
                log_routing_key_prefix : sensor_double_precision
                module : SimpleSCPIEntity
                name : na_start_freq
            }

            {
                base_str : :SENSe:FREQuency:STOP
                calibration : {}
                log_routing_key_prefix : sensor_double_precision
                module : SimpleSCPIEntity
                name : na_stop_freq
            }

            {
                base_str : :SENSe:FREQuency:CENTer
                calibration : {}
2021-05-26T21:34:53[INFO    ] dripline(90) -> python got a master config:
{'dripline': {'alerts-exchange': 'alerts', 'broker': 'rabbitmq-app.default.svc.cluster.local', 'heartbeat-interval-s': 60, 'heartbeat-routing-key': 'heartbeat', 'loop-timeout-ms': 1000, 'max-payload-size': 10000, 'message-wait-ms': 1000, 'requests-exchange': 'requests'}, 'runtime-config': {'auth_file': '/etc/rabbitmq-secret/authentications.json', 'cmd_at_reconnect': [None, 'SYST:ERRor?', '*OPC?'], 'command_terminator': '\n', 'endpoints': [{'base_str': ':SENSe:FREQuency:STARt', 'calibration': '{}', 'log_routing_key_prefix': 'sensor_double_precision', 'module': 'SimpleSCPIEntity', 'name': 'na_start_freq'}, {'base_str': ':SENSe:FREQuency:STOP', 'calibration': '{}', 'log_routing_key_prefix': 'sensor_double_precision', 'module': 'SimpleSCPIEntity', 'name': 'na_stop_freq'}, {'base_str': ':SENSe:FREQuency:CENTer', 'calibration': '{}', 'module': 'SimpleSCPIEntity', 'name': 'na_center_freq'}, {'base_str': ':SENSe:FREQuency:SPAN', 'calibration': '{}', 'module': 'SimpleSCPIEntity', 'name': 'na_freq_span'}, {'base_str': ':SOUR1:POW', 'calibration': '{}', 'get_on_set': True, 'log_on_set': True, 'log_routing_key_prefix': 'sensor_double_precision', 'module': 'SimpleSCPIEntity', 'name': 'na_power'}, {'base_str': ':OUTPut', 'calibration': '{}', 'get_on_set': True, 'log_on_set': True, 'log_routing_key_prefix': 'sensor_double_precision', 'module': 'SimpleSCPIEntity', 'name': 'na_output_enable'}, {'base_str': ':SENS:AVER:COUNT', 'calibration': '{}', 'get_on_set': True, 'log_on_set': True, 'log_routing_key_prefix': 'sensor_double_precision', 'module': 'SimpleSCPIEntity', 'name': 'na_averages'}, {'base_str': ':SENS:AVER:STAT', 'calibration': '{}', 'get_on_set': True, 'log_on_set': True, 'log_routing_key_prefix': 'sensor_double_precision', 'module': 'SimpleSCPIEntity', 'name': 'na_average_enable'}, {'base_str': ':SENS1:SWE:POIN', 'calibration': '{}', 'get_on_set': True, 'log_on_set': True, 'module': 'SimpleSCPIEntity', 'name': 'na_sweep_points'}, {'base_str': ':SENS1:BAND:RES', 'calibration': '{}', 'get_on_set': True, 'log_interval': 600, 'log_on_set': True, 'module': 'SimpleSCPIEntity', 'name': 'na_if_band'}, {'calibration': '{}', 'get_str': ':TRIG:SOUR BUS;:TRIG:SING;:STAT:OPER:COND?', 'module': 'FormatEntity', 'name': 'na_trigger'}, {'calibration': 'False if (int({}) &  0b10000) else True', 'get_str': ':STAT:OPER:COND?', 'module': 'FormatEntity', 'name': 'na_sweep_finished'}, {'calibration': '[{}]', 'get_str': ':SENS1:FREQ:DATA?', 'module': 'FormatEntity', 'name': 'na_freq_data'}, {'module': 'FormatEntity', 'name': 'na_commands', 'set_str': '{}', 'set_value_map': {'autoscale': ':DISP:WIND1:TRAC1:Y:AUTO; *OPC?', 'clear_averages': ':SENS:AVER:CLEar; *OPC?'}}, {'calibration': '[{}]', 'get_str': ':CALC1:PAR1:DEF S21; :CALC1:TRAC1:DATA:SDAT?', 'log_routing_key_prefix': 'sensor_double_precision_arr', 'module': 'FormatEntity', 'name': 'na_s21_iq_data'}, {'calibration': '[{}]', 'get_str': ':CALC1:PAR2:DEF S21; :CALC1:TRAC2:DATA:SDAT?', 'log_routing_key_prefix': 'sensor_double_precision_arr', 'module': 'FormatEntity', 'name': 'na_s21_iq_data_trace2'}, {'calibration': '[{}]', 'get_str': ':CALC1:PAR1:DEF S11; :CALC1:TRAC1:DATA:SDAT?', 'log_routing_key_prefix': 'sensor_double_precision_arr', 'module': 'FormatEntity', 'name': 'na_s11_iq_data'}, {'calibration': '[{}]', 'get_str': ':CALC1:PAR2:DEF S11; :CALC1:TRAC2:DATA:SDAT?', 'log_routing_key_prefix': 'sensor_double_precision_arr', 'module': 'FormatEntity', 'name': 'na_s11_iq_data_trace2'}, {'calibration': '[{}]', 'get_str': ':CALC1:PAR1:DEF S21; :CALC1:TRAC1:DATA:SDAT?', 'log_routing_key_prefix': 'sensor_double_precision_arr', 'module': 'FormatEntity', 'name': 's21_iq_transmission_data'}, {'calibration': '[{}]', 'get_str': ':CALC1:PAR1:DEF S21; :CALC1:TRAC1:DATA:SDAT?', 'log_routing_key_prefix': 'sensor_double_precision_arr', 'module': 'FormatEntity', 'name': 's21_iq_reflection_data'}], 'module': 'EthernetSCPIService', 'name': 'na', 'reconnect_test': '+1', 'response_terminator': '\n', 'socket_info': '("192.168.25.7",5025)', 'socket_timeout': 3}}
                module : SimpleSCPIEntity
                name : na_center_freq
            }

            {
                base_str : :SENSe:FREQuency:SPAN
                calibration : {}
                module : SimpleSCPIEntity
                name : na_freq_span
            }

            {
                base_str : :SOUR1:POW
                calibration : {}
                get_on_set : true
                log_on_set : true
                log_routing_key_prefix : sensor_double_precision
                module : SimpleSCPIEntity
                name : na_power
            }

            {
                base_str : :OUTPut
                calibration : {}
                get_on_set : true
                log_on_set : true
                log_routing_key_prefix : sensor_double_precision
                module : SimpleSCPIEntity
                name : na_output_enable
            }

            {
                base_str : :SENS:AVER:COUNT
                calibration : {}
                get_on_set : true
                log_on_set : true
                log_routing_key_prefix : sensor_double_precision
                module : SimpleSCPIEntity
                name : na_averages
            }

            {
                base_str : :SENS:AVER:STAT
                calibration : {}
                get_on_set : true
                log_on_set : true
                log_routing_key_prefix : sensor_double_precision
                module : SimpleSCPIEntity
                name : na_average_enable
            }

            {
                base_str : :SENS1:SWE:POIN
                calibration : {}
                get_on_set : true
                log_on_set : true
                module : SimpleSCPIEntity
                name : na_sweep_points
            }

            {
                base_str : :SENS1:BAND:RES
                calibration : {}
                get_on_set : true
                log_interval : 600
                log_on_set : true
                module : SimpleSCPIEntity
                name : na_if_band
            }
2021-05-26T21:34:53[DEBUG   ] dripline(36) -> creating object from:
{'auth_file': '/etc/rabbitmq-secret/authentications.json', 'cmd_at_reconnect': [None, 'SYST:ERRor?', '*OPC?'], 'command_terminator': '\n', 'module': 'EthernetSCPIService', 'name': 'na', 'reconnect_test': '+1', 'response_terminator': '\n', 'socket_info': '("192.168.25.7",5025)', 'socket_timeout': 3}
...
2021-05-26T21:34:53[DEBUG   ] dripline.implementations.ethernet_scpi_service(52) -> Formatting socket_info: ("192.168.25.7",5025)

            {
                calibration : {}
                get_str : :TRIG:SOUR BUS;:TRIG:SING;:STAT:OPER:COND?
                module : FormatEntity
                name : na_trigger
            }

            {
                calibration : False if (int({}) &  0b10000) else True
                get_str : :STAT:OPER:COND?
                module : FormatEntity
                name : na_sweep_finished
            }

            {
                calibration : [{}]
                get_str : :SENS1:FREQ:DATA?
                module : FormatEntity
                name : na_freq_data
            }

            {
                module : FormatEntity
                name : na_commands
                set_str : {}
                set_value_map : 
                {
                    autoscale : :DISP:WIND1:TRAC1:Y:AUTO; *OPC?
                    clear_averages : :SENS:AVER:CLEar; *OPC?
                }

            }

            {
                calibration : [{}]
                get_str : :CALC1:PAR1:DEF S21; :CALC1:TRAC1:DATA:SDAT?
                log_routing_key_prefix : sensor_double_precision_arr
                module : FormatEntity
                name : na_s21_iq_data
            }

            {
                calibration : [{}]
                get_str : :CALC1:PAR2:DEF S21; :CALC1:TRAC2:DATA:SDAT?
                log_routing_key_prefix : sensor_double_precision_arr
                module : FormatEntity
                name : na_s21_iq_data_trace2
            }

            {
                calibration : [{}]
                get_str : :CALC1:PAR1:DEF S11; :CALC1:TRAC1:DATA:SDAT?
                log_routing_key_prefix : sensor_double_precision_arr
                module : FormatEntity
                name : na_s11_iq_data
            }

            {
                calibration : [{}]
                get_str : :CALC1:PAR2:DEF S11; :CALC1:TRAC2:DATA:SDAT?
                log_routing_key_prefix : sensor_double_precision_arr
                module : FormatEntity
                name : na_s11_iq_data_trace2
            }

            {
                calibration : [{}]
                get_str : :CALC1:PAR1:DEF S21; :CALC1:TRAC1:DATA:SDAT?
                log_routing_key_prefix : sensor_double_precision_arr
                module : FormatEntity
                name : s21_iq_transmission_data
            }

            {
                calibration : [{}]
                get_str : :CALC1:PAR1:DEF S21; :CALC1:TRAC1:DATA:SDAT?
                log_routing_key_prefix : sensor_double_precision_arr
                module : FormatEntity
                name : s21_iq_reflection_data
            }

        ]

        module : EthernetSCPIService
        name : na
        reconnect_test : +1
        response_terminator : 

        socket_info : ("192.168.25.7",5025)
        socket_timeout : 3
    }

}

2021-05-26 21:34:53 [ PROG] i/application.cc(135): Ordered args:

[
]

2021-05-26T21:34:53[INFO    ] dripline.implementations.ethernet_scpi_service(88) -> Ethernet socket ('192.168.25.7', 5025) established
2021-05-26T21:34:53[DEBUG   ] dripline.implementations.ethernet_scpi_service(98) -> Emptying reconnect buffer
2021-05-26T21:34:56[WARNING ] dripline.implementations.ethernet_scpi_service(196) -> socket.timeout condition met; received:
''
2021-05-26T21:34:56[DEBUG   ] dripline.implementations.ethernet_scpi_service(200) -> ''
2021-05-26T21:34:56[DEBUG   ] dripline.implementations.ethernet_scpi_service(160) -> sending: b'SYST:ERRor?\n'
2021-05-26T21:34:56[DEBUG   ] dripline.implementations.ethernet_scpi_service(200) -> '+0,"No error"\n'
2021-05-26T21:34:56[INFO    ] dripline.implementations.ethernet_scpi_service(174) -> sync: 'SYST:ERRor?\n' -> '+0,"No error"'
2021-05-26T21:34:56[DEBUG   ] dripline.implementations.ethernet_scpi_service(160) -> sending: b'*OPC?\n'
2021-05-26T21:34:56[DEBUG   ] dripline.implementations.ethernet_scpi_service(200) -> '+1\n'
2021-05-26T21:34:56[INFO    ] dripline.implementations.ethernet_scpi_service(174) -> sync: '*OPC?\n' -> '+1'
2021-05-26T21:34:56[INFO    ] dripline.implementations.ethernet_scpi_service(73) -> printing endpoints
2021-05-26T21:34:56[INFO    ] dripline.implementations.ethernet_scpi_service(74) -> {}
2021-05-26T21:34:56[INFO    ] dripline(103) -> service built
nsoblath commented 3 years ago

My first attempt at replicating this problem with key-value-store failed. I was able to print out the endpoints, so I did not replicate the problem. Here's my modification to dl-service.py at line 114, just before calling the_service.start():

        set_endpoints = the_service.sync_children
        logger.info(set_endpoints)

And the output:

key-value-store_1     | 2021-06-21T23:23:00[INFO    ] dripline(89) -> python got a master config:
key-value-store_1     | {'dripline': {'alerts-exchange': 'alerts', 'heartbeat-interval-s': 60, 'heartbeat-routing-key': 'heartbeat', 'loop-timeout-ms': 1000, 'max-payload-size': 10000, 'message-wait-ms': 1000, 'requests-exchange': 'requests'}, 'runtime-config': {'auth_file': '/root/authentications.json', 'endpoints': [{'calibration': '2*{}', 'get_on_set': True, 'initial_value': 0.75, 'log_interval': 10, 'log_on_set': True, 'module': 'KeyValueStore', 'name': 'peaches'}, {'calibration': 'times3({})', 'initial_value': 1.75, 'module': 'KeyValueStore', 'name': 'chips'}, {'calibration': '1.*{}', 'initial_value': 4.0, 'module': 'KeyValueStore', 'name': 'waffles'}], 'module': 'Service', 'name': 'my_store'}}
key-value-store_1     | 2021-06-21T23:23:00[INFO    ] dripline(102) -> service built
key-value-store_1     | 2021-06-21T23:23:00[INFO    ] dripline.core.entity(150) -> should start logging every 0:00:10
key-value-store_1     | 2021-06-21T23:23:00[INFO    ] dripline(113) -> about to start the service
key-value-store_1     | 2021-06-21T23:23:00[INFO    ] dripline(115) -> {'chips': <dripline.implementations.key_value_store.KeyValueStore object at 0x7f7d67acf360>, 'peaches': <dripline.implementations.key_value_store.KeyValueStore object at 0x7f7d6748b4a0>, 'waffles': <dripline.implementations.key_value_store.KeyValueStore object at 0x7f7d6748b540>}

The service died with a setfault immediately after that, so there's still something strange going on, but at least so far I haven't any issue with getting the endpoints.

raphaelcervantes commented 3 years ago

weird.

nsoblath commented 3 years ago

Tangent to this issue regarding the segfault I mentioned just above: it appears to be something similar to what I'd seen before when testing the receive-timeout fix using the Interface class. I got the service to run in GDB like this:

    command:
      - gdb
      - --batch
      - --ex
      - run
      - --ex
      - bt
      - --args
      - python
      - /usr/local/bin/dl-serve
      - -c
      - /root/key-value-store.yaml
      - -v

I see this at the top of the stack trace:

key-value-store_1     | 2021-06-21T23:41:10[INFO    ] dripline(102) -> service built
key-value-store_1     | 2021-06-21T23:41:10[INFO    ] dripline.core.entity(150) -> should start logging every 0:00:10
key-value-store_1     | 2021-06-21T23:41:10[INFO    ] dripline(113) -> about to start the service
key-value-store_1     | 2021-06-21T23:41:10[INFO    ] dripline(115) -> {'chips': <dripline.implementations.key_value_store.KeyValueStore object at 0x7f1e20480450>, 'peaches': <dripline.implementations.key_value_store.KeyValueStore object at 0x7f1e1fe365e0>, 'waffles': <dripline.implementations.key_value_store.KeyValueStore object at 0x7f1e1fe362c0>}
key-value-store_1     | 402 Objects/methodobject.c: No such file or directory.
key-value-store_1     | 
key-value-store_1     | Program received signal SIGSEGV, Segmentation fault.
key-value-store_1     | 0x00007f1e2264d138 in cfunction_enter_call (func=0x7f1e20615900) at Objects/methodobject.c:402
key-value-store_1     | #0  0x00007f1e2264d138 in cfunction_enter_call (func=0x7f1e20615900) at Objects/methodobject.c:402
key-value-store_1     | #1  cfunction_vectorcall_O (func=func@entry=0x7f1e20615900, args=0x7f1e21b0b208, nargsf=nargsf@entry=1, kwnames=0x0) at Objects/methodobject.c:478
key-value-store_1     | #2  0x00007f1e22620e7d in PyVectorcall_Call (callable=0x7f1e20615900, tuple=<optimized out>, kwargs=<optimized out>) at Objects/call.c:200
key-value-store_1     | #3  0x00007f1e2128f999 in ?? () from /usr/local/lib/python3.8/site-packages/_dripline.cpython-38-x86_64-linux-gnu.so
key-value-store_1     | #4  0x00007f1e212a0cbd in ?? () from /usr/local/lib/python3.8/site-packages/_dripline.cpython-38-x86_64-linux-gnu.so
key-value-store_1     | #5  0x00007f1e21406cde in std::ostream::flush() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
key-value-store_1     | #6  0x00007f1e2164087a in scarab::logger::Private::logCout (this=0x557404ee9940, level=scarab::logger::ELevel::eInfo, message="Connecting to <rabbit-broker:5672>", loc=...) at /usr/local/src/scarab/library/logger/logger.cc:175
key-value-store_1     | #7  0x00007f1e216400d1 in scarab::logger::Log (this=0x7f1e21245400 <_ZN8driplineL4dlogE>, level=scarab::logger::ELevel::eInfo, message="Connecting to <rabbit-broker:5672>", loc=...) at /usr/local/src/scarab/library/logger/logger.cc:346
key-value-store_1     | #8  0x00007f1e211c822d in dripline::service::start (this=0x5574053ab950) at /usr/local/src/library/service.cc:174
key-value-store_1     | #9  0x00007f1e2127bcff in ?? () from /usr/local/lib/python3.8/site-packages/_dripline.cpython-38-x86_64-linux-gnu.so
raphaelcervantes commented 3 years ago

So it looks like it just can't find a file, right?

nsoblath commented 3 years ago

So it would seem, but I'm not sure what it means, in this case. The error is coming via one particular logging line. When I encounter what appears to be the same problem using the Interface class, it looks similar, but a different origin line from the C++. In either case, though, by the time this happens its already printed a whole lot to the C++ logger. So I don't know what's unique about these lines, or what sort of situation the application has gotten itself into such that it finally crashes.

nsoblath commented 3 years ago

The segfault was fixed a few versions ago (it was a python binding problem), and I believe we've determined that the empty list of synchronous children is due to the lack of startup commands, which is addressed in #136.

I'm going to close this issue. If there's something that needs to be addressed that's not in other issues or already solved, please reopen.