tango-controls / cppTango

Moved to gitlab
http://tango-controls.org

Limit of six device classes in a device server - FileDatabase::DbGetClassProperty bug #816

Closed DrewDevereux closed 2 years ago

DrewDevereux commented 3 years ago

After hitting an "unidentifiable C++ exception" when trying to add a device to PyTango's DeviceTestContext, I dug into the code and eventually found that a device server can have any number of device instances, but of at most six device classes. As soon as a seventh device class is added, the device server cannot run. The following Python code illustrates the problem: the device server runs successfully if include_seventh_device is False, and fails to start if it is set to True.

from tango import Database
from tango.server import Device, run
from tango.test_context import get_host_ip

class A(Device): pass
class B(Device): pass
class C(Device): pass
class D(Device): pass
class E(Device): pass
class F(Device): pass
class G(Device): pass

# Write device info to text file database
with open("db_file", "w") as f:
    f.write("Debug/debug/DEVICE/A:test/a/1\n")
    f.write("Debug/debug/DEVICE/B:test/b/1\n")
    f.write("Debug/debug/DEVICE/C:test/c/1\n")
    f.write("Debug/debug/DEVICE/D:test/d/1\n")
    f.write("Debug/debug/DEVICE/E:test/e/1\n")
    f.write("Debug/debug/DEVICE/F:test/f/1\n")
    f.write("Debug/debug/DEVICE/G:test/g/1\n")

include_seventh_device = False

# run server
run(
    [A, B, C, D, E, F] + [G]*include_seventh_device,
    args=[
        "Debug",
        "debug",
        "-ORBendPoint",
        f"giop:tcp:{get_host_ip()}:57061",
        "-file=db_file"
    ]
)

t-b commented 3 years ago

I dug into the code, and eventually found that a device server can have any number of device instances of up to six device classes.

Do you have a link/code place to look at?

ajoubertza commented 3 years ago

If I change the example slightly, I can see a little more detail. Exception from the PyTango extension (boost) layer. I haven't narrowed it down any further yet.

Change (-v5 and raises):

run(
    [A, B, C, D, E, F] + [G] * include_seventh_device,
    args=[
        'Debug',
        'debug',
        '-ORBendPoint', f'giop:tcp:{get_host_ip()}:57061',
        f'-file=db_file',
        '-v 5',
    ],
    raises=True,
)

(env-py3.8-tango9.3.4) root@db205f206a90:/opt/project/examples/tmp# python six-limit.py
Entering Logging::init
    TANGO_LOG_PATH is /tmp/tango-root
    cmd line logging level is 5
    Logging::create_log_dir(/tmp/tango-root/Debug/debug) returned -1
    added console target (logging level set from cmd line)
Leaving Logging::init
1605878005 [140090933802816] DEBUG dserver/Debug/debug Connected to database
1605878005 [140090933802816] DEBUG dserver/Debug/debug DbServerCache unavailable, will call db...
1605878005 [140090933802816] DEBUG dserver/Debug/debug calling Tango::NotifdEventSupplier::create() 

1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetDeviceList
1605878005 [140090933802816] DEBUG dserver/Debug/debug Failed to import EventChannelFactory from the Device Server property file
1605878005 [140090933802816] DEBUG dserver/Debug/debug Notifd event will not be generated
Can't create notifd event supplier. Notifd event not available
1605878005 [140090933802816] DEBUG dserver/Debug/debug calling Tango::ZmqEventSupplier::create()
1605878005 [140090933802816] DEBUG dserver/Debug/debug Heartbeat thread Id = 4
1605878005 [140090933802816] DEBUG dserver/Debug/debug Tango object singleton constructed
1605878005 [140090601375488] DEBUG dserver/Debug/debug ----------> Time = 603878005,885049 Store sub device property data if needed!
1605878005 [140090601375488] DEBUG dserver/Debug/debug Entering tuning list. The list has 1 item(s)
1605878005 [140090601375488] DEBUG dserver/Debug/debug Sub device property storage, next wake_up at 603879805,883874
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug doc_url property for class DServer is not defined in database
1605878005 [140090601375488] DEBUG dserver/Debug/debug Sleep for : 1799998
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassAttribute constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassPipe constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Device name : dserver/Debug/debug
1605878005 [140090933802816] DEBUG dserver/Debug/debug SubDevDiag::set_associated_device() entering ... 
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetDeviceProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiAttribute class constructor for device dserver/Debug/debug
1605878005 [140090933802816] DEBUG dserver/Debug/debug 0 attribute(s)
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving MultiAttribute class constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassPipe::init_class_pipe
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving MultiClassPipe::init_class_pipe
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering end_pipe_config for device dserver/Debug/debug
1605878005 [140090933802816] DEBUG dserver/Debug/debug 0 pipe(s)
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving end_pipe_config for device dserver/Debug/debug
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering DeviceImpl::init_logger
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering DeviceImpl::get_logger_i
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving DeviceImpl::get_logger_i
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetDeviceProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving DeviceImpl::init_logger
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetDeviceAttributeProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving get_device_attribute_property
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetDeviceAttributeProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving get_device_attribute_property
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: FileDatabase destructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: FileDatabase constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering parse_res_file
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetDeviceProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug DServer::DSserver() create dserver dserver/Debug/debug
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug doc_url property for class A is not defined in database
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassAttribute constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassPipe constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug doc_url property for class B is not defined in database
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassAttribute constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassPipe constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug doc_url property for class C is not defined in database
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassAttribute constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassPipe constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug doc_url property for class D is not defined in database
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassAttribute constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassPipe constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug doc_url property for class E is not defined in database
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassAttribute constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassPipe constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug doc_url property for class F is not defined in database
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassAttribute constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering MultiClassPipe constructor
1605878005 [140090933802816] DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering DeviceImpl destructor for device dserver/Debug/debug
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving DeviceImpl destructor for device dserver/Debug/debug
1605878005 [140090933802816] DEBUG dserver/Debug/debug Entering DeviceClass destructor for class DServer
1605878005 [140090933802816] DEBUG dserver/Debug/debug Leaving DeviceClass destructor for class DServer
Traceback (most recent call last):
  File "six-limit.py", line 38, in <module>
    run(
  File "/opt/project/tango/server.py", line 1495, in run
    return server_run()
  File "/opt/project/tango/server.py", line 1355, in __server_run
    worker.run(tango_loop, wait=True)
  File "/opt/project/tango/green.py", line 109, in run
    return fn(*args, **kwargs)
  File "/opt/project/tango/server.py", line 1349, in tango_loop
    util.server_init()
  File "/opt/project/tango/globals.py", line 116, in class_factory
    device_class = device_class_class(tango_device_class_name)
  File "/opt/project/tango/server.py", line 413, in __init__
    DeviceClass.__init__(self, name)
  File "/opt/project/tango/device_class.py", line 269, in __DeviceClass__init__
    DeviceClass.__init_orig__(self, name)
RuntimeError: unidentifiable C++ exception

t-b commented 3 years ago

Thanks @ajoubertza. You see a line FILEDATABASE: entering DbGetClassProperty but no line FILEDATABASE: ending DbGetClassProperty for the last device class. So something in that function breaks. Here it is: https://github.com/tango-controls/cppTango/blob/b6dd418e77f7bd6d40d40c89f86c200c8b76d4e3/cppapi/client/filedatabase.cpp#L1601.
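
The same check can be done mechanically on a saved log. The following is only a minimal sketch: the helper name unfinished_calls and the sample lines are made up for illustration, and it assumes the log uses exactly the "FILEDATABASE: entering ..." / "FILEDATABASE: ending ..." wording shown in the output above. Note that some FileDatabase methods in that log never print an "ending" line at all, so the count is only meaningful for functions that normally print both.

```python
import re


def unfinished_calls(log_lines, function_name):
    """Count 'entering' lines for function_name with no later 'ending' line."""
    name = re.escape(function_name)
    entering = re.compile(rf"FILEDATABASE: entering {name}\s*$")
    ending = re.compile(rf"FILEDATABASE: ending {name}\s*$")
    open_calls = 0
    for line in log_lines:
        if entering.search(line):
            open_calls += 1
        elif ending.search(line) and open_calls > 0:
            open_calls -= 1
    return open_calls


# Hypothetical excerpt in the same shape as the log above.
sample_log = [
    "DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty",
    "DEBUG dserver/Debug/debug FILEDATABASE: ending DbGetClassProperty",
    "DEBUG dserver/Debug/debug FILEDATABASE: entering DbGetClassProperty",
    "DEBUG dserver/Debug/debug Entering DeviceImpl destructor for device dserver/Debug/debug",
]
dangling = unfinished_calls(sample_log, "DbGetClassProperty")
```

A non-zero result for DbGetClassProperty points at the call that never returned normally.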

Is it possible to build pytango against a debug version of cppTango? If yes how would I do that?

ajoubertza commented 3 years ago

@t-b Thanks.

Is it possible to build pytango against a debug version of cppTango? If yes how would I do that?

Yes. PyTango (under Linux at least) is dynamically linked against the libtango.so.9 file, so if you have a debug version of that, then it is already done. In general, to build the code: python setup.py build, if you have the source checked out. Alternatively, doing a pip install will build it as well: python -m pip install pytango.

The conda installation of cpptango is a debug version, so I did a little more digging with gdb. It looks like the line that causes the error is this one: (*data_out)[index] = Tango::string_dup((*data_in)[i + 1].in());index++;

i has value 6, but the data_in sequence only has 7 elements, so accessing index 7 raises the C++ exception. The elements are: G (class name), doc_url, cvs_tag, cvs_location, AllowedAccessCmd, svn_tag, svn_location. Indexing/variable name problem.
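
To make the arithmetic concrete, here is a small Python sketch of the same situation. It is not the cppTango code: the list simply mirrors the seven elements listed above, and element_at is a hypothetical helper that reports an out-of-range access instead of throwing.

```python
# The seven elements of data_in as observed in gdb (class name first).
data_in = [
    "G",
    "doc_url",
    "cvs_tag",
    "cvs_location",
    "AllowedAccessCmd",
    "svn_tag",
    "svn_location",
]
i = 6  # value of i observed in gdb at the failing line


def element_at(seq, index):
    """Return seq[index], or None when index is outside the sequence."""
    return seq[index] if 0 <= index < len(seq) else None


last_valid_index = len(data_in) - 1              # highest index that exists
out_of_range = element_at(data_in, i + 1) is None  # i + 1 is one past the end
```

With seven elements the valid indices are 0 to 6, so i + 1 = 7 is one past the end.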

t-b commented 3 years ago

A wild guess would be that i -> j in this line.

ajoubertza commented 3 years ago

I'm transferring this issue to cppTango. Would be good to have this fix backported to 9.3.x as well.

bourtemb commented 3 years ago

A wild guess would be that i -> j in this line.

At first sight, I would replace (*data_out)[index] = Tango::string_dup((*data_in)[i + 1].in());index++; with (*data_out)[index] = Tango::string_dup((*data_in)[j].in());index++;

as it is done in FileDatabase::DbGetDeviceProperty so i+1 -> j

But deeper investigation is required, because the code has no doxygen documentation listing what these methods take as input and what they return. We need to refer to the equivalent commands in the Tango Database server...

bourtemb commented 3 years ago

If we refer to the DbGetClassProperty command in the Tango Database server, here are the expected argin and argout arguments:

Argin description:
Str[0] = Tango class
Str[1] = Property name
Str[2] = Property name
Argout description:
Str[0] = Tango class
Str[1] = Property number
Str[2] = Property name
Str[3] = Property value number (array case)
Str[4] = Property value
Str[n] = Property value (array case)
....
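
As an illustration of the argout layout listed above, here is a minimal Python sketch that decodes such a flat string list. The function name and the sample reply are hypothetical. The description does not say whether a property with zero values carries a placeholder element, so this sketch assumes it does not.

```python
def parse_class_property_reply(argout):
    """Decode [class, n_props, (name, n_values, value, ...), ...] into a dict."""
    class_name = argout[0]
    n_props = int(argout[1])
    pos = 2
    props = {}
    for _ in range(n_props):
        name = argout[pos]
        n_values = int(argout[pos + 1])
        pos += 2
        # Assumption: exactly n_values elements follow, with no padding.
        props[name] = list(argout[pos:pos + n_values])
        pos += n_values
    return class_name, props


# Hypothetical reply: one scalar property, one array property, one undefined.
reply = [
    "G", "3",
    "doc_url", "1", "http://example.invalid/doc",
    "tags", "2", "a", "b",
    "cvs_tag", "0",
]
class_name, props = parse_class_property_reply(reply)
```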

bourtemb commented 3 years ago

If I understand the code correctly, at the line https://github.com/tango-controls/cppTango/blob/ef0c2be4a3223cbdbcd9521fc66eb8d7363cca87/cppapi/client/filedatabase.cpp#L1647 we are in the case where a property name provided as argin does not exist in the class provided in Argin[0].

If we refer to the equivalent code in the TangoDatabase device server, I would say that https://github.com/tango-controls/cppTango/blob/ef0c2be4a3223cbdbcd9521fc66eb8d7363cca87/cppapi/client/filedatabase.cpp#L1647 corresponds to https://github.com/tango-controls/TangoDatabase/blob/Database-Release-5.16/DataBase.cpp#L2853

So here we want to put the name of the property provided by the user that we didn't find in the specified class. So (*data_in)[j].in() in this case, which is the name we were comparing with the different class properties of the wanted class at line 1622 just above: https://github.com/tango-controls/cppTango/blob/ef0c2be4a3223cbdbcd9521fc66eb8d7363cca87/cppapi/client/filedatabase.cpp#L1622
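
The reasoning above can be sketched as a simplified Python model of the lookup. This is not the cppTango source. It assumes that i is the position of the matching class in the server's class list (which would explain why the failure appears with the seventh class), that j walks the requested property names, and that a missing property is reported as its name followed by "0". The function name and the use_fix flag are made up for the example.

```python
def get_class_property(classes, data_in, use_fix):
    """Simplified model of the lookup; classes is [(name, {prop: [values]})]."""
    data_out = [data_in[0], str(len(data_in) - 1)]
    for i, (name, props) in enumerate(classes):
        if name.lower() != data_in[0].lower():
            continue
        for j in range(1, len(data_in)):
            wanted = data_in[j]
            match = next((k for k in props if k.lower() == wanted.lower()), None)
            if match is not None:
                values = props[match]
                data_out += [match, str(len(values)), *values]
            else:
                # Property not defined for this class: echo the requested name.
                # The buggy variant indexes with the class position instead.
                missing = data_in[j] if use_fix else data_in[i + 1]
                data_out += [missing, "0"]
        break
    return data_out


PROPS = ["doc_url", "cvs_tag", "cvs_location",
         "AllowedAccessCmd", "svn_tag", "svn_location"]
classes = [(name, {}) for name in "ABCDEFG"]  # seven classes, no properties

fixed = get_class_property(classes, ["G"] + PROPS, use_fix=True)
sixth = get_class_property(classes, ["F"] + PROPS, use_fix=False)
try:
    get_class_property(classes, ["G"] + PROPS, use_fix=False)
    seventh_class_fails = False
except IndexError:
    seventh_class_fails = True
```

Under these assumptions the buggy variant does not crash for the first six classes but echoes the wrong property name (for class F every missing property is reported as svn_location), and it only runs past the end of data_in for the seventh class. With j, each missing property is reported under the name that was actually requested.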