labgrid-project / labgrid

Embedded systems control library for development, testing and installation
https://labgrid.readthedocs.io/

Multiple different power resources in one place #1340

Closed sessl3r closed 5 months ago

sessl3r commented 5 months ago

We see an issue when one place has multiple power resources that use different resource types (in our case PDUDaemonPort and TasmotaPowerPort). Example config:

test:
  power1:
    cls: PDUDaemonPort
    host: hostname
    pdu: pduname
    index: 0
  power2:
    cls: TasmotaPowerPort
    host: hostname
    status_topic: 'stat/powerstrip/POWER2'
    power_topic: 'cmnd/powerstrip/POWER2'
    avail_topic: 'tele/powerstrip/LWT'

When using such a place with labgrid-client the output is as follows:

$ labgrid-client -d -p test power get --name power2
  DEBUG: Starting session with "ws://test:20408/ws", realm: "realm1"
  DEBUG: expanded remote resources for place test: [PDUDaemonPort(target=Target(name='test', env=None), name='power1', state=<BindingState.bound: 1>, avail=True, host='host', pdu='pdu', index=0), NetworkService(target=Target(name='test', env=None), name='NetworkService', state=<BindingState.bound: 1>, avail=True, address='test.mle', username='test', password='test', port=22), TasmotaPowerPort(target=Target(name='test', env=None), name='power2', state=<BindingState.bound: 1>, avail=True, timeout=30.0, host='host', avail_topic='tele/powerstrip-1/LWT', power_topic='cmnd/powerstrip-1/POWER2', status_topic='stat/powerstrip-1/POWER2'), NetworkQuartusUSBJTAG(target=Target(name='test', env=None), name='jtag-test', state=<BindingState.bound: 1>, avail=True, timeout=10.0, host='test', jtagd_password='none', jtagd_port='3122', jtagd_cmd='quartus-v23.4.sh jtagd', device_name='Arrow-USB-Blaster', device_port='')]
Traceback (most recent call last):
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/remote/client.py", line 756, in power
    drv = target.get_driver("PowerProtocol", name=name)
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/target.py", line 230, in get_driver
    return self._get_driver(cls, name=name, activate=activate)
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/target.py", line 183, in _get_driver
    raise NoDriverFoundError(
labgrid.exceptions.NoDriverFoundError: no <class 'labgrid.protocol.powerprotocol.PowerProtocol'> driver named 'power2' found in Target(name='test', env=None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/remote/client.py", line 723, in _get_driver_or_new
    return target.get_driver(cls, name=name, activate=activate)
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/target.py", line 230, in get_driver
    return self._get_driver(cls, name=name, activate=activate)
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/target.py", line 183, in _get_driver
    raise NoDriverFoundError(
labgrid.exceptions.NoDriverFoundError: no <class 'labgrid.driver.powerdriver.PDUDaemonDriver'> driver named 'power2' found in Target(name='test', env=None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/remote/client.py", line 2058, in main
    args.func(session)
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/remote/client.py", line 766, in power
    drv = self._get_driver_or_new(target, "PDUDaemonDriver", name=name)
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/remote/client.py", line 737, in _get_driver_or_new
    drv = cls(target, name=name)
  File "<attrs generated init labgrid.driver.powerdriver.PDUDaemonDriver>", line 9, in __init__
    self.__attrs_post_init__()
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/driver/powerdriver.py", line 380, in __attrs_post_init__
    super().__attrs_post_init__()
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/driver/common.py", line 24, in __attrs_post_init__
    super().__attrs_post_init__()
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/binding.py", line 55, in __attrs_post_init__
    target.bind(self)
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/target.py", line 423, in bind
    return self.bind_driver(bindable)
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/target.py", line 372, in bind_driver
    raise errors[0]
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/target.py", line 358, in bind_driver
    self.get_resource(requirement, name=supplier_name, wait_avail=False),
  File "$$$/g/venv/lib/python3.10/site-packages/labgrid/target.py", line 142, in get_resource
    raise NoResourceFoundError(
labgrid.exceptions.NoResourceFoundError: no <class 'labgrid.resource.power.PDUDaemonPort'> resource named 'power2' found in Target(name='test', env=None), matching resources with other names: ['power1']
This may be caused by disconnected exporter or wrong match entries.
You can use the 'show' command to review all matching resources.

We tracked this down to _get_driver_or_new in the client, which raises. Just hacking in the following diff solves the problem:

diff --git a/labgrid/remote/client.py b/labgrid/remote/client.py
index f9a444e..e889732 100755
--- a/labgrid/remote/client.py
+++ b/labgrid/remote/client.py
@@ -733,7 +733,10 @@ class ClientSession(ApplicationSession):
                 except ValueError:
                     raise NotImplementedError("Multiple bindings not implemented for named resources")

-            drv = cls(target, name=name)
+            try:
+                drv = cls(target, name=name)
+            except:
+                return None
             if activate:
                 target.activate(drv)
             return drv

Emantor commented 5 months ago

Please post the output of labgrid-client show for the place you are using.

sessl3r commented 5 months ago

A few notes that may be obvious: in all the printouts I have replaced place names, hostnames, users, passwords, etc., as I have not created a minimal place. Please also ignore the JTAG resource; it is not an upstream resource.

My impression is that client.py goes through all resources of the target and first finds the PDUDaemonPort resource, which it recognizes as a power resource. However, it then raises an exception because the name does not match, instead of continuing with the other resources, in this case the TasmotaPowerPort with the correct name.

$ labgrid-client -p test show
Place 'test':
  matches:
    */test/*
  acquired: host/tobias
  acquired resources:
    otherhost/test/NetworkQuartusUSBJTAG/jtag-board
    otherhost/test/NetworkService/NetworkService
    otherhost/test/PDUDaemonPort/power1
    otherhost/test/TasmotaPowerPort/power2
  created: 2024-03-13 07:50:49.199185
  changed: 2024-03-13 09:32:48.789115
Acquired resource 'power1' (otherhost/test/PDUDaemonPort/power1):
  {'acquired': 'test',
   'avail': True,
   'cls': 'PDUDaemonPort',
   'params': {'host': 'broker', 'index': 0, 'pdu': 'power1'}}
Acquired resource 'NetworkService' (otherhost/test/NetworkService/NetworkService):
  {'acquired': 'test',
   'avail': True,
   'cls': 'NetworkService',
   'params': {'address': 'hostname',
              'extra': {'proxy': 'otherhost', 'proxy_required': False},
              'password': 'pass',
              'username': 'user'}}
Acquired resource 'power2' (otherhost/test/TasmotaPowerPort/power2):
  {'acquired': 'test',
   'avail': True,
   'cls': 'TasmotaPowerPort',
   'params': {'avail_topic': 'tele/powerstrip-1/LWT',
              'host': 'broker',
              'power_topic': 'cmnd/powerstrip-1/POWER2',
              'status_topic': 'stat/powerstrip-1/POWER2'}}
Acquired resource 'jtag-board' (otherhost/test/NetworkQuartusUSBJTAG/jtag-board):
  {'acquired': None,
   'avail': True,
   'cls': 'NetworkQuartusUSBJTAG',
   'params': {'device_name': 'Arrow-USB-Blaster',
              'device_port': '',
              'extra': {'proxy': 'otherhost', 'proxy_required': False},
              'host': 'otherhost',
              'jtagd_cmd': 'quartus-v23.4.sh jtagd',
              'jtagd_password': '1234',
              'jtagd_port': '3122'}}

Emantor commented 5 months ago

Please use the add-named-match command to set up different names for the two resources. You can also name one of the resources "default"; this resource will be picked when no explicit name is provided.

Bastian-Krause commented 5 months ago

See also https://labgrid.readthedocs.io/en/latest/man/client.html#adding-named-resources

sessl3r commented 5 months ago

Hm, I don't get it. Why should this be needed when the exporter already assigns names to those resources? The same approach works, e.g., for consoles: when we have multiple, we name them in the exporter config and access them in the client using the --name option.

From the show command it can also be seen the resources are named correctly:

  acquired resources:
    otherhost/test/NetworkQuartusUSBJTAG/jtag-board
    otherhost/test/NetworkService/NetworkService
    otherhost/test/PDUDaemonPort/power1
    otherhost/test/TasmotaPowerPort/power2

Also, the problem is not that labgrid picks one of those; it just never picks the second one (TasmotaPowerPort):

labgrid-client -p test power get # this will return status of power1 = PDUDaemonPort
labgrid-client -p test power get --name power1 # this will return status of power1 = PDUDaemonPort
labgrid-client -p test power get --name power2 # this raises the described Exception

Emantor commented 5 months ago

Which version of labgrid are you running?

sessl3r commented 5 months ago

v23.0.5 with some custom resources/drivers (jtag). I can hopefully do a clean 23.0.5 test next week, if I find some time given the upcoming EW24...

sessl3r commented 5 months ago

Retested with plain v23.0.5 for exporter and client. Same behaviour. The issue only occurs when the two power resources use different driver types. If the same driver is used for both resources, everything works as expected, which makes sense in light of my workaround.
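
To make the suspected failure mode concrete, here is a minimal Python sketch of the behaviour described in this thread. It is a toy model, not labgrid's actual code: `Resource`, `DRIVER_FOR`, `bind`, `pick_first_power_resource` and `pick_named_power_resource` are hypothetical names invented for illustration, and the resource-to-driver mapping is an assumption. It only shows why choosing the driver from the first power resource, without checking its name, fails for `power2` when the two resources need different drivers.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-in; this is not labgrid's real resource class.
@dataclass
class Resource:
    cls: str
    name: str


# Assumed mapping from a power resource class to the driver that binds to it.
DRIVER_FOR = {
    "PDUDaemonPort": "PDUDaemonDriver",
    "TasmotaPowerPort": "TasmotaPowerDriver",
}


class NoResourceFoundError(Exception):
    """Raised when a driver finds no resource of its class with the given name."""


def bind(resources, driver, name):
    """Bind `driver` to a resource of its class, optionally matching `name`."""
    wanted_cls = next(c for c, d in DRIVER_FOR.items() if d == driver)
    for res in resources:
        if res.cls == wanted_cls and (name is None or res.name == name):
            return (driver, res.name)
    raise NoResourceFoundError(f"no {wanted_cls} resource named {name!r}")


def pick_first_power_resource(resources, name):
    """Suspected behaviour: the first power resource decides the driver."""
    for res in resources:
        if res.cls in DRIVER_FOR:
            return bind(resources, DRIVER_FOR[res.cls], name)
    return None


def pick_named_power_resource(resources, name):
    """Alternative: skip resources whose name differs from the requested one."""
    for res in resources:
        if name is not None and res.name != name:
            continue
        if res.cls in DRIVER_FOR:
            return bind(resources, DRIVER_FOR[res.cls], name)
    return None


# The place from this issue, reduced to class and name.
place = [
    Resource("PDUDaemonPort", "power1"),
    Resource("NetworkService", "NetworkService"),
    Resource("TasmotaPowerPort", "power2"),
]
```

In this model, `pick_first_power_resource(place, "power2")` raises `NoResourceFoundError`, because the PDUDaemonPort is seen first and its driver then looks for a PDUDaemonPort named `power2`. If both resources were PDUDaemonPort, the same call would succeed, which matches the observation above. `pick_named_power_resource` skips `power1` and reaches the TasmotaPowerPort.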

Emantor commented 5 months ago

I tried to reproduce this on master. I think the reason this is broken on stable is that a backport of ac560620ba4c4d4ab276a551ae623c2707af1b5a is missing. Can you test on master?

sessl3r commented 5 months ago

Yes. On master the issue seems to be solved. Great, so I will just build on master instead of stable. Thanks a lot.