Exposure as Transport - Githubissues

smar000 commented 3 years ago

Hi @zxdavb

Not an issue, but I'm the author of evogateway/listener - you may recall we had a brief chat a few months back in the automatedhome forum. I was wondering if you've had any further thoughts on exposing your library as a transport. No urgency!

Many thanks.

smar000 commented 3 years ago

Just pushed to master branch - you may want to use that for a while - or bleeding_edge - up to you.

Thanks for the headsup.

You're using a different mechanism to specify an allow list, but these notes will still be useful to read: https://github.com/zxdavb/evohome_cc/wiki/Tips-for-the-configuration.yaml-file#allow-list

Thanks, that's helpful.

On a seperate matter, I was trying out the async_send_cmd and got the following timeout:

Traceback (most recent call last):
  File "evorf.py", line 550, in mqtt_process_msg
    resp = asyncio.run(GWY.async_send_cmd(gw_cmd))
  File "/usr/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.7/dist-packages/evohome_rf-0.5.6-py3.7.egg/evohome_rf/__init__.py", line 308, in async_send_cmd
    return await self.msg_protocol.send_data(cmd, awaitable=awaitable, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/evohome_rf-0.5.6-py3.7.egg/evohome_rf/protocol.py", line 426, in send_data
    result = await awaitable(timeout=4)  # may: raise TimeoutError
  File "/usr/local/lib/python3.7/dist-packages/evohome_rf-0.5.6-py3.7.egg/evohome_rf/protocol.py", line 51, in getter
    raise TimeoutError
TimeoutError

As this is my first foray into asyncio, is the timeout due to the way I'm calling the coroutine?

zxdavb commented 3 years ago

YMMV with 3.7 - should be OK, but I'm developing on 3.8

The TimeoutError simply says that it didn't get the response is was hoping for - it can't wait forever, so only waited 4 seconds & then gave up:

result = await awaitable(timeout=4)

The awaitable() is this code (i.e. is getter(), earlier up in protocol.py:

        async def getter(timeout=None) -> Tuple:
            dt_expired = dt.now() + td(seconds=3 if timeout is None else timeout)
            while dt.now() < dt_expired:
                try:
                    self._result = self._queue.get_nowait()
                except Empty:
                    await asyncio.sleep(0.005)
                else:
                    return self._result
            raise TimeoutError

What parameters did you pass to:

self.msg_protocol.send_data(cmd, awaitable=awaitable, **kwargs)

Specifically, what was the cmd, and did you use any kwargs?

smar000 commented 3 years ago

What parameters did you pass to:
self.msg_protocol.send_data(cmd, awaitable=awaitable, **kwargs)
Specifically, what was the cmd, and did you use any kwargs?

cmd is RQ --- 18:000730 01:139901 --:------ 313F 001 00, and I did not pass any kwargs. The same cmd works perfectly on the synchronous send_cmd.

Noted on the python version you are developing on. I will switch to the same version shortly.

zxdavb commented 3 years ago

How you getting on - I'm making various changes - I hope not breaking too much for you.

smar000 commented 3 years ago

Getting on quite well actually - have a working system where I've replaced all my legacy encoding/decoding with your framework. I'll probably upload it to github in the next day or so for anyone that wants to play with it.

Really like what you've done, and look forward to seeing your changes each time you commit!

smar000 commented 3 years ago

BTW, did you get a chance to look at that logging.basicConfig in packet.py?

zxdavb commented 3 years ago

looking now

zxdavb commented 3 years ago

OK, I'll move it to my client.py - the obvious place.

Really like what you've done, and look forward to seeing your changes each time you commit!

If you have any more suggestions, just ask!

smar000 commented 3 years ago

If you have any more suggestions, just ask!

Thanks and will do!

zxdavb commented 3 years ago

@smar000 I have made some change to schema, etc - likely will break your code - sorry.

Of course, you have a PiPy package you're using - so stick with that, but there'll be many bugs / new features coming too - it just may be worth hanging on before you jump ship to a later version of eh package...

smar000 commented 3 years ago

@smar000 I have made some change to schema, etc - likely will break your code - sorry.

Of course, you have a PiPy package you're using - so stick with that, but there'll be many bugs / new features coming too - it just may be worth hanging on before you jump ship to a later version of eh package...

Thanks for the heads up. I will leave off installing newer builds for the moment then.

smar000 commented 3 years ago

Hi @zxdavb

Hope all is well at your end.

I've been working on updating my system to use your latest build and in particular getting my send commands integrated in. You have clearly been very busy with the whole framework :)

One question/request - is there a callback for when sending commands fail? I am seeing log messages such as the following, but was looking for same way to trap so that they can be published to the mqtt broker:

PktProtocolQos.send_data(RQ|01:139901|0418|00): boff=1, want=RQ|01:139901|0418|00, tout=2021-05-19 16:29:07.773286: RE-SENT (1/3)
PktProtocolQos.send_data(RQ|01:139901|0418|00): boff=2, want=RQ|01:139901|0418|00, tout=2021-05-19 16:29:07.980920: RE-SENT (2/3)
PktProtocolQos.send_data(RQ|01:139901|0418|00): boff=3, want=RQ|01:139901|0418|00, tout=2021-05-19 16:29:08.391702: RE-SENT (3/3)
PktProtocolQos.send_data(RQ|01:139901|0418|00): boff=3, want=RQ|01:139901|0418|00, tout=2021-05-19 16:29:08.391702: EXPIRED

Thanks!

zxdavb commented 3 years ago

Welcome back!

In init.py, there is a send_cmd() that can wrap any Command into a callback, and an async_send_cmd() that can wrap any Command as an awaitable (by using a callback internally).

Let me know if they don't work.

smar000 commented 3 years ago

Thanks for the quick reply!

I am in fact using that very same send_cmd() with a callback. The callback appears to be only called when the send is successful (and this is working consistently), but doesn't appear to be called if the send is Expired as in the above log. Is this expected behaviour?

zxdavb commented 3 years ago

Is this expected behaviour?

No, if the callback expires, it should return False.

zxdavb commented 3 years ago

for (hdr, callback ) in self._callbacks.items():  # 1st, notify all expired callbacks
    if callback.get(EXPIRES, dt.max) < pkt._dtm:
        _LOGGER.error("MsgTransport._pkt_receiver(%s): Expired callback", hdr)
        callback[FUNC](False, *callback.get(ARGS, tuple()))

Let me know what Command you're calling? & I'll take a look.

smar000 commented 3 years ago

Is this expected behaviour?

No, if the callback expires, it should return False.

I haven't been seeing this. The callback doesn't seem to have been called on any of my expired commands that I could see.

Let me know what Command you're calling? & I'll take a look.

It isn't any specific Command, though more often with the more "complex" commands such as set_dhw_mode with an until. On occasion, some commands fail and have to be resent - I think it may be due to the distance between my arduino and the Controller (they are in different rooms). This doesn't always happen, but happens often enough that it would be helpful to know. In my earlier log paste, the expired command was actually a simple get_system_log_entry.

EDIT: I have just now tried sending the command {"command": "get_system_log_entry", "kwargs" : {"log_idx": 0}} a few times. Almost all of them called the callback with the correct log entry. However, after repeated send attempts, one of them failed and showed the PktProtocolQos.send_data re-send/expired error, but as before did not call the callback (which is a simple print statement showing the callback response print(f"=================> rp_callback: {msg}") :

2021-05-20 06:45:13 |   | MQTT                  ->                       |  |                 | Received MQTT message: {"command": "get_system_log_entry", "kwargs" : {"log_idx": 0}}
PktProtocolQos.send_data(RQ|01:139901|0418|00): boff=2, want=RQ|01:139901|0418|00, tout=2021-05-20 06:45:13.905698: RE-SENT (1/3)
PktProtocolQos.send_data(RQ|01:139901|0418|00): boff=3, want=RQ|01:139901|0418|00, tout=2021-05-20 06:45:14.754533: RE-SENT (2/3)
2021-05-20 06:45:14 |000| HGI 18:134436         -> CTL Controller        |RQ| system_fault    | REQUEST: log_idx: 00
PktProtocolQos.send_data(RQ|01:139901|0418|00): boff=3, want=RQ|01:139901|0418|00, tout=2021-05-20 06:45:15.197516: RE-SENT (3/3)
2021-05-20 06:45:14 |000| HGI 18:134436         -> CTL Controller        |RQ| system_fault    | REQUEST: log_idx: 00
PktProtocolQos.send_data(RQ|01:139901|0418|00): boff=3, want=RP|01:139901|0418|00, tout=2021-05-20 06:45:15.226843: EXPIRED
2021-05-20 06:45:22 |053| OTB OpenTherm Bridge  ->                       | I| opentherm_sync  | ticker: 37662

Hope this helps.

On a separate note, do you know if it is possible to get an actuator to report its state (3EF0) by sending it an appropriate RQ command?

zxdavb commented 3 years ago

OK, will have a look.

smar000 commented 3 years ago

A simple way to test is to send a command to a device that does not respond back, e.g, if I create a Command with the following arguments (code="3EF0", payload="00", dest_id="13:133904", verb="RQ"), this always resends/expires.

zxdavb commented 3 years ago

OK, I found the bugs (x2) - it works now, but is not ideal - tests only when a packet arrives - if that packet is 15 seconds after the expiry, you have to wait 15 seconds. Below, the packet expired at 21:36:19.195, but the notice wasn't sent until immeditaely before the next packet at 21:36:42.583:

2021-05-20T21:36:15.727873  # evofw3 0.7.0
21:36:15.757 || HGI:006402 | NUL:------ |  I | puzzle_packet    | 01763... || {'ramses_rf': 'v0.9.7'}
21:36:18.766 || HGI:006402 | CTL:145039 | RQ | zone_name        | 0000     || {'zone_idx': '00'}
21:36:18.828 PktProtocolQos.send_data(RQ|01:145039|0004|00): boff=1, want=RQ|01:145039|0004|00, tout=2021-05-20 21:36:18.928404: RE-SENT (1/2)
21:36:18.850 || HGI:006402 | CTL:145039 | RQ | zone_name        | 0000     || {'zone_idx': '00'}
21:36:18.961 PktProtocolQos.send_data(RQ|01:145039|0004|00): boff=2, want=RQ|01:145039|0004|00, tout=2021-05-20 21:36:19.161328: RE-SENT (2/2)
21:36:18.982 || HGI:006402 | CTL:145039 | RQ | zone_name        | 0000     || {'zone_idx': '00'}
21:36:19.195 PktProtocolQos.send_data(RQ|01:145039|0004|00): boff=2, want=RP|01:145039|0004|00, tout=2021-05-20 21:36:19.183045: EXPIRED
21:36:42.583 MsgTransport._pkt_receiver(RP|01:145039|0004|00): Expired callback
Callback has expired
21:36:42.583 || STA:205645 |            |  I | temperature      | 00077D   || {'temperature': 19.17}

I will fix it presently.

zxdavb commented 3 years ago

Try 0.9.7 - only the above issue remains.

smar000 commented 3 years ago

Try 0.9.7 - only the above issue remains.

Working! Thanks again for the quick turn-around.

zxdavb commented 3 years ago

I have pushed an update to master that will make callbacks for failed Command much faster. Something like:


2021-05-20T21:36:15.727873  # evofw3 0.7.0
21:36:15.757 || HGI:006402 | NUL:------ |  I | puzzle_packet    | 01763... || {'ramses_rf': 'v0.9.7'}
21:36:18.766 || HGI:006402 | CTL:145039 | RQ | zone_name        | 0000     || {'zone_idx': '00'}
21:36:18.828 PktProtocolQos.send_data(RQ|01:145039|0004|00): boff=1, want=RQ|01:145039|0004|00, tout=2021-05-20 21:36:18.928404: RE-SENT (1/2)
21:36:18.850 || HGI:006402 | CTL:145039 | RQ | zone_name        | 0000     || {'zone_idx': '00'}
21:36:18.961 PktProtocolQos.send_data(RQ|01:145039|0004|00): boff=2, want=RQ|01:145039|0004|00, tout=2021-05-20 21:36:19.161328: RE-SENT (2/2)
21:36:18.982 || HGI:006402 | CTL:145039 | RQ | zone_name        | 0000     || {'zone_idx': '00'}
21:36:19.195 PktProtocolQos.send_data(RQ|01:145039|0004|00): boff=2, want=RP|01:145039|0004|00, tout=2021-05-20 21:36:19.183045: EXPIRED
21:36:19.195 PktProtocolQos.send_data(RQ|01:145039|0004|00): Expired callback
Callback has expired
21:36:42.583 || STA:205645 |            |  I | temperature      | 00077D   || {'temperature': 19.17}

smar000 commented 3 years ago

Hi again @zxdavb

One thing to clarify - is it still the recommendation that the schema should only contain the controller ID and allow_list, and leave everything else for auto-discovery?

zxdavb commented 3 years ago

Yes, Controller ID only, and an allow_list - there are a few exceptions:

OTB can't be discovered, only specified in schema or eavesdropped
some TRVs, 00:, can't be discovered (so, same as above)

Eavesdropping is not generally recommended, but is needed to confirm a schema.

smar000 commented 3 years ago

Yes, Controller ID only, and an allow_list - there are a few exceptions:

OTB can't be discovered, only specified in schema or eavesdropped

some TRVs, 00:, can't be discovered (so, same as above)

Eavesdropping is not generally recommended, but is needed to confirm a schema.

Ok thanks and noted.

zxdavb commented 3 years ago

Yes - may need allow_list off & eavesdropping on to get full schema - then make a schema, use an allow_list & turn off eavesdropping for ongoing, until any changes are made at some future time.

TBH, you can manage without a full schema in most circumstances, but an allow_list is virtually required.

I use the allow list in ramses_rf, but also in evohome_cc too - that it, you may want to consider using it in evohome_listener.

smar000 commented 3 years ago

Thanks and that makes sense. I will look to do it the same way.

TBH, you can manage without a full schema in most circumstances, but an allow_list is virtually required.

For my own understanding, the allow_list is required to prevent messages appearing from (a) neighbouring evohome setups and (b) messages getting garbled in reception thus appearing to be from incorrect device ids?

zxdavb commented 3 years ago

Yes (to both).

The big issue is that valid packets can - as you say - have corrupted (but still valid) devices addresses.

There are a lot of controls in place for this, but ultimately an allow list is required.

smar000 commented 3 years ago

Understood! Thanks.

smar000 commented 3 years ago

@zxdavb Looking at client.py, is it sufficient to pass the --discover argument in order to enable eavesdropping for the full schema?

Also when should we use the SCAN_DISC, SCAN_FULL, SCAN_HARDand SCAN_XXXX?

Thanks.

zxdavb commented 3 years ago

(not sure if I understand your question 100%, so here goes)

Also when should we use the SCAN_DISC, SCAN_FULL, SCAN_HARD and SCAN_XXXX?

These are used for probing new device types or for development - not for production - discovery is a separate thing, and is built into the library.

The client.py tool's main reason for being is to: a) perform ad-hoc stuff / learn about the protocol (e.g. SCAN_FULL), or to demonstrate how the library can be used (e.g. evohome-listener, or evohome_cc). I see you've pulled some ideas from client.py (which the idea), but you may also want to look at evohomecc\__init_\.py (although it is a little more obtuse).

All the CLI switches are a 'cheat' - all that matters is passing the config/schema/device_list kwargs to the library - you can get those parameters from any source you like - from CLI/yaml/json/etc. - or a config file in your case.

Generally, eavesdropping is to be avoided, especially without an allow_list (although it may be the only way to get your allow_list, but don't eavesdrop after that.)

I use voluptuous roughly the same way you use configparser, and so see my schema.py:

CONFIG_SCHEMA = vol.Schema(
    {
        vol.Optional(DISABLE_DISCOVERY, default=False): bool,
        vol.Optional(DISABLE_SENDING, default=False): bool,
        vol.Optional(ENABLE_EAVESDROP, default=False): bool,
        vol.Optional(REDUCE_PROCESSING, default=0): vol.All(
            int, vol.Range(min=0, max=DONT_CREATE_MESSAGES)
        ),
        vol.Optional(MAX_ZONES, default=DEFAULT_MAX_ZONES): vol.All(
            int, vol.Range(min=1, max=16)
        ),
        vol.Optional(USE_SCHEMA, default=True): vol.Any(None, bool),
        vol.Optional(ENFORCE_ALLOWLIST, default=None): vol.Any(None, bool),
        vol.Optional(ENFORCE_BLOCKLIST, default=None): vol.Any(None, bool),
        vol.Optional(USE_NAMES, default=None): vol.Any(None, bool),
        vol.Optional(EVOFW_FLAG, default=None): vol.Any(None, str),
    },
    extra=vol.ALLOW_EXTRA,  # TODO: remove for production
)

There is one thing that ramses_rf needs to do, and does not yet do well - that is periodically re-discover...

This would be most useful or OTBs (R8810As) - I will ad this soon enough.

HTH, ask again if not.

smar000 commented 3 years ago

Great, that clears up quite a bit. Will also have a look at your evohome_cc__init.py__.

Thanks again.

smar000 commented 3 years ago

Hi David

I noticed that zone names are available in .evo.params but not in the schema's zone collection. Should /can these param zone names also be loaded into the framework in some way at start up, to save having to wait for the system to populate itself from eavesdropping (which can take some time)?

zxdavb commented 3 years ago

I am happy with the split between schema (effectively static, unless bindings are destroyed/created), params (configurable), and status (variable, but read-only) - the only possible

I am happy to push in (impose) a (hand-crafted) schema, only because it is so static - doing so with params would be problematic, and inappropriate with status...

... however, the latter two can be managed with _get_state(), _clear_state() and _set_state() - have a look at those - although they are currently private methods, they will eventually become canon.

There are two bugs presently affecting _set_state():

some packets are being expired too early
packets are not being refreshed when they do expire

HTH

zxdavb commented 3 years ago

BTW, you make a good point - eavesdropping of schema if problematic & disabled by default, but eavesdropping of params/state is normal practice.

smar000 commented 3 years ago

however, the latter two can be managed with _get_state(), _clear_state() and _set_state() - have a look at those - although they are currently private methods, they will eventually become canon.

Thanks, will check these out. Noted on the bugs.

smar000 commented 3 years ago

but eavesdropping of params/state is normal practice.

Ok!

smar000 commented 3 years ago

Hi David

Thanks for all your help (and patience!) throughout the migration. I am almost there I think. One final (hopefully!) question for now WRT the allow_list. Is this supposed to only permit exactly those devices listed or will it automatically allow any "18:" HGI device types through?

I ask as I have a number of spurious messages purporting to be from HGI devices (e.g. I am seeing messages from the following HGI IDs now as I type: 18:000030, 18:000070, 18:007030, 18:056026, 18:136712; these are all in addition to my actual ID of 18:000730).

My allow_list (as shown at initialisation) appears to be correctly configured and active and ENFORCE_ALLOWLIST also set to true:

Using an allow_list: ['01:139901', '02:043392', '04:000868', '04:001048', '04:143922', '04:143924', '04:143926', '04:143928', '04:227936', '04:227952', '04:227954', '04:228032', '04:228042', '04:228044', '04:228048', '04:228094', '04:228096', '04:228104', '07:033469', '07:043555', '10:142568', '13:230917', '13:102710', '13:133904', '13:127198', '34:015243', '34:112193', '34:214769', '34:231315', '18:000730']

Anyway, thanks again!

smar000 commented 3 years ago

Sorry, yet another query/clarification. I noticed that zone/dhw mode "until" datetimes are formatted (in the helper.) without the T separating date and time.

Should the T be included as part of the isoformat (e.g. https://en.wikipedia.org/wiki/ISO_8601)?

smar000 commented 3 years ago

Hi again @zxdavb

One further query - I noticed that you have the default HGI device address in the consts.py file as 18:000730. However, the address of my nanoCUL appears to be 18:136712. I don't see this address in the eavesdropped schema. Is there any way to identify the nanoCUL by the framework?

e.g. some messages from my log file:

2021-06-02 07:19:17,441 [299] || HGI:136712         | OpenTherm Bridge   | RQ | opentherm_msg    | 00000... || {'msg_id': '0x03', 'msg_name': 'SlaveConfigFlags', 'msg_type': 'Read-Data', 'description': 'Slave configuration'}
2021-06-02 07:19:17,760 [299] || HGI:136712         | Controller         | RQ | zone_devices     | 0208     || {'zone_idx': '02', 'device_class': 'rad_actuators'}
2021-06-02 07:19:17,776 [299] || Controller         | HGI:136712         | RP | zone_devices     | 02080... || {'zone_idx': '02', 'device_class': 'rad_actuators', 'devices': ['04:228042']}
2021-06-02 07:19:17,856 [299] || HGI:136712         | DHW Relay          | RQ | tpi_params       | 00       || {}
2021-06-02 07:19:17,873 [299] || DHW Relay          | HGI:136712         | RP | tpi_params       | 00300... || {'cycle_rate': 12, 'min_on_time': 1.0, 'min_off_time': 0.0, '_unknown_0': '00', 'proportional_band_width': None, '_unknown_1': '01'}

zxdavb commented 3 years ago

Currently. 18: devices are never filtered - this causes issues, and needs to change.

Because you don't know the HGI80s address before you send a packet, you use 18:000730, and it 'translates' it to it's actual device address, 18:136712 in your case.

Is there any way to identify the nanoCUL by the framework?

This is planned.

Sorry, yet another query/clarification. I noticed that zone/dhw mode "until" datetimes are formatted (in the helper.) without the T separating date and time.

I am willing to take a view on this (the issue is whether it will be a breaking change), but is is kind of 'optional', see: https://docs.python.org/3/library/datetime.html#datetime.datetime.fromisoformat

smar000 commented 3 years ago

Currently. 18: devices are never filtered - this causes issues, and needs to change.

Because you don't know the HGI80s address before you send a packet, you use 18:000730, and it 'translates' it to it's actual device address, 18:136712 in your case.

Understood!

I am willing to take a view on this (the issue is whether it will be a breaking change), but is is kind of 'optional'

Ah ok, I wasn't aware that it was optional. It's not that much of a problem, but openHAB's mqtt binding requires the "T" out of the box. Without it, additional formatting steps need to be carried out which has to be repeated for each datetime item etc. For now, I've added a small conversion before publishing to the mqtt broker. Maybe make it optional via config If you do re-consider it in the future?

zxdavb commented 3 years ago

Maybe make it optional via config If you do re-consider it in the future?

Making it a config option is not useful - I'll change the format if I can.

smar000 commented 3 years ago

Making it a config option is not useful - I'll change the format if I can.

Thanks. As I said, not a significant issue from my side.

Anyway, my migration to your framework seems to be complete for now, and so I will close this thread. Sincere thanks once again for all your support throughout.

zxdavb / ramses_rf

Exposure as Transport #15