kbr / fritzconnection

Python-Tool to communicate with the AVM Fritz!Box by the TR-064 protocol and the AHA-HTTP-Interface
MIT License
304 stars 59 forks source link

Add FritzCallMonitor class to fritzconnection #76

Closed springstan closed 2 years ago

springstan commented 3 years ago

While working on https://github.com/home-assistant/core/pull/40736 I was asked to extract this class (here) that uses the FritzBox call monitor feature.

The Fritz!Box call monitor feature can be activated by calling #96*5* on any phone connected to it. FYI Fritz!Box call monitor feature can be disabled by calling #96*4* on any phone connected to it.

FritzCallMonitor is a class that connects to any Fritz!Box via a given IP address and the given TCP Port for the call monitor feature (default is 1012). Furthermore, a callback function can be given that is called if the state of the call monitor changes (started, finished, incoming and outgoing calls). This is the basis for the AVM FRITZ!Box Call Monitor integration in Home Assistant which uses this information to update the state and attributes of a sensor entity.

# Example usage:

class MyClass:
    def my_callback(self, data):
        print(data)

my_class = MyClass()

monitor = FritzBoxCallMonitor(address="192.168.178.1", callback=my_class.my_callback)
monitor.connect()

This is the original code for the class that should be extracted into the used Python package, because it establishes and communicates with a device according to Home Assistant development checklist.

kbr commented 3 years ago

That's an interesting pull request. At first it is not based on fritzconnection, but can enhance the library offering an additional way to communicate with another service on the same device.

Browsing through the code I'm not sure that the socket communication has a stable implementation: so far the code is hoping to receive the (line based?) response in a single chunk. If this is not the case the parsing may not work as expected risking a malfunction elsewhere. Also in case the ThreadingEvent is set to True there is no explicit clean up of the resources. Furthermore the socket seems to be in blocking mode. Why is there a timeout for the socket? For me this implementation may run most of the time but seems to be broken.

Is it intention that the callback runs in the same thread as the socket connection?

Well, these are my first thoughts. May be I'm wrong on one or the other because I don't know the larger project.

springstan commented 3 years ago

At first it is not based on fritzconnection, but can enhance the library offering an additional way to communicate with another service on the same device.

Thank you for your fast reply and considering to include this in fritzconnection. You are absolutely right, that is not based on fritzconnection, but I thought this feature would be beneficial for this project.

Browsing through the code I'm not sure that the socket communication has a stable implementation:

First of I am not the original author of this code. After some quick research I have found in the change notes of this blog post from 2016 that @DavidMStraub is the original author for this Home Assistant integration. So far it has been running very stable as far as I can tell.

so far the code is hoping to receive the (line based?) response in a single chunk. If this is not the case the parsing may not work as expected risking a malfunction elsewhere.

As I said before it seems to be stable for the last 4 years with only 4 opened issues, but of course if you have any suggestions on how to make it better I will include them.

Also in case the ThreadingEvent is set to True there is no explicit clean up of the resources.

The clean up is currently done outside of this class here. Should this be included in the class as well?

Furthermore the socket seems to be in blocking mode. Why is there a timeout for the socket?

I will have to look into that, because I do not know the answer to these questions yet. Maybe @DavidMStraub still remembers.

kbr commented 3 years ago

Have had a deeper look into the code and the timeout makes sense. Including FritzCallMonitor to the library may be confusing, because it does something else. However a monitor class can be part of the core package because the core is about the direct communication with the device. Implementing a service to handle the socket connection by providing a thread-save Queue with the Fritz!Box event data may be a useful feature.

DavidMStraub commented 3 years ago

Interesting!

Yes, I hacked that together a long time ago; I vaguely remember opening an issue when this was still on bitbucket because of an issue with a PyPI release. But I don't claim that I knew what I was doing when it comes to the details :hand_over_mouth:

That being said, I've been using this Home Assistant integration for more than two years to trigger automations on incoming phone calls, and I don't remember it failing me once - which is quite remarkable (surprising, actually!).

Though I wasn't involved in the decision to move this out of Home Assistant, it certainly makes sense as the trend over there is to move all the integration-specific logic to external libraries.

The main question, I guess, is whether you @kbr would be happy with your library having a monitor class that can be used directly by HA or whether you would prefer this to be in a separate library layering in between yours and Home Assistant (this is in fact done for several Home Assistant integrations, too).

In both cases, what I've hacked together can certainly be improved!

kbr commented 3 years ago

@DavidMStraub I have no objection against integration of a monitor class in fritzconnection, because it is – well – a connection to the Fritz!Box :-)

Actually in a separate branch I'm testing a fritzconnection.core.fritzmonitor module so that an import like from fritzconnection.core.fritzmonitor import FritzMonitor may be available in future versions.

Based on the code of this pull request the idea is to provide a service that returns a Queue with the response strings from the call-events reported from the box. Another service (in a separate thread) can use this Queue for event-based handling. So it is also not restricted to a single callback, but can use multiple once – and it is a clean separation of the services. From my point of view a separation can be an advantage, because doing socket programming right can be hard, especially in combination with threading, what doing right is also hard on its own.

By the way, the 0.x versions of fritzconnection are also a bit hacky – until end of last year at a local sprint we decided to move the project to Python 3 leaving Python 2 and lxml behind, what ended up in a complete rewrite.

springstan commented 3 years ago

I have done some more research into this to see if there is any official documentation or a better way to implement socket/threading programming:

So what should be the next steps in this PR? What can I do to improve the code to add this code to fritzconnection whether it is in lib or core itself?

DavidMStraub commented 3 years ago

@DavidMStraub even commented and created a PR in 2016

LOL, I do not remember that at all.

bufemc commented 3 years ago

Sorry, I HAVE TO THROW MY HAT IN HERE ;-) "Eigenlob stinkt" you say in Germany.. but I guess "the best" or say so most stable Fritzbox call monitor around is mine.. it's working on a Raspberry Pi4 since months.. it's a part of my Fritzbox tools set at https://github.com/bufemc/a1fbox - but you might be interested only in:

https://github.com/bufemc/a1fbox/blob/master/a1fbox/callmonitor.py

Why I "exaggerate" that much? Because it took me several rounds to make it really stable, it even survives network outages. As said, runs since months to do call blocking.

Many source codes around just provide some very basic setup for the call monitoring, which will immediately fail if the "socket goes dead" (see link below, it's very common, many people write about it in forums) or the network is unreachable..

kbr commented 3 years ago

@bufemc : springstan already mentions your repo. I'm busy the next fortnight, but then I may be able to care about the topic again.

springstan commented 3 years ago

Hello @kbr, I was wondering if you had time to take a look at https://github.com/bufemc/a1fbox or work on this topic in general?

kbr commented 3 years ago

@springstan : yes, I did. Do you have anything special in mind?

springstan commented 3 years ago

Was just wondering what the current status is and if some of af1box's code can be used here in fritzconnection.

kbr commented 3 years ago

The a1fbox code uses a file-like approach for socket reading that does not work well with timeouts – therefore all the additional platform dependent socket-options to keep the connection. These options can easily added to fritzconnection in case of need. But I decided to use a timeout-based approach (like @DavidMStraub) with adding a buffer, an inter-thread communication queue and a lot of tests.

It's working well so far. Just use it and we will see if there are hidden flaws. That's the current status.

springstan commented 3 years ago

That sounds great! Thanks for the update.

It's working well so far. Just use it and we will see if there are hidden flaws. That's the current status.

Can you estimate when this feature will be released? I would also love to test the code as well, where can I find it?

kbr commented 3 years ago

Just install the latest version (1.4.0), i.e. from PyPi. It's also covert by the documentation, but there may be room for improvement :)

bufemc commented 3 years ago

"therefore all the additional platform dependent socket-options to keep the connection"

Just for explanation: I tried several approaches and here is the issue, why I wrote so much about it in the README:

I might test your approach in the next few days if it really survives this 2-3 hours interval..

kbr commented 3 years ago

@bufemc: I'm on a Mac (OS 10.14) and the monitor is up and running since more than 60 h and still is reactive, even with hours of OS idle state or phone idle state (or both of course). It's not the first time I'm with sockets, but it is very appreciated if you test it on windows or linux and can give some feedback or hints for improvement. It's easy to add more socket options but I would like to avoid this as long as possible to keep the code (hopefully) platform agnostic. My remarks have been no critic on your code as it is just another approach to keep things running.

bufemc commented 3 years ago

Ah, time for a deal: you could checkout my a1fbox, adapt the config.py and run my so called " example2.py " (or directly the callmonitor.py) on your Mac, whether it keeps running and signaling RING events (I think they are printed) after about 3 hours..

And I could run your approach on Windows (10, 64bit) and Linux (Raspberry Pi 4 Raspberry OS 32bit) ,-)

bufemc commented 3 years ago

Okay, it's running here on Windows AND Linux (the latter was complaining about lxml etc - but could solve it), since 16:05, ringing showed the event, let me try again in 3 hours.. about after the news at 19:20..

One small issue: in the example on https://fritzconnection.readthedocs.io/en/1.4.0/sources/call_monitoring.html it's missing on the top

import queue

bufemc commented 3 years ago

Hm, es stellt sich mal wieder Merkwuerdiges heraus:

Es liefen 3 Dinge gleichzeitig und die letzten Signale waren:

Nur zwei haben den RING nach 3 Stunden noch signalisiert, fritzconn auf Win und a1fbox auf Lin. fritzconn auf Lin bleibt die letzten Events schuldig.

Die ersten RINGs, also direkt nach Script-Start haben alle 3 Scripts signalisiert, es liegt also nicht etwa daran dass die sich gegenseitig was abfischen..

Uh, I should write in English, too: I tested the fritzconnection (fc) implementation on Windows and Linux, plus my a1fbox implementation on Linux. My a1fbox on Linux and fritzconn on Windows still signal the latest events (however, it took some extra time until fc printed it?)

I am a little bit surprised too, I would have thought that either both fc scripts don't signal, or only fc on Windows would not work anymore, now it's the opposite. I will run these tests tomorrow again (maybe also with a1fbox on Windows), plus maybe try to add in a fork some kind of timeout handling to see if it solves also "dead end" on Linux.

Update 1: post corrected, first it looked like both fc scripts do not signal, but the one on Windows just took somehow longer to print it..

Update 2: Re-running the test today without changes in your code, this time a1fbox on Lin+Win, fc on Lin+Win

kbr commented 3 years ago

Thank you for testing and the typo hint.

May be that there was some delay because of the 60 sec timeout for reconnecting. Also there is a change for missing events after connection losses.

I've made a commit to the master branch so reconnect starts now with a very short delay and increases the delay between reconnections in case of consecutive failures.

bufemc commented 3 years ago

Believe me or not, but today both fc (NOT with your latest changes) signalled last time: 03.12.20 11:26:48;DISCONNECT;0;0; and both a1fbox signalled last time (parsed): date:03.12.20 time:16:42:16 type:DISCONNECT duration:3457

All 4 scripts are still running. I wonder if I abort one by one if the fc on Windows will again "give me a surprise" and print the events just later? But will wait until about 8pm and then ring again.. and then update this post with final result.

Small update 17:30: fc on Win has aborted with:

03.12.20 11:26:48;DISCONNECT;0;0;
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Python3\lib\threading.py", line 950, in _bootstrap_inner
    self.run()
  File "C:\Python3\lib\threading.py", line 888, in run
    self._target(*self._args, **self._kwargs)
  File "C:\workspace\python\fbox\fritzconnection\fritzconnection\core\fritzmonitor.py", line 214, in _monitor
    raw_data = sock.recv(FRITZ_MONITOR_CHUNK_SIZE)
ConnectionResetError: [WinError 10054] Eine vorhandene Verbindung wurde vom Remotehost geschlossen
Error: fritzmonitor connection failed

I will restart it, ring and then re-ring at about 20:30.. btw for this ring the fc on Linux still don't signal.. I would tend to say that fc on Linux really has a problem. But let's wait for 20:30..

Bad, it also happened (I guess at ~18:00 or later) now for fc on Linux, too:

03.12.20 11:26:48;DISCONNECT;0;0;
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/pi4samba/fritzconnection/fritzconnection/core/fritzmonitor.py", line 214, in _monitor
    raw_data = sock.recv(FRITZ_MONITOR_CHUNK_SIZE)
ConnectionResetError: [Errno 104] Connection reset by peer

Error: fritzmonitor connection failed

Keep in mind this time both DID NOT SIGNAL the last ring events here.. this is different from yesterday.

So I will restart fc on Linux, too, ring.. and then re-ring after about 3 hours.. now it gets about 23:00..

kbr commented 3 years ago

This traceback is valuable information and can be addressed. Never got this on a mac.

bufemc commented 3 years ago

I have to give up earlier than thought.

FC on Windows did it again..

03.12.20 17:28:45;DISCONNECT;0;0;
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Python3\lib\threading.py", line 950, in _bootstrap_inner
    self.run()
  File "C:\Python3\lib\threading.py", line 888, in run
    self._target(*self._args, **self._kwargs)
  File "C:\workspace\python\fbox\fritzconnection\fritzconnection\core\fritzmonitor.py", line 214, in _monitor
    raw_data = sock.recv(FRITZ_MONITOR_CHUNK_SIZE)
ConnectionResetError: [WinError 10054] Eine vorhandene Verbindung wurde vom Remotehost geschlossen
Error: fritzmonitor connection failed

FC on Linux still runs, but shows as last event:

03.12.20 20:21:37;DISCONNECT;0;0;

a1fbox on Win+Lin still runs, and shows as last event (parsed):

date:03.12.20 time:21:58:15 type:DISCONNECT duration:503

In short: this time (could it be because 4 scripts monitored the FB?) the FC scripts quit, and FC on Linux did not deliver the last events. Interestingly (for me at least) even before the 2 hours passed since the last event (21:58 - 20:21).

This time I win the washing machine.. okay, kidding, please don't send..

springstan commented 3 years ago

With https://github.com/kbr/fritzconnection/pull/76/commits/9e7942958c6b60e78d830d2187bbe1e89b0f3241 I have just added two parameters to the start() method of FritzMonitor. FritzMonitor is fine as it is for general use cases but for Home Assistant these two callbacks are needed to make the usage much more flexible. Otherwise I need to create another Thread for processing events which loops indefinitely and calls another function to extract the relevant event data.

Pros:

How I have tested it my additions:

import time
from fritzconnection.core.fritzmonitor import FritzMonitor

class MyClass(object):

    def __init__(self):
        self.monitor = FritzMonitor(address='192.168.178.1')
        self.monitor.start(callback_event=MyClass.printer, callback_reconnect=self.reconnecter)

    @classmethod
    def printer(cls, response):
        print("Call response: " + response.strip())

    def reconnecter(self, sock):
        loopactive = True
        while loopactive:
            success = self.monitor._get_connected_socket(sock)

            if success:
                print("RECONNECTED")
                loopactive = False
            else:
                print("Waiting for reconnection...")
                time.sleep(60)

def main():
    """Entry point: example to use FritzMonitor.
    """
    myobj = MyClass()

if __name__ == "__main__":
    main()

By adding these two parameters I have tried to adjust as few lines of code as possible to ensure that it works with or without supplying these two parameters. @kbr what do you think? I am open to any feedback from your side since it is your library.

kbr commented 3 years ago

@springstan you are right that the present implementation may require another thread for event-handling. That's because fritzmonitor.py encapsulates the complicated stuff of dealing with sockets and threads providing a thread-save queue for event-polling. So an event handler does not share the thread for polling the events from the router. That's a clean separation.

If you need a different api, I would recommend to write an adapter class. This is also much more flexible because you can provide a lot of different callbacks for different events and this approach can lead to a more modular and better testable code.

springstan commented 3 years ago

If you need a different api, I would recommend to write an adapter class. This is also much more flexible because you can provide a lot of different callbacks for different events and this approach can lead to a more modular and better testable code.

Thanks for the recommendation @kbr! However I am not quite sure how the implementation of such an adapter class would look like or work. Especially adapting from an event queue to executing callback functions. Could you give me a hint into the right direction?

kbr commented 3 years ago

Writing an adapter class is basically wrapping a class with another class, where the wrapped class is an attribute of the wrapping class. The wrapping class provides the new api and does the translation (a bit like mapping an US power plug to UK or German norms).

Here is a very basic (and untested) example, partly based on the fritzmonitor cli code, but including the threading:

import threading
from fritzconnection.core.fritzmonitor import FritzMonitor

class FritzMonitorAdapter:

    def __init__(self, address, event_callback):
        self.monitor = FritzMonitor(address=address)
        self.event_callback = event_callback
        self.event_thread = None
        self.stop_flag = threading.Event()

    def start(self):
        event_queue = self.monitor.start()
        self.event_thread = threading.Thread(
            target=self._handle_events, 
            kwargs = {"event_queue": event_queue}
        )
        self.stop_flag.clear()
        self.event_thread.start()

    def stop(self):
        self.monitor.stop()
        self.stop_flag.set()
        self.event_thread.join()

    def _handle_events(self, event_queue, timeout=2):
        while not self.stop_flag.is_set():
            try:
                event = event_queue.get(timeout=timeout)
            except queue.Empty:
                # check health:
                if not self.monitor.is_alive:
                    raise OSError("Error: fritzmonitor connection failed")
            else:
                # do event processing here:
                self.event_callback(event)

This way it is also possible to provide multiple callback-fuctions to call depending on the parsed event. Or to put this functionality into the event_callback callable itself. This adapter can be used like:

def the_event_handler(the_event):
    # this is the callback function
    ...

monitor = FritzMonitorAdapter(address, the_event_handler)
monitor.start()
springstan commented 3 years ago

Thanks a lot for helping me with this. This should work in principle to call specific call ack functions based on the occurring event. However, with this Adapter approach I am not sure if or how it could be possible to extend or change the reconnection logic. A checklist criteria for a Home Assistant integration is to log once when the connection is disrupted and once more when a new connection has been established successfully. Is that achievable with such an Adapter class?

kbr commented 3 years ago

The adapter using FritzMonitor will get an OSError on calling monitor.start() in case that a connection to the router fails. Also the adapter can check monitor.is_alive from time to time to get the information whether the monitor-thread is still alive. But this will not report about internal reconnections to keep the monitor running.

To report about internal reconnections one can inherit from FritzMonitor overriding the _monitor-method using the event_reporter to put corresponding events to the queue. That could also be an idea for an enhancement.

To change the way reconnections are done one can override _reconnect_socket or the _get_connected_socket as the latter is where i.e. @bufemc has implemented additional platform-dependent tcp socket options to keep the connection up. I'm considering to adapt this once the connection lost handling has been improved, because this can happen anyway.

Please keep in mind that all these methods to override are private methods.

bufemc commented 3 years ago

Just tell me if you finished it and need a tester for Windows and Linux again. Because I run into these timeouts that often in the beginning I tried to make it a fortress - okay, not the right wording here, but it runs 24h/7 now on a Raspi 4, "cooled by Lego" ;)

chemelli74 commented 2 years ago

Hi, what's the status of this PR ? Seems already merged.

Simone

kbr commented 2 years ago

Well, meanwhile a FritzMonitor class is implemented but it is unclear whether to be more platform specific about reconnections (resp. holding the line); something I would like to avoid. But as there are no complains so far I like to close this PR. Feel free to reopen in case of need.