wpietri / sucks

Simple command-line script for the Ecovacs series of robot vacuums
GNU General Public License v3.0
280 stars 104 forks source link

Feature idea: status command/monitoring #6

Closed Zenconomy closed 6 years ago

Zenconomy commented 6 years ago

Being able to request the current status of the bot, or being able to listen for status updates from the bot (not sure how these commands are implemented in the api)

For example, I have an issue with the robot getting stuck, giving up and going into suspended mode, when all it would need to is back up or rotate a little bit to get unstuck. If I could detect it going into suspended mode while not in the charger (or even better, be alerted when it gets stuck), I could tell issue a back/rotate command to get it going again. Or just send me an alarm to let me know that the little fella got stuck.

wpietri commented 6 years ago

Listening for updates is a fine idea. I believe the VacBot object, which is the XMPP client, receives a variety of notices from the vacuum itself, and may be able to probe for more. I think turning on the debug flag will dump all received messages, even the ones that the bot doesn't currently handle.

As for actually handling them, I recommend looking at the code that currently receives battery updates, including handle_battery_report and the related register_handler call.

Zenconomy commented 6 years ago

Progress! I did some testing with deliberately getting the bot stuck and looking at the log.

The bot sends out errors which get passed on to the XMPP client. I've gotten these two errors by trying to halt the robot:

102 HostHang 103 WheelAbnormal

I assume 102 means the bot is hung up on something, and 103 means it doesn't like what the wheels are doing.

When things are back to normal the robot sends

100 NoError

For the purposes of my testing I edited init.py with the following:

        self.register_handler(Callback('error',
                                       MatchXPath('{jabber:client}iq/{com:ctl}query/{com:ctl}ctl[@td="error"]'),
                                       self.handle_error))
    def handle_error(self, iq):
        self.error = iq.find('{com:ctl}query/{com:ctl}ctl').get('error')
        self.error_no = iq.find('{com:ctl}query/{com:ctl}ctl').get('errno')
        logging.debug("*** error =" + self.error)
        logging.debug("*** error_no =" + self.error_no)
wpietri commented 6 years ago

Excellent! That strikes me as a step forward, so feel free to submit a pull request with just that.

Zenconomy commented 6 years ago

Pull request added!

Through further testing i found that the XMPP server stops broadcasting status and error messages to the client if it doesn't receive any within a few minutes... this means that if the robot gets stuck we won't get notified.

I can see in the debug log that the XMPP client sends "Whitespace Keepalive" messages, but these don't help.

If I repeat the latest command (such as the clean command) every 60 seconds, the server keeps broadcasting messages. I verified this by running:

sucks clean 1 clean 1 clean 1 clean 1 clean 1 clean 1 clean 1

I'm assuming there is a command that the app sends just to keep the connection alive (or maybe even request bot status), but I don't know what it is. We could either try to identify the command by analyzing app traffic, or simply have sucks repeat the latest issued command every 60 seconds.

wpietri commented 6 years ago

Fab! I'll make a couple small changes to the pull request and merge it.

Repeating the command will make it difficult for humans to interact with the vacuum by other means. When mine is running, for example, I might pause it so I can pick up things off the floor or untangle a cord from its brushes. I don't want it starting up arbitrarily after that.

So I'd suggest MITMing the XMPP traffic so you can see what the app is actually doing during the keepalive period.

wpietri commented 6 years ago

And as an aside, we should start thinking about how to start adding tests for this stuff. I put in the handling for things like battery updates just to eliminate the error messages that the XMPP library generates when receiving unknown messages. But if checking the status and then doing things with it is going to be a real feature, we should have some real tests for it.

Zenconomy commented 6 years ago

Alright I got xmpppeek up and running. For documentation purposes, I can inform that the XMPP server is dependent on country. msg-na.ecouser.net didn't work for me so I picked apart the app and found the relevant config file.

Canada has no explicit config so it reverts ot the default which is msg-ww.ecouser.net

This seems to only be relevant for reverse engineering, Sucks works with the North America server.

I found a number of status commands, including:

GetCleanState GetOnOff GetError GetBatteryInfo GetChargeState GetSched GetLifeSpan

The app also sends a ping command to the server a few times every minute. Maybe this is enough to keep getting updates without having to explicitly ask for them. I will experiment.

wpietri commented 6 years ago

Glad to hear it. As you learn things, feel free to update the protocol document. And it sounds like it might not be a bad idea to create a developing.md file that explains things new developers will want to know. Good luck with your experiments!

Zenconomy commented 6 years ago

Okay I've sent a pull request with pings implemented. I think it makes sense to have the VacBot object send pings regularly just like the Android and iOS apps do.

wpietri commented 6 years ago

Agreed, and merged! I had wondered why it stopped sending battery updates. The ping seems weird, in that XMPP can provide presence information. But it clearly works. Thanks!

Zenconomy commented 6 years ago

Yeah they obviously didn't go by the XMPP standard keepalive functionality!

How do you suggest to implement listening for errors from the XMPP server on the VacBot object? I'm thinking something like this (pseudocode)

Vacbot.register_error_handler(errorName, errorCode):
  # handle the error
wpietri commented 6 years ago

Good question. Let's talk more about the use case. It seems like we're talking about the library case here, allowing people like yourself to easily implement custom behavior. An obvious option is encouraging people to subclass the robot. E.g.:

class MyBot(sucks.VacBot):
    def alert_my_owner(self, text):
        pushbullet.send_app_notification(text)

    def handle_error(self, code, name):
        if (code=='102'):
            alert_my_owner("Robot  stuck!")

Another is, as you say, to let people give callbacks. I might do this one slightly differently:

def alert_my_owner( text):
    pushbullet.send_app_notification(text)

vacbot = VacBot(...)
vacbot.on_error('102', lambda: alert_my_owner("Robot stuck!"))
vacbot.connect_and_wait_until_ready()
vacbot.run(Clean(900))

Another option is to go more with a publish/subscribe thing, which might look like this:

class BotWrangler():
    def __init__(self):
        super().__init__()
        vacbot = VacBot(...)
        vacbot.add_listener(self.bot_event)

    def bot_event(type, *args):
        if (type=='error' and args[0] == '102'):
            self.alert_my_owner("Robot stuck!")
        else:
            log.debug("got bot event {} with params {}".format(type,args))

The first is straightforward and lets users do a lot, but has high coupling and can be brittle. The second works best for people using it as we expect in relatively simple ways. The third has the lowest coupling, and might be useful in big, event-driven systems, so it might be best for integrating with smart-home systems.

I personally don't care a ton; it's mainly about longer-term use.

At this point we start to get into the issues discussed in #3, so I'd like to add @OverloadUT to the discussion. Or perhaps we should move this to the mailing list?

Zenconomy commented 6 years ago

Yes, talking about the library case. I would prefer an event based approach so the second or third option. The third one looks more flexible and future proof to me!

OverloadUT commented 6 years ago

Yep, third looks great to me!

wpietri commented 6 years ago

Ok. Does one of you want to take a swing at listing out the events we are likely to want soon? Perhaps @OverloadUT has notions already given the his plans for integrating it with other things?

Zenconomy commented 6 years ago

Here are events I can think of:

Robot error - sent over XMPP and has an error_no. Known errors:

100 NoError (robot is ok, usually sent shortly after another error)
102 HostHang
103 WheelAbnormal
104 DownSensorAbnormal

HTTP auth errors

XMPP connection errors

State was changed (battery status updated, clean mode update, charge mode update)

Command was acknowledged by the robot (for example, if I issue the clean command, event fires when we have response from the robot that the command was received)

OverloadUT commented 6 years ago

For my purposes, I probably wouldn't even set up a listener for HTTP auth errors or XMPP errors. I would expect the library to just handle those the best they can, and raise errors or call a logger to log the issue if it's something the user might need to correct themselves.

For events that I would subscribe to, state is pretty much it! The Robot errors would also be helpful of course, if they don't end up represented in the state anyway (when the robot gets stuck, does it stay in the clean state with a raised error, or is there an error state?)

The other bit of data I would definitely want is the accessory lifespan indicators. I assume you need to request these specifically, so any events for these would fire after a call is made to poll for an update.

OverloadUT commented 6 years ago

Another thought: rather than "events", I would also be perfectly happy with sucks handling all of the parameters I want to watch (accessory life, current status, etc) a lot like it currently does, and using an observer pattern so I can simply know when a property has changed.

I think it just depends on how much of the logic and vacuum state you want to life inside sucks. I'd be happy with either.

Zenconomy commented 6 years ago

Yes for HTTP / XMPP errors I don't mean to pass them through but rather have the library handle them and raise relevant errors such as authentication failed, connection failed, connection timed out and similar.

There is no error state, the bot broadcasts an error code (for example indicating that it's stuck with 102 HostHang), automatically stops and then broadcasts error 100 NoError.

The robot's state is a little bit confusing since it has two independent state variables, one for the current cleaning task (CleanState) and one for the current charging state (ChargeState). It seems that sometimes the bot can have a cleaning task active while it's charging... I think if that is the case the cleaning will resume when it's sufficiently charged. But I have to do some more testing to be sure.

I've implemented XMPP API commands for requesting all state variables (charge, clean, battery level, accessory lifespan) in my version of VacBot, I'll submit a pull request for that.

I'll also update the protocol doc with my findings.

torbjornaxelsson commented 6 years ago

@wpietri I think we can close this one. We have VacBot.subscribe_to_ctls for listening to events and VacBotCommands for requesting status.