codetheweb / tuyapi

🌧 An easy-to-use API for devices that use Tuya's cloud services. Documentation: https://codetheweb.github.io/tuyapi.
MIT License

status causes socket to restart regularly #84

Closed glentakahashi closed 5 years ago

glentakahashi commented 5 years ago

I'm seeing some very strange behavior where calling .status() on a Wi-Fi socket causes the socket to reset itself (turn off and then on immediately). It doesn't happen every time .status() is called, but I see it very regularly. I've done some digging but haven't been able to diagnose why it's happening yet.

Any ideas as to what's causing this and how to fix it?

clach04 commented 5 years ago

I've not seen this with any of my devices, but this is the second report I've seen of this behavior. I do have devices that I have to power cycle occasionally (something like once every 2 months) because they lock up and freeze. Those devices have not been controlled by this library, nor by https://github.com/clach04/python-tuya/, so whilst it's possible we are sending the wrong payload, a bug in the device also seems possible.

@glentakahashi what device(s) is this happening on? Is it all device types or only a single device/type?

glentakahashi commented 5 years ago

@clach04 I'm seeing this happen on all my devices, which are a combination of https://www.amazon.com/gp/product/B079Q5W22B/ and https://www.amazon.com/gp/product/B07CVH7WPD/

Do I have to unpair devices from the Tuya iOS app before pairing them with pytuya? I didn't do that before I set them up, and that was a random theory I had. I can't unpair them currently because they disappeared from my Tuya app, but do you think it's worth trying to re-pair them with the Tuya app, unpairing them, and then re-pairing with tuya-cli?

clach04 commented 5 years ago

I don't have any Teckin so I can't compare :(

RE pairing. I've not used the new setup but I got the impression it was an either or situation. I'm still using the old android app (and Alexa skill) as I was able to extract the key(s) for use with this and https://github.com/clach04/python-tuya

glentakahashi commented 5 years ago

So I did some digging in wireshark and other things, and it appears the device does a full restart for some reason, where it loses connection to the wifi, reconnects, and then queries the tuya apis at mq.gw.tuyaus.com and a.tuyaus.com to check for updates and report state.

One thing I noticed is that one of my newer Teckin devices, which I've never connected to the cloud, doesn't have this problem, but every other one does. There must be something strange going on with remembering some previous connection. My current theory is that the device restarts itself when it believes it has gotten into a bad state, and that triggering the status endpoint when it can't phone home causes this.

Given the inconsistency of the restarts (it doesn't happen every 20 second update interval), and the inconsistency of the tuya api performance, I'm thinking these are related

glentakahashi commented 5 years ago

Also, I'm noticing my symptoms are different from https://github.com/clach04/python-tuya/issues/29: if I run that polling script I don't see the actual state change, I only see connection timeouts. So it's definitely the socket deciding to restart itself, but I'm not sure why I would only see this with python-tuya and not when I use the native app itself.

EDIT: Doing more Wireshark captures (now properly intercepting all traffic), I honestly can't see anything out of the ordinary indicating why it would reset. It isn't pinging home to a Tuya server to check state or anything; it literally gets the status request and then instantly shuts down.

EDIT2: Honestly out of ideas at this point, might just have to revert to using the cloud setup :/

glentakahashi commented 5 years ago

Okay, so I spent a lot more time debugging, and at this point I believe it to just be low-quality devices. Repeating the exact same TCP getStatus transmissions that the Tuya app itself does also causes the crashes that I was randomly seeing before. When I script it, sometimes the crash happens after 5 calls to getStatus, and other times it can get through 100 at a time without crashing. I assume it's likely some stack overflow or out-of-memory issue in the Teckin smart plugs themselves.

I believe I actually could see this in normal usage of the Tuya app, but it would be much, much rarer because the Tuya app never polls for status updates. It only requests status A) on a brand new app open and B) when conducting scenes, etc., and otherwise relies on the Wi-Fi switch itself to report its status changes directly to the cloud using MQTT.

So, for now this is just due to buggy hardware, but longer term a way to alleviate this is to 1) not poll for updates, or reduce the polling frequency, and 2) implement an MQTT listener in the Home Assistant plugin, similar to https://github.com/codetheweb/tuyapi/issues/74
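The scripted test described above (repeat the status request until the plug stops answering) can be sketched roughly as follows. This is a minimal illustration, not the script that was actually run: `count_calls_until_failure` and its parameters are hypothetical names, and `get_status` stands in for whatever performs one status request against the device.

```python
import time
from typing import Callable


def count_calls_until_failure(get_status: Callable[[], dict],
                              max_calls: int = 100,
                              delay_s: float = 0.0) -> int:
    """Call get_status repeatedly and return how many calls succeeded
    before the first network failure (or max_calls if none failed)."""
    for completed in range(max_calls):
        try:
            get_status()
        except OSError:
            # socket.timeout and ConnectionError are subclasses of OSError,
            # so this covers both timeouts and resets by the device.
            return completed
        time.sleep(delay_s)
    return max_calls
```

With pytuya, `get_status` could plausibly be the bound `status` method of an `OutletDevice`. Running the loop several times and comparing the counts (5 on one run, 100 on another) is what suggests the crash is not tied to a fixed number of requests.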

tixi commented 5 years ago

Hi. I got exactly the same strange behavior with this model (firmware 1.0.3): https://www.amazon.fr/gp/product/B07CMKLM59/ref=oh_aui_detailpage_o05_s00?ie=UTF8&psc=1

Few observations:

glentakahashi commented 5 years ago

Good to know I'm not just crazy! One thing I did to alleviate the problem was to change this line: https://github.com/sean6541/tuya-homeassistant/blob/master/tuya.py#L103 to > 300 so it caches the status for longer. I have seen far fewer random reboots now, although they still happen, and longer term I'm thinking about implementing the MQTT status updates.
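The idea of caching the status for longer can be illustrated with a small time-based cache. This is only a sketch under assumptions: it is not the tuya-homeassistant code, `CachedStatus` and its parameters are hypothetical, and the 300-second default simply mirrors the threshold mentioned above.

```python
import time
from typing import Callable, Optional


class CachedStatus:
    """Return a cached status and only query the device again
    once the cached value is older than ttl_s seconds."""

    def __init__(self, fetch: Callable[[], dict], ttl_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # performs one real status request
        self._ttl_s = ttl_s
        self._clock = clock          # injectable so the cache can be tested
        self._value: Optional[dict] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> dict:
        now = self._clock()
        expired = (self._fetched_at is None
                   or now - self._fetched_at > self._ttl_s)
        if expired:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

The trade-off is staleness: a plug switched by its physical button may be reported in its old state for up to `ttl_s` seconds.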

tixi commented 5 years ago

For testing, I changed the code a bit so that it never closes the socket. I only open it again when the socket has been closed by the plug. With this change, I no longer observe the behavior with the on, off and status commands. :-)

Since the Tuya app keeps the connection open, I suspect that opening and closing a socket for each request is not normal usage, and that this is what leads to the strange behavior.

Not directly related, but maybe useful: I observe that most of the time the status is partially present in the encoded result of a set command.
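The persistent-socket approach described above (keep one TCP connection open and reconnect only after the plug closes it) might look roughly like this. It is a generic sketch using only Python's standard `socket` module; it is not the code from the experimental fork, `PersistentConnection` is a hypothetical name, and building or encrypting the Tuya payload is out of scope.

```python
import socket
from typing import Optional


class PersistentConnection:
    """Reuse one TCP connection across requests; reconnect only
    when the peer has closed it or a network error occurred."""

    def __init__(self, host: str, port: int, timeout_s: float = 5.0) -> None:
        self._host = host
        self._port = port
        self._timeout_s = timeout_s
        self._sock: Optional[socket.socket] = None
        self.connect_count = 0  # how many times a connection was opened

    def _ensure_connected(self) -> socket.socket:
        if self._sock is None:
            self._sock = socket.create_connection(
                (self._host, self._port), timeout=self._timeout_s)
            self.connect_count += 1
        return self._sock

    def close(self) -> None:
        if self._sock is not None:
            self._sock.close()
            self._sock = None

    def request(self, payload: bytes, bufsize: int = 1024) -> bytes:
        """Send payload and return one reply, retrying once on a fresh
        connection if the existing one turned out to be dead."""
        for _attempt in range(2):
            sock = self._ensure_connected()
            try:
                sock.sendall(payload)
                reply = sock.recv(bufsize)
                if reply:
                    return reply
                # An empty read means the peer closed the connection.
            except OSError:
                pass  # reset or timeout: fall through and reconnect
            self.close()
        raise ConnectionError("connection failed twice in a row")
```

Note that the retry resends the payload, which is harmless for a status query but should be considered carefully for commands that change state.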

glentakahashi commented 5 years ago

Ooh, that's a good find! Do you have that code hosted somewhere? I'd like to give it a try as well.

tixi commented 5 years ago

I put the code here: https://github.com/tixi/python-tuya-experimental It is mostly for testing purposes. Let me know about any bugs you find.

glentakahashi commented 5 years ago

FWIW I've been running @tixi 's code for a bit now and am no longer seeing constant restarts of sockets even with caching removed (changed cached time to 0)

codetheweb commented 5 years ago

@glentakahashi @NorthernMan54 is currently working on a PR to enable keeping sockets open, which may help alleviate some of your issues.

codetheweb commented 5 years ago

Hey @glentakahashi I recently merged a PR that adds support for event-based code (made possible by the awesome work of @NorthernMan54 and @Apollon77), would appreciate it if you upgraded and tried it out.

glentakahashi commented 5 years ago

hey @codetheweb thanks for letting me know! I'm actually not using tuyapi directly, only using pytuya (actually @tixi 's fork of it).

tixi commented 5 years ago

> hey @codetheweb thanks for letting me know! I'm actually not using tuyapi directly, only using pytuya (actually @tixi 's fork of it).

Thank you for using my code, but keep in mind that I will not maintain this branch. It is experimental code, written just to test whether the strange behavior is due to the use of non-persistent sockets, and it seems that is the case. With this code, some security issues (some of them already known) will not be fixed.

I have developed plugins (plug and bulb) for Domoticz: https://github.com/tixi/Domoticz-Tuya-SmartPlug-Plugin https://github.com/tixi/Domoticz-Tuya-SmartBulb-Plugin The persistent socket is managed directly by the Domoticz framework.

glentakahashi commented 5 years ago

No worries, I know that won't be supported. I'm just using it for now until I can replace it with a more full-fledged solution.

dr1 commented 5 years ago

I'm having a similar issue with this model: https://www.amazon.com/gp/product/B07CPT91M9 , and I'm also seeing issue #103.

I tested the initial setup quite a few times and it seemed like a go. Then I got into it deeper: blocked the cloud servers on the router, set up rules in openHAB, etc. After that I noticed that after switching the device 2 or 3 times, it would crash and reboot. I can seemingly push the buttons on the unit all day and there's no issue. I have not yet removed the firewall rules and tried the standard app some more, but I will report back after I try that.

dr1 commented 5 years ago

So I opened the connection back up to the cloud on the router and went back to using the 'Smart Life' app, just for testing. It works fine with no crashes or resets, so I don't think the hardware is bad. I will have to see if I'm using a version that includes the PR mentioned above.

Apollon77 commented 5 years ago

But I could also imagine that the reason is that you block the cloud connection. The device always connects to the cloud to send data there too ... depending on how "good" (or bad?) the error handling in the device code is, such crashes could happen, or they could be intended reboots or such ...

dr1 commented 5 years ago

I was fairly sure I had tested that, since I intended to, but I was not confident enough, so I tried again. You may or may not be on to something with that. I tried using it while still blocked and it was not even letting me get 1 successful switch in without a reset. I had one outlet off and one on when first plugging it in, and 15-20 tries later that's how it stayed... After unblocking it and continuing to try, it was letting me get more like 8 or 10 successful switches. I still managed to get it to reset a couple of times during that.

codetheweb commented 5 years ago

Hey @dr1, mind emailing me a packet capture of the app activity along with your localKey? I'm trying to see if your issue is related to a seemingly more widespread one where the uid and the devId are different, thus causing issues.

Apollon77 commented 5 years ago

@codetheweb strange ... which one is then the correct one?

codetheweb commented 5 years ago

Apparently I think both are needed for some devices.

Apollon77 commented 5 years ago

Yyeaayyy ;-(

jchulce commented 5 years ago

I have two plugs impacted by this issue. They blink on and off by themselves several times an hour. I use pytuya via tuya-homeassistant. The impacted plugs are https://www.amazon.com/EPICKA-WiFi-Smart-Plug-2-Pack/dp/B076HKHSSX/ "EPICKA WiFi Smart Plug Mini (2-Pack) - Wireless Smart Plug Socket Outlet, Compatible with Amazon Alexa and Google Assistant, No Hub Required, Remote Control Your Devices from Anywhere". These are marked with model WP1000 model: SM-PW702 FCCID:2AJ5F. I have many other Tuya plugs from different brands that work just fine; it's just that one model that has the issue for me. I started using https://github.com/tixi/python-tuya-experimental earlier this week and have had no issues since. It seems that persistent connections make the issue go away.

codetheweb commented 5 years ago

As it sounds like persistent connections fix this issue (which would make sense), I'm closing this for now.

If anyone is still seeing issues while using persistent connections, feel free to reopen this.