jasonacox / tinytuya

Python API for Tuya WiFi smart devices using a direct local area network (LAN) connection or the cloud (TuyaCloud API).

Add fetching device logs to Cloud, and generic URLs #219

Closed uzlonewolf closed 1 year ago

uzlonewolf commented 1 year ago

In addition, I reworked the error handling a bit. I usually use tinytuya.json to pass the cloud login info, but I forgot to copy it in after creating a new branch, and Cloud.__init__() immediately blew up with TypeError: __init__() should return None, not 'dict', since a dict cannot be returned from __init__(). I wasn't sure if just setting the new .error variable would be sufficient, so I currently have it throwing a more descriptive error. This also mimics the existing behavior better.

._gettoken() now sets .error (and returns it) on error (such as bad auth or wrong region) instead of setting .token to an error_json object. I considered but ultimately decided not to set .error in the various get...() functions as a bad device_id or parameter isn't necessarily fatal, unlike a ._gettoken() error.
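
For illustration, a hedged sketch of how the reworked error handling might be exercised (this assumes the new .error attribute is left unset/None when the token fetch succeeds):

import tinytuya

# Sketch only: with missing or bad credentials, Cloud() now raises a more
# descriptive error instead of returning a dict from __init__().
try:
    c = tinytuya.Cloud()  # reads tinytuya.json when no arguments are given
except Exception as err:
    print("Cloud setup failed:", err)
else:
    # ._gettoken() records token problems (bad auth or wrong region) in .error
    if c.error:
        print("Token error:", c.error)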

Included in the PR are 2 new functions, getdevicelog() and cloudrequest(). cloudrequest() just offers an easy way of fetching one of the many URLs listed in https://developer.tuya.com/en/docs/cloud/device-connection-service?id=Kb0b8geg6o761

Assuming your credentials are in tinytuya.json, you can pull the last 24 hours of logs for a device by simply:

import tinytuya
import json

c = tinytuya.Cloud()
r = c.getdevicelog( '00112233445566778899' )
print( json.dumps(r, indent=2) )

Tuya's servers hold about one week's worth of logs, IIRC.
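
For the generic helper, a hedged sketch of what a cloudrequest() call could look like (the endpoint path below is just an illustrative example from Tuya's docs, not something this PR pins down; the device id is the same placeholder as above):

import tinytuya
import json

c = tinytuya.Cloud()

# Fetch an arbitrary Cloud endpoint by path (assumed to default to a GET
# request when no POST body is supplied)
r = c.cloudrequest('/v1.0/devices/00112233445566778899/functions')
print(json.dumps(r, indent=2))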

Inspired by #214

jasonacox commented 1 year ago

Brilliant! Thanks @uzlonewolf. The improved error handling and cleanup is also much appreciated. 🙏

I'll run a few tests and release this as a minor update. I think this Cloud API enhancement is worth getting out quickly. It may be good to include your code example in the Cloud section of the main README.

uzlonewolf commented 1 year ago

README updated in #220.

I'm also working on making sub-devices (such as Zigbee devices connected to a hub, as in #31) work; hopefully it won't be too long.

jasonacox commented 1 year ago

PR changes now live in v1.8.0.

bernd-wechner commented 1 year ago

Fantastic, thanks for this. Works a charm. Now just have to work out how to interpret those logs.

bernd-wechner commented 1 year ago

Best I've found is this:

https://developer.tuya.com/en/docs/cloud/cbea13f274?id=Kalmcohrembze

Alas, it's a bit spartan. For example, I get a lot of event ID 7, yet the doc only says:

event_id Integer Type of event: 1 online, 2 offline, 3 device activation, 4 device reset, 6 firmware upgrade, 8 device semaphore, 9 device restart, 10 timing information

The event time is clearly encoded, but how is not documented:

event_time Long Event Time

And I'm seeing events like:

{'event_from': '1', 'event_id': 1, 'event_time': 1669846757908, 'status': '1'}

and:

{'code': 'doorcontact_state', 'event_from': '1', 'event_id': 7, 'event_time': 1669846737501, 'status': '1', 'value': 'true'}

and trying to correlate them with what I see in their web interface. I'll plod along, but just thought to drop a line in case any gems of wisdom fall from the cloud to help me ;-)

uzlonewolf commented 1 year ago

Yeah, the documentation leaves a lot to be desired. Higher up on the page it says 7 = data report. The times are UNIX timestamps * 1000, so 1669846737501 = 1669846737.501 = 2022-11-30 14:18:57
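
For reference, a quick way to decode those millisecond timestamps (this prints UTC; the local-time rendering above depends on your timezone):

from datetime import datetime, timezone

event_time = 1669846737501  # milliseconds since the UNIX epoch
print(datetime.fromtimestamp(event_time / 1000, tz=timezone.utc))
# 2022-11-30 22:18:57.501000+00:00  (14:18:57 in UTC-8)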

bernd-wechner commented 1 year ago

This is great and I am loving it! I'm making huge headway toward getting all the data into a local database ;-). But as I retire for the night, I'm stuck on one problem the docs don't seem to say much about. Suddenly, for the first time, I see the result contains:

'has_next': True,

and

'next_row_key': '<long key>'

What to do with this?

Could/should getdevicelog handle this, simply fetching the next row until has_next is False?

uzlonewolf commented 1 year ago

Could/should getdevicelog handle this, simply fetching the next row until has_next is False?

That is a very good question. I think it depends on whether or not getdevicelog is returning all size= records; if it is, then it's the caller's problem, otherwise getdevicelog should probably do it.

I should probably add an argument to getdevicelog for that row key, but in the meantime getdevicelog( <same args as original call>, params={"last_row_key": "<long key from next_row_key>"}) should work.

How often are you pulling logs? An issue I didn't even think about when writing this is #230.

bernd-wechner commented 1 year ago

Ouch. It's not clear to me how that limit works. It's described in #230 as both a flat limit and a rate limit?

The latter is viable. The former kills my plans and pushes me back to considering reflash options.

I want a lasting solution.

I've done a few fetches to test, of course (tens of them), and long term I ideally need only one fetch per day (as long as it returns 24 hours of data).

uzlonewolf commented 1 year ago

The documentation is as clear as mud. One page says you get "about 26,000 API calls" but another says Cloud API calls are "Free of charge within monthly usage limits of $ 0.20". I'm leaning towards the limit being per month.

Even if it is not, how many devices do you have? 26k calls at 1/day means you have 71 years with 1 device or 7.1 years with 10 devices.

bernd-wechner commented 1 year ago

Only 4 door sensors. So it sounds like the limits will be fine either way. Will check the multi-row thing when next able (at the workstation).

bernd-wechner commented 1 year ago

I think it depends on whether or not getdevicelog is returning all size= records; if it is, then it's the caller's problem, otherwise getdevicelog should probably do it.

I looked into this quickly and it doesn't look like TinyTuya has the logic anywhere to collect all the rows of a log. As evidence:

bernd@bigfoot:~/workspace/IoT$ git clone https://github.com/jasonacox/tinytuya.git
Cloning into 'tinytuya'...
remote: Enumerating objects: 2181, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 2181 (delta 4), reused 10 (delta 4), pack-reused 2170
Receiving objects: 100% (2181/2181), 1.04 MiB | 2.82 MiB/s, done.
Resolving deltas: 100% (1350/1350), done.
bernd@bigfoot:~/workspace/IoT$ grep -R has_next tinytuya
bernd@bigfoot:~/workspace/IoT$ grep -R next_row_key tinytuya

I shall try adding params={"last_row_key": "<long key from next_row_key>"} to the args and fetching them one by one, and maybe write myself a wrapper that does that (which just gets me the whole log). There have to be enough records for has_next to be true, which I've only seen once so far. But if I catch it, I'll spring into action and test this idea out.

uzlonewolf commented 1 year ago

There have to be enough records for has_next to be true, which I've only seen once so far. But if I catch it, I'll spring into action and test this idea out.

Yes, it should only happen if there are more than size= records for the given time range (default is size=100). You can force it by specifying a small number such as size=5.
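
For example, something like this should surface the pagination fields (hedged: it assumes the response wraps the log fields in a result object, as seen earlier in the thread):

import tinytuya

c = tinytuya.Cloud()
# Request an artificially small page so has_next / next_row_key appear
r = c.getdevicelog('00112233445566778899', size=5)
print(r['result'].get('has_next'), r['result'].get('next_row_key'))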

uzlonewolf commented 1 year ago

Ok, I worked on this a bit this morning and discovered a few things.

It turns out https://developer.tuya.com/en/docs/cloud/0a30fc557f?id=Ka7kjybdo0jse is the documentation for the URL we're using; the one linked earlier in this thread is for a slightly different URL that we do not have access to (/v1.0/devices/ vs /v1.0/iot-03/devices/; the latter returns "msg": "No permissions. This API is not subscribed." when called). They're similar but the query parameters and returned values are slightly different.

As for the returned logs, there is a slight issue I forgot about: if the "official" DPS map is wrong, the DP's not in the official map are counted but filtered out by the server and not returned. This means it can set 'has_next': True while returning fewer results than requested.

Finally, the param to get the next set is start_row_key and should be set to what was returned in next_row_key.

I'll put together a PR that handles pulling the rest when 'has_next' is True and fewer results than requested are returned.

bernd-wechner commented 1 year ago

Thanks heaps. It's not a big code change and I'd happily have done it too, but it's hard to do unless you can see has_next as true to test against (an untested code snippet, even a small one, is pretty risky). And I've only seen it once to date.

uzlonewolf commented 1 year ago

Finally finished writing the loop. With it retrying to get everything, my biggest concern is it somehow getting stuck in a loop and chewing through all your monthly requests, so I have limited it to at most 50 tries. Combined with the server's hard limit of 100 results per try, this means you can pull at most 50 * 100 = 5,000 log messages per function call. For some extremely chatty devices, such as the thermostats I have, even that may not be enough to pull them all, so you might need to either also loop yourself or set the new `max_fetches=` parameter to something greater than 50.

The `size=` parameter is now an approximate target and the actual number of logs returned will be somewhere between 0 and `size * 2 - 1`; set `size=0` to get up to `max_fetches * 100` entries. I also added a `start_row_key=` parameter while I was at it. The new function definition is `getdevicelog(deviceid=None, start=None, end=None, evtype=None, size=0, start_row_key=None, max_fetches=50, params={})`
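
A hedged usage sketch of the updated call, using the same placeholder device id as earlier (the result/logs field names follow the earlier discussion, not formal documentation):

import tinytuya
import json

c = tinytuya.Cloud()

# size=0 lets the library keep paging (up to max_fetches pages of ~100 each);
# bump max_fetches for extremely chatty devices such as thermostats.
r = c.getdevicelog('00112233445566778899', size=0, max_fetches=100)
logs = r['result']['logs']
print(len(logs), 'log entries')
print(json.dumps(logs[:3], indent=2))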

bernd-wechner commented 1 year ago

Interesting. Some thoughts:

  1. An endless loop would only be a risk surely if the next_row_key pointed back to an already fetched log (i.e. a bug at their server end). But that could be guarded against by keeping a set of already fetched row_keys and detecting such a loop.

  2. max_fetches is a sensible option, yes, though I'd differentiate between fetching next rows and retrying a fetch that failed, no? I typically code up a max_retries parameter for retries (and 50 is generous for that; I usually settle on 5 to 10, but it depends on the reliability of the API I'm using).

  3. The date range should of course be honoured server side, and limit the number of rows, but it can also be checked client side if keen. I observed, for example, that the logs seem to be bundled in reverse chronological order (most recent event first, oldest event last), so client side it is easy to check that the first event is at or before the requested end time, and to stop fetching if a log ever contains a last event at or prior to the start time.

  4. For now I am fetching the whole log, which I do by setting start=1, end=sys.maxsize while testing manually, but in time I'll set up a daily fetch covering just the last day each time. I might even sensibly set the window to run from the time of the last fetched event to now on each run, and assert that the last event in the returned log is one I've already seen, as I intend to record all these events in a local database (see the sketch after this list).
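
A minimal sketch of the incremental daily fetch described in point 4 (hedged: the millisecond start=/end= values and the result/logs/event_time field names are taken from this thread, not from formal documentation):

import time
import tinytuya

def fetch_new_events(cloud, device_id, last_seen_ms):
    """Return log entries newer than last_seen_ms plus the updated high-water mark."""
    now_ms = int(time.time() * 1000)
    r = cloud.getdevicelog(device_id, start=last_seen_ms + 1, end=now_ms, size=0)
    logs = r.get('result', {}).get('logs', [])
    if logs:
        last_seen_ms = max(e['event_time'] for e in logs)
    return logs, last_seen_ms

# Example daily run: store last_seen_ms in the local database between runs
c = tinytuya.Cloud()
events, high_water = fetch_new_events(c, '00112233445566778899', 1669846757908)
print(len(events), 'new events; newest at', high_water)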

uzlonewolf commented 1 year ago

For 1., I agree it's unlikely, however with only a few thousand calls per month I didn't want to risk it. In addition to the limit of 50, it does actually also keep track of the last next_row_key and aborts immediately if it does not change.

With 2., it does not retry failures, it aborts immediately, so there's nothing to differentiate between. I figured any error is most likely fatal so I didn't see a point in retrying failures.

I originally had max_fetches as 101 (1 + 100 retries), however my thermostats chewed up all 10,100 log entries without reaching the end and I ultimately decided 50 was a more sensible limit to prevent unintended mass queries.

uzlonewolf commented 1 year ago

Oh yeah, as for 3. I'm again going to point to my thermostats. Unfortunately the official DPS map for them isn't even close and 70% of the DP's they use are not listed (and are thus filtered out from the server response). In my testing I've seen gaps of 300+ log entries (3+ fetches) where the returned logs array is completely empty and the only clue it's not stuck is the next_row_key changing. As such, timestamp checking on the client side won't be reliable for devices like these.

bernd-wechner commented 1 year ago

Remind me what's a DP?

uzlonewolf commented 1 year ago

Data Points (DP/DPS). It's the numeric ID # Tuya devices use to identify each input or output. You need to know the ID # when talking to devices locally; the cloud, however, translates them into names. E.g., for my thermostats the current temperature in degrees F is DPS 29, but the cloud reports it as temp_current_f.

bernd-wechner commented 1 year ago

Is that the same as the deviceid argument to getdevicelog()? And the device_id element of the returned result structure?

uzlonewolf commented 1 year ago

No, the device id identifies the device, while the DPS identifies the individual inputs/outputs on the device. E.g. a 4-channel relay controller will have a device_id for the device itself, plus 4 DPs for the 4 different outputs (usually DPS 1-4). Most devices will have many more DPs than that for things like countdown timers, temperature sensors, current sensors, etc.

bernd-wechner commented 1 year ago

Hmmm, but I'm confused. Is the DP in the returned response a property of each event? If so, where is it here:

https://github.com/jasonacox/tinytuya/issues/214#issuecomment-1321043692

Forgive me, just trying to get my head around all this.

uzlonewolf commented 1 year ago

Yes and no. The cloud maps the DPS IDs to names they call "code."

Using that linked comment as an example, if I wanted to change the "muffling" value using a local connection I would need to use the numeric DPS ID:

import tinytuya

d = tinytuya.Device( '...', '...' )
d.set_value(16, False)  # Turn off DPS 16 ("muffling")

To get the DPS<->code mapping you can use c.getdps( device_id ), which returns something like:

{
  "result": {
    "category": "ywbj",
    "functions": [
      {
        "code": "muffling",
        "dp_id": 16,
        "type": "Boolean",
        "values": "{}"
      }
    ],
    "status": [
      {
        "code": "smoke_sensor_status",
        "dp_id": 1,
        "type": "Enum",
        "values": "{\"range\":[\"alarm\",\"normal\"]}"
      },
      {
        "code": "smoke_sensor_value",
        "dp_id": 2,
        "type": "Integer",
        "values": "{\"unit\":\"\",\"min\":0,\"max\":100,\"scale\":1,\"step\":1}"
      },
      {
        "code": "battery_state",
        "dp_id": 14,
        "type": "Enum",
        "values": "{\"range\":[\"low\",\"middle\",\"high\"]}"
      },
      {
        "code": "battery_percentage",
        "dp_id": 15,
        "type": "Integer",
        "values": "{\"unit\":\"%\",\"min\":0,\"max\":100,\"scale\":0,\"step\":1}"
      },
      {
        "code": "muffling",
        "dp_id": 16,
        "type": "Boolean",
        "values": "{}"
      }
    ]
  },
...
}

But this is the gotcha with pulling cloud logs: half of the devices I have do not have a proper DPS<->code mapping like that. Those DP's will not show up in the event logs because the server cannot map the DPS ID to a code.
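
A small hedged sketch of turning that getdps() response into lookup tables for decoding (field names taken from the JSON above; the device id is the earlier placeholder):

import tinytuya

c = tinytuya.Cloud()
dps = c.getdps('00112233445566778899')

# Build dp_id <-> code maps from the "status" list in the response
dp_to_code = {s['dp_id']: s['code'] for s in dps['result']['status']}
code_to_dp = {v: k for k, v in dp_to_code.items()}

print(dp_to_code)                  # e.g. {1: 'smoke_sensor_status', 2: 'smoke_sensor_value', ...}
print(code_to_dp.get('muffling'))  # e.g. 16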

bernd-wechner commented 1 year ago

Ah, thanks for clarifying. "code" is such a weird element IMHO. Sometimes I see it as empty, too. So what I'm hearing is that they have a number-to-string mapping for code, as they do for event IDs and sources, which I have mapped as:

EVENT_IDS = {1: "online",
             2: "offline",
             3: "device activation",
             4: "device reset",
             5: "command issuance",
             6: "firmware upgrade",
             7: "data report",
             8: "device semaphore",
             9: "device restart",
             10: "timing information"}

EVENT_SOURCES = {-1: "unknown",
                  1: "device itself",
                  2: "client instructions",
                  3: "third-party platforms",
                  4: "cloud instructions"}
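
Those mappings can then decode the raw entries shown earlier; a quick sketch using one of the example events from above:

event = {'code': 'doorcontact_state', 'event_from': '1', 'event_id': 7,
         'event_time': 1669846737501, 'status': '1', 'value': 'true'}

print(EVENT_IDS[event['event_id']],             # 'data report'
      EVENT_SOURCES[int(event['event_from'])],  # 'device itself'
      event.get('code'), event.get('value'))    # 'doorcontact_state' 'true'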

But unlike event IDs and sources, the logs don't return the numeric value for code; they return the mapped string, and sometimes they have no mapping implemented?

uzlonewolf commented 1 year ago

Yes, exactly.

bernd-wechner commented 1 year ago

One page says you get "about 26,000 API calls" but another says Cloud API calls are "Free of charge within monthly usage limits of $ 0.20". I'm leaning towards the limit being per month.

Trying to find the docs on this. Did you keep links? I'm curious to see where they state this and what one fetch costs (toward a possible $0.20 monthly gratis limit).

uzlonewolf commented 1 year ago

Yeah, they were posted in #230. https://developer.tuya.com/en/docs/iot/membership-service?id=K9m8k45jwvg9j gives the 26k figure and https://www.tuya.com/vas/commodity/IOT_CORE_V2 -> "Trial Addition" -> "View More" gives $0.20.

bernd-wechner commented 1 year ago

Thanks! The first page is mystical indeed, and on the second I see:

[screenshot]

There's no "Trial Addition", but I see "Trial Edition (one month)", which is selected; I don't see "View More" anywhere, though.

You're right though, Tuya are as clear as mud.

uzlonewolf commented 1 year ago

If you mouse-hover over "Flagship" or "Corporate" you get a pop-up with a "View More" link. I could have sworn that same pop-up existed for the "Trial" option as well. I wonder if they removed it for some reason.

uzlonewolf commented 1 year ago

I was able to find a reference to $0.20 by going to iot.tuya.com -> Cloud -> Cloud Services and clicking "View Details" for "IoT Core." Under "Usage/Resource Pack Quota" it says "0 / 0.2 USD" while "Quota Refresh" says "Monthly."

bernd-wechner commented 1 year ago

Thanks enormously, that is so cool. Have boiled it down to:

https://developer.tuya.com/en/docs/iot/membership-service?id=K9m8k45jwvg9j

where it is written:

Trial Edition (1 month)
Note: Only for individual developers or use in the debugging phase, and commercial use is prohibited
    50 devices
    Trial resource pack: about 26 thousand API calls
    When developers exceed the limit, Tuya will stop the service.

and later on the same page:

Trial
3.71 USD/million API calls (for Plan 1, which applies to us in Australia)
About 26 thousand API calls

And then on

https://eu.iot.tuya.com/cloud/products/detail?abilityId=1442730014117204014&id=p1668767995023hmaagk&abilityAuth=0&tab=1

it is written:

Resource Pack Name                   Usage/Resource Pack Quota    Quota Refresh    Effective Date         Expiration Date        Status
Cloud Develop Base Resource Trial    0.002176 / 0.2 USD           Monthly          2022-11-18 21:42:25    2022-12-18 21:42:25    In service

And 0.2 USD at 3.71 USD/million API calls translates, with a little math, to 53,908 API calls per month, or about 1,739 API calls/day.

Which is, I agree, indeed the most reliable snippet, as it is:

  1. Only visible when logged in, so specific to my (your) account
  2. Labelled "My subscriptions"
  3. Clearly states the quota is refreshed Monthly

Very encouraging and all noted in the internal doc of the app I'm putting together now.

bernd-wechner commented 1 year ago

Ouch, I just got: "No permissions. Your subscription to cloud development plan has expired." Now that was a surprise.

bernd-wechner commented 1 year ago

And I can't see any reason on their website. None at all. I can surf to this:

[screenshot]

and if I try to buy the free 1-month trial I can't; this pops up:

[screenshot]

And of course the Flagship and Corporate editions are ridiculously expensive (like the price of a car). Am I now without service, after all the research we put into the rate limits and into the Base Resource Trial above having a monthly quota refresh?

uzlonewolf commented 1 year ago

Ooof, hopefully it's temporary and resets at the end of the month. I'm many months past the "one month" trial and they haven't shut me off yet.

bernd-wechner commented 1 year ago

They granted me another 6 months. And I've sent them a petition that, given they have a class of user called "Individual Developer", they offer, on top of the Starter/Trial, Flagship and Corporate deals, a Hobbyist or Individual Developer deal with a 0.2 USD monthly budget in perpetuity (or at least annually, like Flagship and Corporate). No response there yet. At some level I appealed to them that it is no skin off their nose, and there is zero chance that folk like us will stump up hundreds, let alone tens of thousands, of dollars per annum to get the data from sensors we already bought and paid good cash for!