jasonacox / tinytuya

Python API for Tuya WiFi smart devices using a direct local area network (LAN) connection or the cloud (TuyaCloud API).
MIT License

Complete rewrite of the scanner, now allows force-scanning of IP ranges #252

Closed uzlonewolf closed 1 year ago

uzlonewolf commented 1 year ago

Full discussion is in #177. Created a new PR to keep the changelog clean.

Closes #159 Closes #172 Closes #232

jasonacox commented 1 year ago

Thank you @uzlonewolf !! Merging this and versioning it v1.9.2 for now. Speed improvement is incredible, LW! Blown away.

So, I was able to produce a few error conditions we should treat.

Python3

This happened on first run - unable to reproduce.

$ python3 -m tinytuya scan                  master
/Users/jason/Code/tinytuya/tinytuya/scanner.py:123: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if ask is not 2:

TinyTuya (Tuya device scanner) [1.9.2]

[Loaded devices.json - 36 devices]

Python2

This happened on first run, ran without flaw next several times, then after several tries later, again.

$ python -m tinytuya scan                   master

TinyTuya (Tuya device scanner) [1.9.2]

[Loaded devices.json - 36 devices]

Scanning on UDP ports 6666 and 6667 and 7000 for devices for 18 seconds...

Chandelier   Product ID = qcgkaqmaivuwfwz4  [Valid Broadcast]:
    Address = 10.0.1.36   Device ID = xxxxxxxxxx (len:20)  Local Key = xxxxxxxxxx  Version = 3.3  Type = default, MAC = d8:f1:5b:c9:d5:5b
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/jason/Code/tinytuya/tinytuya/__main__.py", line 75, in <module>
    scanner.scan(color=color, forcescan=force, discover=broadcast_listen, assume_yes=assume_yes)
  File "tinytuya/scanner.py", line 929, in scan
    devices(verbose=True, scantime=scantime, color=color, poll=True, forcescan=forcescan, discover=discover, assume_yes=assume_yes)
  File "tinytuya/scanner.py", line 1310, in devices
    all_socks[sock].write_data()
  File "tinytuya/scanner.py", line 812, in write_data
    self.timeout()
  File "tinytuya/scanner.py", line 793, in timeout
    self.sock.close()
AttributeError: 'NoneType' object has no attribute 'close'
$ python -m tinytuya scan                   master

TinyTuya (Tuya device scanner) [1.9.2]

[Loaded devices.json - 36 devices]

Scanning on UDP ports 6666 and 6667 and 7000 for devices for 18 seconds...

Wasson Computer   Product ID = keym9qkuywghyrvs  [Valid Broadcast]:
    Address = 10.0.1.85   Device ID = xxxxxxxxxx (len:22)  Local Key = xxxxxxxxxx  Version = 3.3  Type = default, MAC = 18:69:d8:fd:fd:5f
    Status: {u'24': 7446, u'25': 4830, u'26': 0, u'20': 1213, u'21': 1, u'22': 1136, u'23': 7223, u'19': 0, u'18': 0, u'1': False, u'9': 0}
Master Bedroom   Product ID = MShdslm9Uw7Q59nN  [Valid Broadcast]:
    Address = 10.0.1.30   Device ID = xxxxxxxxxx (len:20)  Local Key = xxxxxxxxxx  Version = 3.3  Type = default, MAC = c4:4f:33:a9:f5:06
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/jason/Code/tinytuya/tinytuya/__main__.py", line 75, in <module>
    scanner.scan(color=color, forcescan=force, discover=broadcast_listen, assume_yes=assume_yes)
  File "tinytuya/scanner.py", line 929, in scan
    devices(verbose=True, scantime=scantime, color=color, poll=True, forcescan=forcescan, discover=discover, assume_yes=assume_yes)
  File "tinytuya/scanner.py", line 1310, in devices
    all_socks[sock].write_data()
  File "tinytuya/scanner.py", line 812, in write_data
    self.timeout()
  File "tinytuya/scanner.py", line 793, in timeout
    self.sock.close()
AttributeError: 'NoneType' object has no attribute 'close'

Also on Python2 I left netifaces uninstalled so I see this:

$ python -m tinytuya scan -force            master

TinyTuya (Tuya device scanner) [1.9.2]

[Loaded devices.json - 36 devices]

Scanning on UDP ports 6666 and 6667 and 7000 for devices for 18 seconds...

    Option: Network force scanning requested.

    NOTE: netifaces module not available, multi-interface machines will be limited.
           (Requires: pip install netifaces)

    Running Scan...
ERROR: Unable to get network for u'10.0.1.50/24', ignoring.
Traceback (most recent call last):
  File "tinytuya/scanner.py", line 937, in _generate_ip
    network = ipaddress.ip_network(netblock)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/ipaddress.py", line 188, in ip_network
    return IPv4Network(address, strict)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/ipaddress.py", line 1655, in __init__
    raise ValueError('%s has host bits set' % self)
ValueError: 10.0.1.50/24 has host bits set
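
The ValueError above comes from the standard-library ipaddress module, which by default rejects a network string whose host bits are set. A minimal sketch of one way to accept a host address with a prefix length, assuming strict=False is acceptable here (the helper name network_for is hypothetical, and this is not necessarily how the scanner ended up handling it):

```python
import ipaddress

def network_for(netblock):
    """Return the enclosing network for a string such as '10.0.1.50/24'."""
    # strict=False masks off the host bits instead of raising ValueError.
    return ipaddress.ip_network(netblock, strict=False)

net = network_for(u"10.0.1.50/24")
print(net)                # 10.0.1.0/24
print(net.num_addresses)  # 256
```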

I now want to add a -d flag to allow debug mode from command line. :-)

I'll keep running tests.

uzlonewolf commented 1 year ago

I now want to add a -d flag to allow debug mode from command line. :-)

I recently discovered a module that adds tab-completion to argparse, so I'll most likely rewrite __main__.py to use that soon :)

I'll see if I can fix those errors tomorrow.

jasonacox commented 1 year ago

tab-completion to argparse

Love that! I have some older 3.1 devices that stop broadcasting after a while but are otherwise working. They are discovered with -force so it was a fun test.

I'll see if I can fix those errors tomorrow.

Thanks LW. Some of the errors seem to coincide with MacOS prompting for permission for python to listen to the network. I'm guessing a timeout condition we could catch. I haven't dug in yet. No rush.

This is beautiful!

Scan completed in 37.4544 seconds = force scan found 26 devices
Scan completed in 18.0128 seconds but had discovered all 10s earlier (8s) = regular scan

uzlonewolf commented 1 year ago

All those errors are fixed in #254 .

At some point I'm going to rewrite the wizard to use the new scanner. If you pass a list of device IDs into devices( ..., wantids=[a,b,c,...] ) it'll end the scan as soon as all the listed devices are found without waiting the full scan time.
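
The early-exit idea behind wantids can be sketched without touching the network. The function below is purely illustrative: scan_until_found and its packets argument are hypothetical stand-ins, not the scanner's real internals, and the only thing taken from the comment above is that the scan ends once every listed ID has been seen.

```python
import time

def scan_until_found(packets, wantids, scantime=18.0):
    """Consume (device_id, ip) pairs until every wanted ID is seen or time runs out."""
    remaining = set(wantids)
    found = {}
    deadline = time.monotonic() + scantime
    for device_id, ip in packets:
        if time.monotonic() >= deadline:
            break
        found[device_id] = ip
        remaining.discard(device_id)
        # Stop early only when a wanted list was given and is now exhausted.
        if wantids and not remaining:
            break
    return found

# Simulated discoveries; a real scan would read these from sockets.
fake = [("aaa", "10.0.1.10"), ("bbb", "10.0.1.11"), ("ccc", "10.0.1.12")]
result = scan_until_found(iter(fake), wantids=["aaa", "bbb"])
print(sorted(result))  # ['aaa', 'bbb']
```

With an empty wantids list the sketch simply runs until the input or the scan time is exhausted.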

jasonacox commented 1 year ago

Thanks @uzlonewolf ! I can't reproduce those errors now so they do appear fixed. However, I ran into some others with the rest of the testing:

The snapshot.json seems to have duplicate records after the scan (but missing IP):

$ python3 -m tinytuya snapshot                                          master

TinyTuya (Tuya device scanner) [1.9.2]

Loaded snapshot.json - 59 devices:

Name                      ID                       IP                 Key               Version

                          AAAAAAAAAAAAAA7abf     10.0.1.96                            3.3  
                          BBBBBBBBBBBBBBd591     10.0.1.22                            3.3  
Air Purifier              XXXXXXXXXXXX40a19f     10.0.1.46          XXXXXXXXXXXXXX  3.3  
Air Purifier              XXXXXXXXXXXX40a19f     Error: No IP found XXXXXXXXXXXXXX  0    
Chandelier                YYYYYYYYYYYYY9d55b     10.0.1.36          XXXXXXXXXXXXXX  3.3  
Chandelier                YYYYYYYYYYYYY9d55b     Error: No IP found XXXXXXXXXXXXXX  0    

On python3, it actually throws some exceptions after rendering the table - this was repeated several times and likely related to the first two device entries w/o keys (these are new devices I have added and not pulled down keys with wizard yet just to provide a test). Oddly, python2 doesn't throw the errors but the table has the duplicate devices (with 2nd one missing IP).

Polling 60 local devices from last snapshot...
Traceback (most recent call last):
  File "/Users/jason/Code/tinytuya/tinytuya/scanner.py", line 842, in write_data
    self.sock.sendall( self.device._encode_message( self.device.generate_payload(tinytuya.DP_QUERY) ) )
  File "/Users/jason/Code/tinytuya/tinytuya/core.py", line 1158, in _encode_message
    payload = self.cipher.encrypt(payload, False)
  File "/Users/jason/Code/tinytuya/tinytuya/core.py", line 226, in encrypt
    cipher = AES.new(self.key, mode=AES.MODE_ECB)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/Crypto/Cipher/AES.py", line 232, in new
    return _create_cipher(sys.modules[__name__], key, mode, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/Crypto/Cipher/__init__.py", line 79, in _create_cipher
    return modes[mode](factory, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/Crypto/Cipher/_mode_ecb.py", line 216, in _create_ecb_cipher
    cipher_state = factory._create_base_cipher(kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/Crypto/Cipher/AES.py", line 93, in _create_base_cipher
    raise ValueError("Incorrect AES key length (%d bytes)" % len(key))
ValueError: Incorrect AES key length (0 bytes)

Since python3 -m tinytuya json is similar to snapshot it produces the same error.

Now, if you run python3 -m tinytuya devices and then run python3 -m tinytuya snapshot the duplicate devices (without IP) are removed (and no exceptions since it removes the devices w/o keys).

At some point I'm going to rewrite the wizard to use the new scanner

Yes!

jasonacox commented 1 year ago

I found the culprit for the duplicates and pushed a fix.

image

I'll work on the next one. I know you are working on the 3.5 devices. ;)

uzlonewolf commented 1 year ago

BTW, what I like doing when adding a new device is starting a really long scan, such as python3 -m tinytuya scan 999, and letting it settle out before adding the device. This helps identify the ID of the new device. Due to how it handles <ctrl>-c, you can then kill it whenever you want and it'll still display the statistics and write out the log.

1st <ctrl>-c - Timer is cancelled and force-scan is stopped. Any in-flight status requests are allowed to continue.
2nd <ctrl>-c - In-flight status requests are immediately aborted.
3rd <ctrl>-c - Immediately abort everything and exit. Due to the 2nd being rather quick you should never get to this point.
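
The three-stage interrupt behaviour described above can be modelled as a small counter. This is only a sketch of the idea: InterruptStages and its action names are hypothetical, and in a real loop on_interrupt() would be called from an except KeyboardInterrupt handler.

```python
class InterruptStages:
    """Map successive Ctrl-C presses to escalating shutdown actions."""

    ACTIONS = ("stop_scanning", "abort_polls", "exit_now")

    def __init__(self):
        self.count = 0

    def on_interrupt(self):
        # Clamp so a 4th or later press still means "exit now".
        action = self.ACTIONS[min(self.count, len(self.ACTIONS) - 1)]
        self.count += 1
        return action

stages = InterruptStages()
for _ in range(4):
    print(stages.on_interrupt())
```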

jasonacox commented 1 year ago

Nice!

Your latest PR: 3.1 device discovery is fixed!

The remaining bug can be reproduced by doing this:

This happens for each device:

Traceback (most recent call last):
  File "/Users/jason/Code/tinytuya/tinytuya/scanner.py", line 842, in write_data
    self.sock.sendall( self.device._encode_message( self.device.generate_payload(tinytuya.DP_QUERY) ) )
  File "/Users/jason/Code/tinytuya/tinytuya/core.py", line 1282, in _encode_message
    payload = self.cipher.encrypt(payload, False)
  File "/Users/jason/Code/tinytuya/tinytuya/core.py", line 242, in encrypt
    cipher = AES.new(self.key, mode=AES.MODE_ECB)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/Crypto/Cipher/AES.py", line 232, in new
    return _create_cipher(sys.modules[__name__], key, mode, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/Crypto/Cipher/__init__.py", line 79, in _create_cipher
    return modes[mode](factory, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/Crypto/Cipher/_mode_ecb.py", line 216, in _create_ecb_cipher
    cipher_state = factory._create_base_cipher(kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/Crypto/Cipher/AES.py", line 93, in _create_base_cipher
    raise ValueError("Incorrect AES key length (%d bytes)" % len(key))
ValueError: Incorrect AES key length (0 bytes)

The issue is that there is no local key. There are many places where we could check for a missing key and inject one or branch an error. I believe the fix is likely in devices() as called in scanner.py on line 1685 or the params sent (snapshot=by_ip). Also note, on the final output of snapshot, the device IDs are missing. I'm trying to get my head around all the flows to see where to handle the no-key case of a "no devices.json" scan. It's probably a simple fix.

jasonacox commented 1 year ago

Oops... we have a global name typo. I ran a test against a 3.1 device:

import tinytuya
d = tinytuya.OutletDevice(DEVICEID, DEVICEIP, DEVICEKEY)
print(d)
d.set_version(3.1)
print(d.status())

Traceback (most recent call last):
  File "/Users/jason/Code/tinytuya/sandbox/tinytuya/core.py", line 952, in _send_receive
    rmsg = self._receive()
  File "/Users/jason/Code/tinytuya/sandbox/tinytuya/core.py", line 840, in _receive
    prefix_len = len( PREFIX_BIN_55AA )
NameError: name 'PREFIX_BIN_55AA' is not defined

That PREFIX_BIN_55AA should be PREFIX_55AA_BIN based on:

PREFIX_BIN = PREFIX_55AA_BIN = b"\x00\x00U\xaa"

jasonacox commented 1 year ago

Ok, I'm not super happy with the fix (using a 'f'*16 key if missing in DeviceDetect.connect()) as it slows things down if the device is new/missing from devices.json, but it no longer throws any exceptions.
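
A rough sketch of the placeholder-key idea, assuming the goal is only to hand the cipher a key of valid length so that "Incorrect AES key length (0 bytes)" is never raised. The names effective_key and PLACEHOLDER_KEY are hypothetical and this is not the code in DeviceDetect.connect():

```python
PLACEHOLDER_KEY = b"f" * 16  # 16 bytes, a valid AES-128 key length

def effective_key(local_key):
    """Return the device key as bytes, or a placeholder when none is known."""
    if not local_key:
        # Unknown device: avoids the exception, but cannot decrypt real replies.
        return PLACEHOLDER_KEY
    if isinstance(local_key, bytes):
        return local_key
    return local_key.encode("latin1")

print(effective_key(""))                  # b'ffffffffffffffff'
print(effective_key("0123456789abcdef"))  # b'0123456789abcdef'
```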

I also set the display logic in _display_status() and snapshot() to show gwId as name if name is empty.

jasonacox commented 1 year ago

Last odd bit I'm tackling -

Force seems to zero out the gwId in certain cases. I don't know if this is a bug or the fact that the force scan found the device before the broadcast with the ID comes through.

Name                      ID                       IP                 Key               Version

0                         0                        10.0.1.13                            3.3  
0                         0                        10.0.1.14                            3.3  
0                         0                        10.0.1.18                            3.3  
0                         0                        10.0.1.22                            3.3  
0                         0                        10.0.1.31                            0    
0                         0                        10.0.1.32                            3.3  
0                         0                        10.0.1.36                            3.3  
0                         0                        10.0.1.37                            3.3  
0                         0                        10.0.1.44                            3.3  
0                         0                        10.0.1.45                            3.3  
0                         0                        10.0.1.47                            3.3  
0                         0                        10.0.1.48                            3.3  
0                         0                        10.0.1.54                            3.3  
0                         0                        10.0.1.57                            0    
0                         0                        10.0.1.62                            0    
0                         0                        10.0.1.83                            3.3  
uzlonewolf commented 1 year ago

Even if force-scan finds it first, if it also gets a broadcast then it uses said broadcast since those packets have more information in them.

In my testing, the null-id issue only happens when all 3 conditions are met:

I'm working on making the force-scan a bit more robust, but there's nothing that can be done if the device is unknown and does not have a known local key. As a kludge we can delay the force-scan by ~6 seconds or so to give it a chance to broadcast, or increase the scan time to force-scan time + max time until forced-stop + ~6.

jasonacox commented 1 year ago

That makes sense. I have about 27 Tuya devices in my test. I can get it to work if I force scantime to 45s, but it doesn't find them all if I lower it to 30s (misses 15). At 60s, all of the broadcasts come in except for 2 older 3.1 devices that have a tendency to go into a bad state and stop broadcasting anyway. That is getting very close to 100% blind discovery using -force.

Key point you made is that this edge case is only when we are missing vital devices.json data. My suggestion: I don't want to arbitrarily increase the scantime, but if the user is missing the devices.json file and doesn't specify a scantime, we could auto-increment it with a warning, "NOTICE: No devices.json file found (run wizard to generate), increasing scantime to 60s to increase discovery." It won't help with VLAN boundaries blocking broadcasts, but should capture the majority of devices for most users.

I'm probably way over thinking this but my hunch is that many users who install tinytuya fire up a scan even before running wizard.

uzlonewolf commented 1 year ago

IMO using force-scan without any devices.json at all is an illegal combination and as such we should throw an error. Due to devices not responding unless you know the key it is useless unless you just want a (very poor) port scanner. If not an error then we should at least print a warning telling the user that force-scan won't work without devices.json. I'm thinking something like:

if forcescan and not devices:
    if not discover:
        # No broadcast listening to fall back on, so fail loudly
        raise ValueError('Force-scan requires a devices.json')
    else:
        # Broadcast discovery is still available, so just drop the force-scan
        forcescan = False
        print('Warning: Force-scan requires devices in devices.json.  Force-scan disabled, only using passive broadcast discovery')

jasonacox commented 1 year ago

force-scan without any devices.json at all is an illegal combination

Well said. 😁 Excellent points. Sold!

Does this approach raise an exception for non-CLI initiated scans w/o devices.json but automatically disables force-scan for CLI initiated scans? If so, I love it. I would like to provide friendly (not exception trace output) for command line users where possible. But in any case, agree on approach to avoid the "illegal combination" (which translates to less support issues to answer 😂 ).

uzlonewolf commented 1 year ago

Not really. It doesn't care about CLI-vs-non-CLI. If it was called as the equivalent of python3 -m tinytuya scan -force .../24 -no-broadcasts (i.e. scanner.scan( ..., forcescan=..., discover=False)) then it raises an exception. Otherwise if it was python3 -m tinytuya scan -force .../24 then it just pretends -force was not provided.
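
The two cases described above can be written as one small function. This is a sketch only: resolve_forcescan is a hypothetical helper, and the choice of ValueError and warnings.warn is an assumption rather than what the scanner actually does.

```python
import warnings

def resolve_forcescan(devices, forcescan, discover):
    """Return the force-scan setting that should actually be used."""
    if forcescan and not devices:
        if not discover:
            # Nothing left to fall back on: no keys and no broadcast listening.
            raise ValueError("Force-scan requires devices in devices.json")
        warnings.warn("Force-scan requires devices in devices.json; "
                      "using passive broadcast discovery only")
        return False
    return forcescan

print(resolve_forcescan([{"id": "abc"}], True, False))  # True
print(resolve_forcescan([], True, True))                # False, with a warning
```

Calling resolve_forcescan([], True, False) raises, which matches the -force plus -no-broadcasts case.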

jasonacox commented 1 year ago

I get it. The point being that you can't downgrade to broadcast if "-no-broadcasts" requested.

uzlonewolf commented 1 year ago

So, how close do we need the snapshot file to be between force-scanned devices and broadcasted devices? Currently broadcasted devices have "version" set to "3.3" (string) and no "ver", while force-scanned devices have "version" set to 3.3 (float) with "ver" also set to 3.3.

jasonacox commented 1 year ago

Ugh, we should try to converge. There shouldn't be a need to have both. I suggest we:

The origin field would be one of this set: {"cloud", "broadcast", "forcescan"}

[
    {
          "ip": "10.2.3.44",
          "origin": "broadcast",
          "gwId": "0123456789abcdef0123",
          "active": 2,
          "ability": 0,
          "mode": 0,
          "encrypt": true,
          "productKey": "MShdslm9Uw7Q59nN",
          "version": "3.3",
          "name": "Light Switch",
          "key": "0123456789abcdef",
          "mac": "c1:d2:e3:a9:f5:06",
          "ablilty": "",
          "token": "",
          "wf_cfg": "",
          "dev_type": "default",
          "err": "",
          "type": "default",
          "dps": {
              "devId": "0123456789abcdef0123",
              "dps": {
                  "1": true,
                  "9": 0
              }
          }
    }
]

Thoughts?

uzlonewolf commented 1 year ago
  • Use "version" (string) for both (consistency with history in case someone is using snapshot.json for other things)

I do prefer "version", however the older snapshot.json used "ver" (and "id" instead of "gwId"). At least in some places. I remember it being a real mess, which is why I had it saving both. From a snapshot file created by 1.9.1:

{
    "timestamp": 1673911373.1963036,
    "devices": [
        {
            "name": "Wired Smart Gateway",
            "ip": 0,
            "ver": 0,
            "id": "ebf4...ka",
            "key": "a3..."
        },
        {
            "name": "Kitchen Light 6",
            "ip": 0,
            "ver": 0,
            "id": "eb97...pv",
            "key": "9a..."
        },
        ...

I already do some normalization when reading from it, it wouldn't be difficult to do the same thing when saving.
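
As an illustration of that kind of normalization, the sketch below fills in both the old ("id", "ver") and new ("gwId", "version") names on a device entry. normalize_entry is a hypothetical helper, not the scanner's own function, and treating a version of 0 as "unknown" (empty string) is an assumption.

```python
def normalize_entry(entry):
    """Return a copy of a snapshot device dict with both old and new key names."""
    out = dict(entry)
    dev_id = out.get("gwId") or out.get("id") or ""
    out["gwId"] = out["id"] = dev_id
    # Accept "3.3", 3.3 or 0 and store one consistent string form.
    raw = out.get("version") or out.get("ver") or ""
    out["version"] = out["ver"] = str(raw) if raw else ""
    return out

old_style = {"name": "Kitchen Light 6", "ip": 0, "ver": 3.3, "id": "abc"}
print(normalize_entry(old_style)["version"])  # 3.3
print(normalize_entry(old_style)["gwId"])     # abc
```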

And I really like that origin addition.

jasonacox commented 1 year ago

Oh yes, you are right... just checked the old snapshot example code I created a few years ago. Embarrassing. 😊

        item["name"],
        item["id"],
        item["ip"],
        item["key"],
        item["ver"]

I see your PR and agree. Let's stick with this for the snapshot and do the conversion with your functions to help normalize the data. It will keep from breaking any code that others have built based on the older format and example.

jasonacox commented 1 year ago

Hey @uzlonewolf, found a minor / odd edge case on a Windows 11 system with the scanner (snapshot). I'm using a devices.json file that is intentionally missing a few new devices and has the wrong key for at least one (my typical test pattern). I'm getting this error repeatedly for snapshot (no errors in scan), but it then displays the expected table including the test case, "No response" and "No IP found" devices.

Traceback (most recent call last):
  File "C:\Users\jason\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\tinytuya\scanner.py", line 800, in write_data
    self.sock.sendall( self.device._encode_message( self.device.generate_payload(tinytuya.DP_QUERY) ) )
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

I can't reproduce this on MacOS or Linux. We could add a try/except block to eliminate the noise, but it is odd that it shows up only on one platform.

uzlonewolf commented 1 year ago

It's in a try/except block, it's just that there is a print(traceback) in the except. Perhaps wrap that print() in an if verbose? It's mainly an informational notice, not a warning/error. I've seen it on occasion on Linux if the first packet isn't sent immediately after the connection is opened (due to the select() it can take 100-200ms between the connection being opened and the first packet).
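
A minimal sketch of that suggestion, using a stand-in socket so it runs without a network. send_first_packet and ResettingSocket are hypothetical names; the only idea taken from the thread is that the traceback is printed solely when verbose is set.

```python
import traceback

def send_first_packet(sock, data, verbose=False):
    """Try to send; return False on a connection error instead of raising."""
    try:
        sock.sendall(data)
        return True
    except OSError:
        # ConnectionResetError (e.g. WinError 10054) is a subclass of OSError.
        if verbose:
            traceback.print_exc()
        return False

class ResettingSocket:
    """Stand-in socket whose peer has already closed the connection."""
    def sendall(self, data):
        raise ConnectionResetError("connection reset by peer")

print(send_first_packet(ResettingSocket(), b"\x00"))  # False
```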