Ankermgmt / ankermake-m5-protocol

Ankermake M5 protocol specifications and libraries *NOT AFFILIATED WITH ANKER*
GNU General Public License v3.0
160 stars 33 forks source link

Video streams break after a while (race condition?) #40

Closed billyjbryant closed 1 year ago

billyjbryant commented 1 year ago

Note: This issue detected in both the master and the webserver-improvements branch

Description

Cannot capture video from the AnkerMake M5 using python .\ankerctl.py pppp capture-video output.h264

I have attempted multiple times and I get varying lengths of captured video output but it inevitably will always error out with an error similar to the following:

Traceback (most recent call last):
  File "C:\Users\billy\Projects\ankermake-m5-protocol\ankerctl.py", line 517, in <module>
    main()
  File "c:\python311\Lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\click\decorators.py", line 84, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\billy\Projects\ankermake-m5-protocol\ankerctl.py", line 277, in pppp_capture_video
    d = api.recv_xzyh(chan=1)
        ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\billy\Projects\ankermake-m5-protocol\libflagship\ppppapi.py", line 347, in recv_xzyh
    xzyh = Xzyh.parse(fd.read(16))[0]
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\billy\Projects\ankermake-m5-protocol\libflagship\pppp.py", line 282, in parse
    magic, p = Magic.parse(p, 4, b'XZYH')
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\billy\Projects\ankermake-m5-protocol\libflagship\amtypes.py", line 70, in parse
    _assert_equal(v, expected)
  File "C:\Users\billy\Projects\ankermake-m5-protocol\libflagship\amtypes.py", line 15, in _assert_equal
    raise ValueError(f"expected {expected} but found {value}")
ValueError: expected b'XZYH' but found b'\x94\xca\xabq'

The same happens when attempting to view the webserver video stream:

Expand to see error ```python ankermake-m5-protocol on  chrivers/webserver-improvements [?] via 🐍 v3.11.0 on ☁️ (us-west-2) ❯ python .\ankerctl.py webserver run WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead. * Serving Flask app 'web' * Debug mode: off [*] WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead. * Running on http://127.0.0.1:4470 [*] Press CTRL+C to quit [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET / HTTP/1.1" 200 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/css/bootstrap.min.css HTTP/1.1" 304 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/vendor/cash.min.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/vendor/jmuxer.min.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/ankersrv.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/img/setup/prusaslicer-1.png HTTP/1.1" 304 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/img/setup/prusaslicer-3.png HTTP/1.1" 304 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/img/setup/prusaslicer-4.png HTTP/1.1" 304 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/js/bootstrap.bundle.min.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [07/Apr/2023 17:10:45] "GET /static/img/setup/prusaslicer-2.png HTTP/1.1" 304 - [*] MqttQueue: Requesting start [*] MqttQueue: Starting worker [*] VideoQueue: Requesting start [*] VideoQueue: Starting worker [*] MqttQueue worker start [*] Connecting to make-mqtt.ankermake.com [*] VideoQueue worker start [*] Trying connect over pppp [*] Established pppp connection [*] VideoQueue: Worked started [*] Connected to mqtt [*] MqttQueue: Worked started Exception in thread VideoQueue: Traceback (most recent call last): File "c:\python311\Lib\threading.py", line 1038, in _bootstrap_inner self.run() File "C:\Users\billy\Projects\ankermake-m5-protocol\web\__init__.py", line 95, in run self.worker_run(timeout=0.3) File "C:\Users\billy\Projects\ankermake-m5-protocol\web\__init__.py", line 198, in worker_run d = self.api.recv_xzyh(chan=1, timeout=timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\billy\Projects\ankermake-m5-protocol\libflagship\ppppapi.py", line 365, in recv_xzyh xzyh = Xzyh.parse(hdr)[0] ^^^^^^^^^^^^^^^ File "C:\Users\billy\Projects\ankermake-m5-protocol\libflagship\pppp.py", line 282, in parse magic, p = Magic.parse(p, 4, b'XZYH') ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\billy\Projects\ankermake-m5-protocol\libflagship\amtypes.py", line 70, in parse _assert_equal(v, expected) File "C:\Users\billy\Projects\ankermake-m5-protocol\libflagship\amtypes.py", line 15, in _assert_equal raise ValueError(f"expected {expected} but found {value}") ValueError: expected b'XZYH' but found b'\x05o\xb6e' [*] TOPIC [/phone/maker/AK7ZRG0C35700749/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700749/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700749/notice] ```

System Information:

Expand Python info... ```text ❯ python -V Python 3.11.0 ❯ pip list Package Version ------------------ --------- certifi 2022.12.7 charset-normalizer 3.1.0 click 8.1.3 colorama 0.4.6 crcmod 1.7 Flask 2.2.0 flask-sock 0.6.0 h11 0.14.0 idna 3.4 itsdangerous 2.1.2 Jinja2 3.1.2 markdown-it-py 2.2.0 MarkupSafe 2.1.2 mdurl 0.1.2 paho-mqtt 1.6.1 pip 22.3 platformdirs 3.1.1 pycryptodomex 3.17 Pygments 2.14.0 requests 2.28.2 rich 13.3.1 setuptools 65.5.0 simple-websocket 0.9.0 tinyec 0.4.0 tqdm 4.65.0 transwarp 0.12.0 urllib3 1.26.15 Werkzeug 2.2.3 wsproto 1.2.0
Expand Windows info... ```text ❯ C:\Windows\System32\systeminfo Host Name: JOVIAN OS Name: Microsoft Windows 11 Home OS Version: 10.0.22623 N/A Build 22623 OS Manufacturer: Microsoft Corporation OS Configuration: Standalone Workstation OS Build Type: Multiprocessor Free Original Install Date: 5/14/2022, 9:28:05 PM System Boot Time: 4/7/2023, 11:58:04 AM System Manufacturer: Alienware System Model: Alienware m15 R6 System Type: x64-based PC Processor(s): 1 Processor(s) Installed. [01]: Intel64 Family 6 Model 141 Stepping 1 GenuineIntel ~2304 Mhz BIOS Version: Alienware 1.20.0, 2/6/2023 Total Physical Memory: 32,515 MB Available Physical Memory: 16,913 MB Virtual Memory: Max Size: 40,707 MB Virtual Memory: Available: 19,375 MB Virtual Memory: In Use: 21,332 MB Page File Location(s): C:\pagefile.sys Domain: WORKGROUP Hotfix(s): 6 Hotfix(s) Installed. [01]: KB5022497 [02]: KB5007297 [03]: KB5012170 [04]: KB5018863 [05]: KB5022914 [06]: KB5022610 Network Card(s): 3 NIC(s) Installed. [01]: Killer(R) Wi-Fi 6 AX1650i 160MHz Wireless Network Adapter (201NGW) Connection Name: Wi-Fi DHCP Enabled: Yes DHCP Server: 192.168.4.1 IP address(es) [01]: 192.168.4.102 [02]: fe80::32d5:30e4:1cc3:aa07 [02]: Killer E2600 Gigabit Ethernet Controller Connection Name: Ethernet Status: Media disconnected [03]: Bluetooth Device (Personal Area Network) Connection Name: Bluetooth Network Connection Status: Media disconnected Hyper-V Requirements: A hypervisor has been detected. Features required for Hyper-V will not be displayed. ```
spuder commented 1 year ago

Thanks for reporting. This is helpful. @chrivers thoughts?

I did some testing and I ran into an almost identical error (on a different commit 0043361f8a02bd023005353b601c5485519f9a8d)

Exception in thread VideoQueue:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 95, in run
    self.worker_run(timeout=0.3)
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 198, in worker_run
    d = self.api.recv_xzyh(chan=1, timeout=timeout)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/ppppapi.py", line 365, in recv_xzyh
    xzyh = Xzyh.parse(hdr)[0]
           ^^^^^^^^^^^^^^^
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/pppp.py", line 282, in parse
    magic, p = Magic.parse(p, 4, b'XZYH')
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 70, in parse
    _assert_equal(v, expected)
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 15, in _assert_equal
    raise ValueError(f"expected {expected} but found {value}")
ValueError: expected b'XZYH' but found b'h\x97\x1a\x80'
chrivers commented 1 year ago

Hmm, this is weird indeed. We're somehow getting ourselves desynchronized with the video stream, when capturing.

What's worse, it's not happening to me... which makes it a little harder to debug :sweat_smile:

I'm a bit busy today with personal stuff, but I'll try to whip up a branch which can capture some of the raw packet data for further analysis. I'm sure there's a reasonable explanation for what is going on here.

chrivers commented 1 year ago

I'm making a new feature for capturing traffic, as a debug option. This should help figure out what's going on. stay tuned :)

chrivers commented 1 year ago

I've added a ./ankerctl.py --pppp-dump pppp.log .... option to chrivers/webserver-improvements (since that has the most recent video streaming code)

Can both of you please try to reproduce the error, capturing a logfile that captures the problem?

The logfile is opened in append mode, so you can run it multiple times in a row without issues.

The size can grow quite quickly, especially when capturing video, since the raw data is hex-encoded - so please compress the log before sending them.

chrivers commented 1 year ago

I forgot to mention something!

The logfile is somewhat sensitive, since it contains the master key for pppp access to the printer. Consider taking reasonable precautions. For example, feel free to send directly to my email: chrivers@gmail.com, or uploading an encrypted .zip, and sending the password through a non-public channel.

Or, encrypt to my GPG pubkey, and send the result anywhere:

-----BEGIN PGP PUBLIC KEY BLOCK-----

mQENBFjSQRcBCAC3rf9KLFbPlptZyPo4pBXBxtrX1RcqcGS/cVOudJE2Uj1V10bd
VEyXTvapg4Kd2KhHLvKstANOwMaiMqb1vhN/6t8b5ncsyygAe8jX/3gx6Tnh6m+p
kKQAN+kDffO0zUpQN7HTRMOcSM4DMvpXkYnuyHcScAXrBiD+lehiMuGNXmlQRIKa
7Lnl0126fRvo1gzCCV2k9H4KGCDCEsaQ6RWHasEvllxpaM4H//HqpwPZp8UzZgem
J12J8xTDoDrmVAUR5vGG/vs0FDszpVqA3Qqyk2WUNWK8gtOlUx/tkV5zKH7NidHf
E+8LhvDHal8KxUQ1kMtPtkOMSCcOBfLVM4EDABEBAAG0I0NocmlzdGlhbiBJdmVy
c2VuIDxjaUBpdmVyc2VuaXQuZGs+iQE5BBMBCAAjBQJY0kEXAhsDBwsJCAcDAgEG
FQgCCQoLBBYCAwECHgECF4AACgkQnQubvtzCS7xfJwf9HFJQP1FemdPAGN5m4jOk
YwuH+sUI43HQ33/iySJWyMmpXVOvgIglzlZK72+TiQpuAVFErFiIGl6Jnc+ntiSK
S2telio1NBPMA9CXwmtZTl++cPgnkpifajJur9S3dE/8WkmYxmiQocZ7A/6fDnRz
aTwAFmqy7Zkuq9QTn9/xH8B5gLw3Zgv2b4wgxXLukaW/QsBiGhz+ph4J1oTbzFiG
E3PQS215t85TZJfaXGzVBgDaNmV1YVunj49TNnjNBKqTo9nuwesKuiE6ExpjsSV7
sNVFChbAv+/qpfZz3/BtDTdsR4Krd/t10GgsFAEUiW+UmR9dPtVvUci9WclVK+6X
wrkBDQRY0kEXAQgAoMtMg807n+YmJmrqeIIoRA7afT5JGHm/VvUHn4GIAc3Y6SoD
YnwrDuhYsph5ItbCGxzlRUfal/DZJf7g0vngbD1R0o5vxjaj3ucO0b1RjKVrcb4S
5VH8TMSYV5yljxOUavk+2yrorZd6nV5a9XLVRS4uKeWqKOBGxy9yIWbWVJO/PIt/
YpD+35gTtiVnUz3izYry6daZRmpkJ3jf3ss8+B+3LnI7gOGEdMW4bvCc4UYKXSWh
es2f1hUHu6Admcj8d9sSb810GFt4S1pYrJCycTCXesbUFQQJ5n5eQtdmPN/XGIhP
4G9QgOfbHKS/oJ+K1CsZg0ssZPgotOCfE92g/QARAQABiQEfBBgBCAAJBQJY0kEX
AhsMAAoJEJ0Lm77cwku83KEH/1u/uDk6EjVeJzhNkxGX7Zy9MJHsTHVp1SUqFVzI
KjETlbxLFeVBOFlJXkxFyx6Mm4tTr2u6HVZ+ph1CpOiYeQU0dUp/gTF/9IG7kH9I
GB1xZvXh6A2ipieRPijSrR90Od9aQCM3gqpa5u+kHenLDLsCnbsVhDES75WYsvls
P/qwhXGExftgPqi/3nuHPqyOBROprx6PKrJyZkV+dyLw0katxK8KFqH7PtEdyxQx
NpxC3gAuY9DC8HxDBucbtUNrTt0DuW4s5hhcdhA2Qesn9jEkWTJ+djiUmloR0yxt
bnLAygQOux5N81txUqwU2AqUgW6J9D6+f2WWIdu4MWdeJoM=
=uGLO
-----END PGP PUBLIC KEY BLOCK-----
spuder commented 1 year ago

I've reproduced this while running with debug logs. I've compressed and encrypted the logs and sent them out of band for review.

chrivers commented 1 year ago

I think I have narrowed down the cause, but the mechanism of this error is still somewhat complex.

More research needed :D

spuder commented 1 year ago

I ran another test to try and tell if this is caused by bad wifi. I did not have the full logging running at the time, but I'm still sharing because there is an interesting pattern.

You'll notice a bunch of 'Dropping old packet' messages early in the stream. Those messages reduce in frequency after a short period before the crash happens.

logs

```bash * Running on http://127.0.0.1:4470 [*] Press CTRL+C to quit [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET / HTTP/1.1" 200 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/css/bootstrap.min.css HTTP/1.1" 200 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/img/setup/prusaslicer-1.png HTTP/1.1" 200 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/vendor/jmuxer.min.js HTTP/1.1" 200 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/vendor/cash.min.js HTTP/1.1" 200 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/js/bootstrap.bundle.min.js HTTP/1.1" 200 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/img/setup/prusaslicer-2.png HTTP/1.1" 200 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/img/setup/prusaslicer-3.png HTTP/1.1" 304 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/ankersrv.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [09/Apr/2023 19:19:52] "GET /static/img/setup/prusaslicer-4.png HTTP/1.1" 200 - [*] MqttQueue: Requesting start [*] MqttQueue: Starting worker [*] MqttQueue worker start [*] VideoQueue: Requesting start [*] VideoQueue: Starting worker [*] VideoQueue worker start [*] Connecting to make-mqtt.ankermake.com [*] Trying connect over pppp [*] Connected to mqtt [*] MqttQueue: Worked started [E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] VideoQueue worker start [*] Trying connect over pppp [E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second. [*] VideoQueue worker start [*] Trying connect over pppp [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second. [*] VideoQueue worker start [*] Trying connect over pppp [E] CLOSE [E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second. [*] VideoQueue worker start [*] Trying connect over pppp [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [E] CLOSE [E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second. [*] VideoQueue worker start [*] Trying connect over pppp [E] CLOSE [E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second. [*] VideoQueue worker start [*] Trying connect over pppp [E] CLOSE [*] Established pppp connection [*] VideoQueue: Worked started [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 78 while expecting 199. [W] Dropping old packet: index 1109 while expecting 1230. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 1359 while expecting 1470. [W] Dropping old packet: index 1360 while expecting 1470. [W] Dropping old packet: index 1361 while expecting 1470. [W] Dropping old packet: index 1362 while expecting 1470. [W] Dropping old packet: index 1363 while expecting 1470. [W] Dropping old packet: index 1562 while expecting 1665. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 2418 while expecting 2528. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2418 while expecting 2539. [W] Dropping old packet: index 2787 while expecting 2897. [W] Dropping old packet: index 2941 while expecting 3049. [W] Dropping old packet: index 3541 while expecting 3650. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 3542 while expecting 3650. [W] Dropping old packet: index 3541 while expecting 3658. [W] Dropping old packet: index 3542 while expecting 3658. [W] Dropping old packet: index 3541 while expecting 3658. [W] Dropping old packet: index 3542 while expecting 3658. [W] Dropping old packet: index 3541 while expecting 3658. [W] Dropping old packet: index 3542 while expecting 3658. [W] Dropping old packet: index 3541 while expecting 3658. [W] Dropping old packet: index 3542 while expecting 3658. [W] Dropping old packet: index 3541 while expecting 3658. [W] Dropping old packet: index 3542 while expecting 3658. [W] Dropping old packet: index 3541 while expecting 3658. [W] Dropping old packet: index 3542 while expecting 3658. [W] Dropping old packet: index 3541 while expecting 3658. [W] Dropping old packet: index 3541 while expecting 3659. [W] Dropping old packet: index 3542 while expecting 3659. [W] Dropping old packet: index 4603 while expecting 4724. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 4734 while expecting 4838. [W] Dropping old packet: index 4734 while expecting 4853. [W] Dropping old packet: index 4734 while expecting 4853. [W] Dropping old packet: index 4734 while expecting 4854. [W] Dropping old packet: index 4734 while expecting 4854. [W] Dropping old packet: index 4734 while expecting 4855. [W] Dropping old packet: index 5154 while expecting 5267. [W] Dropping old packet: index 5614 while expecting 5720. [W] Dropping old packet: index 5615 while expecting 5720. [W] Dropping old packet: index 5616 while expecting 5720. [W] Dropping old packet: index 5617 while expecting 5720. [W] Dropping old packet: index 5618 while expecting 5720. [W] Dropping old packet: index 5619 while expecting 5720. [W] Dropping old packet: index 5774 while expecting 5881. [W] Dropping old packet: index 5775 while expecting 5881. [W] Dropping old packet: index 5774 while expecting 5881. [W] Dropping old packet: index 5775 while expecting 5881. [W] Dropping old packet: index 5774 while expecting 5881. [W] Dropping old packet: index 5775 while expecting 5881. [W] Dropping old packet: index 5774 while expecting 5881. [W] Dropping old packet: index 5775 while expecting 5881. [W] Dropping old packet: index 5774 while expecting 5881. [W] Dropping old packet: index 5775 while expecting 5881. [W] Dropping old packet: index 5774 while expecting 5881. [W] Dropping old packet: index 5775 while expecting 5881. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 6033 while expecting 6150. [W] Dropping old packet: index 6909 while expecting 7027. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 6910 while expecting 7027. [W] Dropping old packet: index 6911 while expecting 7027. [W] Dropping old packet: index 6909 while expecting 7028. [W] Dropping old packet: index 6910 while expecting 7028. [W] Dropping old packet: index 6911 while expecting 7028. [W] Dropping old packet: index 6909 while expecting 7028. [W] Dropping old packet: index 6910 while expecting 7028. [W] Dropping old packet: index 6911 while expecting 7028. [W] Dropping old packet: index 6909 while expecting 7028. [W] Dropping old packet: index 6910 while expecting 7028. [W] Dropping old packet: index 6911 while expecting 7028. [W] Dropping old packet: index 6909 while expecting 7028. [W] Dropping old packet: index 6910 while expecting 7028. [W] Dropping old packet: index 6911 while expecting 7028. [W] Dropping old packet: index 6909 while expecting 7028. [W] Dropping old packet: index 6910 while expecting 7028. [W] Dropping old packet: index 6911 while expecting 7028. [W] Dropping old packet: index 6909 while expecting 7028. [W] Dropping old packet: index 6909 while expecting 7029. [W] Dropping old packet: index 6910 while expecting 7029. [W] Dropping old packet: index 6911 while expecting 7029. [W] Dropping old packet: index 6909 while expecting 7030. [W] Dropping old packet: index 6909 while expecting 7030. [W] Dropping old packet: index 6910 while expecting 7030. [W] Dropping old packet: index 6911 while expecting 7030. [W] Dropping old packet: index 7990 while expecting 8111. [W] Dropping old packet: index 7990 while expecting 8111. [W] Dropping old packet: index 7990 while expecting 8111. [W] Dropping old packet: index 7990 while expecting 8111. [W] Dropping old packet: index 7990 while expecting 8111. [W] Dropping old packet: index 7990 while expecting 8111. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 8342 while expecting 8451. [W] Dropping old packet: index 8777 while expecting 8883. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 8919 while expecting 9031. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [W] Dropping old packet: index 8919 while expecting 9040. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 10943 while expecting 11058. [W] Dropping old packet: index 11075 while expecting 11187. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 26152 while expecting 26273. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 27307 while expecting 27428. [W] Dropping old packet: index 27726 while expecting 27847. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 32464 while expecting 32585. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 33249 while expecting 33370. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 41913 while expecting 42034. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 43493 while expecting 43607. [W] Dropping old packet: index 43493 while expecting 43607. [W] Dropping old packet: index 43493 while expecting 43607. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 58922 while expecting 59043. [W] Dropping old packet: index 59325 while expecting 59446. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 60508 while expecting 60629. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 63677 while expecting 63798. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 128 while expecting 249. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 5454 while expecting 5555. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 15587 while expecting 15708. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 19940 while expecting 20061. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 21528 while expecting 21649. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 31395 while expecting 31515. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 36909 while expecting 37030. [W] Dropping old packet: index 37344 while expecting 37445. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 41241 while expecting 41362. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 42812 while expecting 42933. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 43990 while expecting 44111. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 50327 while expecting 50448. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 54256 while expecting 54377. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [W] Dropping old packet: index 172 while expecting 290. [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] Exception in thread VideoQueue: Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1038, in _bootstrap_inner self.run() File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 95, in run self.worker_run(timeout=0.3) File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 198, in worker_run d = self.api.recv_xzyh(chan=1, timeout=timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/ppppapi.py", line 373, in recv_xzyh xzyh = Xzyh.parse(hdr)[0] ^^^^^^^^^^^^^^^ File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/pppp.py", line 282, in parse magic, p = Magic.parse(p, 4, b'XZYH') ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 70, in parse _assert_equal(v, expected) File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 15, in _assert_equal raise ValueError(f"expected {expected} but found {value}") ValueError: expected b'XZYH' but found b'1i\x8c5' [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] ```

chrivers commented 1 year ago

This is very interesting! Thanks for including it!

First of, some of these warnings are completely benign, like this one:

[W] Dropping old packet: index 78 while expecting 199.

That just means we are getting a retransmission of a single packet we have already seen a while ago. It's completely expected that this happens from time to time, since udp is not a reliable protocol (even more so over wifi).

However, this very long passage:

...

[W] Dropping old packet: index 8919 while expecting 9040.
[W] Dropping old packet: index 8919 while expecting 9040.
[W] Dropping old packet: index 8919 while expecting 9040.
...

indicates something very strange going on.

You see, after setting up the connection, and doing the fundamental handshaking, there are mostly 2 types of important packets: DRW and DRW_ACK. The DRW packet is for "Device Read/Write" as far as I can piece together (and not "DRAW", which I keep reading it as), while DRW_ACK is a packet indicating the successful receipt of an earlier DRW packet.

Each end of the connection will send DRW to the other side, and expect to receive DRW_ACK messages accordingly.

So for this to work, each side will also send DRW_ACK packets corresponding to all received DRW packets.

The DRW_ACK packet can bundle many ACK numbers together in a single bundle, which is not a feature we use (yet). However, we do unconditionally send a DRW_ACK packet for every incoming DRW packet.

So for your situation to happen, the printer is convinced that you have not yet received packet 8919, and keeps resending it over and over. And every time you receive it, you send a DRW_ACK back to the printer, but something about this clearly doesn't work as intended.

So there could be an unusual problem with your wifi. Perhaps these very small DRW_ACK packets (which are always transmitted immediately after an incoming ACK packet) are somehow dropped or lost.

I have to say again, the number of retransmissions is highly surprising. Even with flaky wifi, I wouldn't expect this many.

One other problem, which may or may not be related, is something I'm currently working on a robust solution for; counter wrap-around.

The index counter used for DRW and DRW_ACK is an unsigned 16-bit number, so it goes to 0xFFFF (65536) before wrapping back to 0.

Since we use this number to keep track of what the next expected packet is, this does cause problems on long-running transmissions. Especially for data-heavy things like video streams.

But since we have also (at least ocassionally) seen this error on very short streams, that doesn't seem to be the full explanation.

chrivers commented 1 year ago

@spuder, @billyjbryant

I have implemented improved logic for dealing with wrapping packet counters when streaming data.

This may or may not help with the issue you are seeing, but I would appreciate if you could test again. Any input is appreciated, so we can get to the bottom of this. At least we have (hopefully) eliminated one source of problems in this version.

Like before, this fix is only in the experimental chrivers/webserver-improvements branch

billyjbryant commented 1 year ago

@spuder, @billyjbryant

I have implemented improved logic for dealing with wrapping packet counters when streaming data.

This may or may not help with the issue you are seeing, but I would appreciate if you could test again. Any input is appreciated, so we can get to the bottom of this. At least we have (hopefully) eliminated one source of problems in this version.

Like before, this fix is only in the experimental chrivers/webserver-improvements branch

@chrivers I'm dealing with bouts of weakness from COVID, but I will check it out as soon as I can and let you know.

spuder commented 1 year ago

It's looking better (less dropping old packet warnings), but I was still able to reproduce after just a few minutes. I wasn't running detailed logging at the time. I can try another test with the full debug logs

Running on commit 80d4c2d565cfbcfff379c10e3d9b617a33928002

I refreshed the page and got the error

  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/simple_websocket/ws.py", line 95, in send
    self.sock.send(out_data)`

I few seconds later the main issue came back

  File "/Users/spencer.owen/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 15, in _assert_equal
    raise ValueError(f"expected {expected} but found {value}")
ValueError: expected b'XZYH' but found b'\x05\xff\xd1\xd6'
logs

```bash [*] MqttQueue: Worked started [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET / HTTP/1.1" 200 - [E] Exception on /ws/video [GET] Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/flask/app.py", line 2463, in wsgi_app response = self.full_dispatch_request() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/flask/app.py", line 1760, in full_dispatch_request rv = self.handle_user_exception(e) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/flask/app.py", line 1758, in full_dispatch_request rv = self.dispatch_request() ^^^^^^^^^^^^^^^^^^^^^^^ File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/flask/app.py", line 1734, in dispatch_request return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/flask_sock/__init__.py", line 62, in websocket_route f(ws, *args, **kwargs) File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 237, in video sock.send(data) File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/simple_websocket/ws.py", line 95, in send self.sock.send(out_data) BrokenPipeError: [Errno 32] Broken pipe [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /ws/video HTTP/1.1" 500 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/css/bootstrap.min.css HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/img/setup/prusaslicer-1.png HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/js/bootstrap.bundle.min.js HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/vendor/cash.min.js HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/vendor/jmuxer.min.js HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/ankersrv.js HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/img/setup/prusaslicer-2.png HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/img/setup/prusaslicer-3.png HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /static/img/setup/prusaslicer-4.png HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:38:20] "GET /favicon.ico HTTP/1.1" 404 - [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] Exception in thread VideoQueue: Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1038, in _bootstrap_inner self.run() File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 95, in run self.worker_run(timeout=0.3) File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 198, in worker_run d = self.api.recv_xzyh(chan=1, timeout=timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/ppppapi.py", line 375, in recv_xzyh xzyh = Xzyh.parse(hdr)[0] ^^^^^^^^^^^^^^^ File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/pppp.py", line 282, in parse magic, p = Magic.parse(p, 4, b'XZYH') ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 70, in parse _assert_equal(v, expected) File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 15, in _assert_equal raise ValueError(f"expected {expected} but found {value}") ValueError: expected b'XZYH' but found b'\x05\xff\xd1\xd6' [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET / HTTP/1.1" 200 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/css/bootstrap.min.css HTTP/1.1" 304 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/img/setup/prusaslicer-1.png HTTP/1.1" 304 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/vendor/cash.min.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/vendor/jmuxer.min.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/ankersrv.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/js/bootstrap.bundle.min.js HTTP/1.1" 304 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/img/setup/prusaslicer-2.png HTTP/1.1" 304 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/img/setup/prusaslicer-3.png HTTP/1.1" 304 - [*] 127.0.0.1 - - [10/Apr/2023 09:39:08] "GET /static/img/setup/prusaslicer-4.png HTTP/1.1" 304 - [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] [*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice] ```

chrivers commented 1 year ago

Thank you for testing - I added an experimental commit that might be the solution. Can you try again?

spuder commented 1 year ago

I caught it again, this time with debug logging running. I can send the logs out-of-band.

About 80% of the time I catch this, it happens immediately or shortly after I refresh the page. This time it was immediately after a page refresh.

[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
Exception in thread VideoQueue:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 95, in run
    self.worker_run(timeout=0.3)
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 198, in worker_run
    d = self.api.recv_xzyh(chan=1, timeout=timeout)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/ppppapi.py", line 377, in recv_xzyh
    xzyh = Xzyh.parse(hdr)[0]
           ^^^^^^^^^^^^^^^
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/pppp.py", line 282, in parse
    magic, p = Magic.parse(p, 4, b'XZYH')
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 70, in parse
    _assert_equal(v, expected)
  File "/Users/spuder/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 15, in _assert_equal
    raise ValueError(f"expected {expected} but found {value}")
ValueError: expected b'XZYH' but found b'\xdf\xde\xf5F'
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET / HTTP/1.1" 200 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/css/bootstrap.min.css HTTP/1.1" 304 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/img/setup/prusaslicer-1.png HTTP/1.1" 304 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/vendor/cash.min.js HTTP/1.1" 304 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/js/bootstrap.bundle.min.js HTTP/1.1" 304 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/vendor/jmuxer.min.js HTTP/1.1" 304 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/ankersrv.js HTTP/1.1" 304 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/img/setup/prusaslicer-2.png HTTP/1.1" 304 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/img/setup/prusaslicer-3.png HTTP/1.1" 304 -
[*] 127.0.0.1 - - [10/Apr/2023 10:31:17] "GET /static/img/setup/prusaslicer-4.png HTTP/1.1" 304 -
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
[*] TOPIC [/phone/maker/AK7ZRG0C35700080/notice]
chrivers commented 1 year ago

Just an update on this. I've replayed countless logs, and also managed to reproduce it myself, after letting it run for quite a while.

Absolutely none of the replayed sessions demonstrate this problem.

I'm quite sure it has to do with a race condition between different threads. Probably one of the several Queue (or Queue-like objects) have an edge condition we are hitting.

I'll do more digging - thank you all for the detailed logs.

chrivers commented 1 year ago

good news everyone

I found the cause of the bug! Here's the function with the error:

    def rx_drw(self, index, data):
        # drop any packets we have already recieved
        if self.rx_ctr > index:
            if self.max_age_warn and (self.rx_ctr - index > self.max_age_warn):
                log.warn(f"Dropping old packet: index {index} while expecting {self.rx_ctr}.")
            return

        # record packet in queue
        self.rxqueue[index] = data

        # recombine data from queue
        while self.rx_ctr in self.rxqueue:
            del self.rxqueue[self.rx_ctr]
            self.rx_ctr += 1
            self.rx.write(data)

The last while loop recombines queued data, as much as possible. However, notice that

            self.rx.write(data)

Always writes the same chunk!

This means that any interruption in steady flow, even a small delay, or a DRW retransmission, can cause data duplication. This then leads to incorrect frame data.

This also explains why I couldn't reproduce the bug using captured data from @spuder and others: when replaying the captured data, delivery was essentially perfect.

This is that last loop, with the fix:


        # recombine data from queue
        while self.rx_ctr in self.rxqueue:
            data = self.rxqueue[self.rx_ctr]
            del self.rxqueue[self.rx_ctr]
            self.rx_ctr += 1
            self.rx.write(data)

The fix is already in the chrivers/webserver-improvements branch.

@spuder @billyjbryant can you try testing again? Now it should be smooth sailing.

spuder commented 1 year ago

🤔 I saw it happen again on commit c108525b7183acbba81a08878370c40eb1c7fdb6

[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET / HTTP/1.1" 200 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/css/bootstrap.min.css HTTP/1.1" 304 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/img/setup/prusaslicer-1.png HTTP/1.1" 304 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/ankersrv.js HTTP/1.1" 304 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/vendor/jmuxer.min.js HTTP/1.1" 304 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/vendor/bootstrap-5.3.0-alpha3-dist/js/bootstrap.bundle.min.js HTTP/1.1" 304 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/img/setup/prusaslicer-2.png HTTP/1.1" 304 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/vendor/cash.min.js HTTP/1.1" 304 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/img/setup/prusaslicer-3.png HTTP/1.1" 304 -
[*] 127.0.0.1 - - [20/Apr/2023 12:50:43] "GET /static/img/setup/prusaslicer-4.png HTTP/1.1" 304 -
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[E] Unexpected exception while running worker
Traceback (most recent call last):
  File "/Users/spude/Code/github/Ankermgmt/ankermake-m5-protocol/web/multiqueue.py", line 77, in run
    self.worker_run(timeout=0.3)
  File "/Users/spude/Code/github/Ankermgmt/ankermake-m5-protocol/web/__init__.py", line 62, in worker_run
    d = self.api.recv_xzyh(chan=1, timeout=timeout)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spude/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/ppppapi.py", line 378, in recv_xzyh
    xzyh = Xzyh.parse(hdr)[0]
           ^^^^^^^^^^^^^^^
  File "/Users/spude/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/pppp.py", line 386, in parse
    magic, p = Magic.parse(p, 4, b'XZYH')
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spude/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 70, in parse
    _assert_equal(v, expected)
  File "/Users/spude/Code/github/Ankermgmt/ankermake-m5-protocol/libflagship/amtypes.py", line 15, in _assert_equal
    raise ValueError(f"expected {expected} but found {value}")
ValueError: expected b'XZYH' but found b'\r&4N'
[W] VideoQueue: Stopping worker due to exception
[*] VideoQueue: Worked stopped
[*] VideoQueue: Starting worker
[*] VideoQueue worker start
[*] Trying connect over pppp
[E] CLOSE
[E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second.
[*] VideoQueue worker start
[*] Trying connect over pppp
[E] CLOSE
[E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second.
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] VideoQueue worker start
[*] Trying connect over pppp
[E] CLOSE
[E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second.
[*] VideoQueue worker start
[*] Trying connect over pppp
[E] CLOSE
[E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second.
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] VideoQueue worker start
[*] Trying connect over pppp
[E] CLOSE
[E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second.
[*] VideoQueue worker start
[*] Trying connect over pppp
[E] CLOSE
[E] VideoQueue: Failed to start worker: Connection rejected by device. Retrying in 1 second.
[*] VideoQueue worker start
[*] Trying connect over pppp
[*] Established pppp connection
[*] VideoQueue: Worker started
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
[*] TOPIC [/phone/maker/REDACTED/notice]
spuder commented 1 year ago

I've captured it 1 more time. It did take closer to an hour. Uploading the logs out of band.

I'll try catching it again, but this time in a bash script that should minimize the amount of excess logging.

#!/bin/bash

# Execute the ankerctl command and redirect its output to a log file
./ankerctl.py -v webserver run &> ~/Desktop/ankerctl.log &

# Get the process ID of the ankerctl command
ANKERCTL_PID=$!

# Wait for the specified string to appear in the log file
tail -f ~/Desktop/ankerctl.log | while read LINE; do
    if echo "$LINE" | grep -q "raise ValueError"; then
        # If the string is found, print the line number and kill the ankerctl process
        LINE_NUM=$(grep -n "raise ValueError" ~/Desktop/ankerctl.log | cut -d ":" -f 1)
        echo "Error found on line $LINE_NUM"
        kill $ANKERCTL_PID
        exit 1
    fi
done

Update: This script doesn't work as intended. Please ignore

chrivers commented 1 year ago

Excellent script - looking forward to seeing what it might find :)

chrivers commented 1 year ago

Okay, I found another bug - this one actually tricky.

I've pushed a fix for it, with a detailed description:

https://github.com/Ankermgmt/ankermake-m5-protocol/commit/f126aff935f3db9fb4957f2c1bacd85b576196fa

(it's in the chrivers/webserver-improvements branch)

@billyjbryant please test again :)

billyjbryant commented 1 year ago

Okay, I found another bug - this one actually tricky.

I've pushed a fix for it, with a detailed description:

https://github.com/Ankermgmt/ankermake-m5-protocol/commit/f126aff935f3db9fb4957f2c1bacd85b576196fa

(it's in the chrivers/webserver-improvements branch)

@billyjbryant please test again :)

I'll test this weekend and let you know the results

chrivers commented 1 year ago

@billyjbryant did you get a chance to test?

So far, every test from @spuder and myself has been succesful. I'm fairly confident the problem has been fixed, but I'm curious to hear from you, before I close the ticket :)

billyjbryant commented 1 year ago

@billyjbryant did you get a chance to test?

@chrivers sorry for the delay, I got to test this today and it seems to be resolved:

ankermake-m5-protocol on  chrivers/webserver-improvements [$⇕] via 🐍 v3.11.2 on ☁️  (us-west-2)
❯ python .\ankerctl.py pppp capture-video output.h264 -m 32M
[*] Trying connect over pppp
[*] Established pppp connection
 16%|██████████▏                                                      | 5.02M/32.0M [00:39<03:33, 133kb/s, size=5.02MB]

I tested multiple times and never saw the debug crash.

billyjbryant commented 1 year ago

I also tested and the webserver video feed was consistent and smooth.

chrivers commented 1 year ago

Niiice!

Fantastic... we can finally put this one to rest :tada: :partying_face:

Thanks @billyjbryant and @spuder for all the help debugging and brainstorming!