kanflo / opendps

Give your DPS5005 the upgrade it deserves
MIT License

Change ESP8266 proxy to use TCP instead of UDP #139

Open m-kozlowski opened 5 years ago

m-kozlowski commented 5 years ago

Transmission to the device seems to work, as the effects are visible on the display (ping, setting values, enabling output...), but receiving ends with either

Error: timeout talking to device 10.1.200.101. or Error: protocol error (-2).

I'm using the "communication version" of the DPS, powering the ESP from the UART header. Running opendps 4f227ecafdd6ecc8b1889dd4688cee899e5f082c

I have tried every combination of:

The results are always similar:

root@nas:/usr/src/opendps/dpsctl# ./dpsctl.py  -b 9600 --ping -v
Communicating with 10.1.200.101
TX  5 bytes [7e 01 10 21 7f]
Error: timeout talking to device 10.1.200.101.
root@nas:/usr/src/opendps/dpsctl# ./dpsctl.py  -b 9600 --ping -v
Communicating with 10.1.200.101
TX  5 bytes [7e 01 10 21 7f]
RX 32 bytes [7e 84 01 3a a4 00 00 00 00 00 ff ff ff ff 00 63 76 00 76 6f 6c 74 61 67 65 00 33 30 30 30 00 63]

Error: protocol error (-2).
root@nas:/usr/src/opendps/dpsctl# ./dpsctl.py  -b 9600 --ping -v
Communicating with 10.1.200.101
TX  5 bytes [7e 01 10 21 7f]
RX 32 bytes [7e 84 01 3a 94 00 00 00 00 00 ff ff ff ff 00 63 76 00 76 6f 6c 74 61 67 65 00 33 30 30 30 00 63]

Error: protocol error (-2).
root@nas:/usr/src/opendps/dpsctl# ./dpsctl.py  -b 9600 --ping -v
Communicating with 10.1.200.101
TX  5 bytes [7e 01 10 21 7f]
RX  6 bytes [7e 81 01 38 88 7f]

Got pong from device
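
The dumps above suggest a simple framing: a 0x7e start byte, the payload, two check bytes, and a 0x7f end byte. Below is a minimal sketch for classifying captured replies under that assumption. It also assumes the check bytes are a big-endian CRC-16/XMODEM over the payload, which matches the 5-byte ping and the 6-byte pong shown above but is inferred from the captures rather than taken from the opendps source. The helper names are hypothetical, and any byte escaping the real protocol may use is ignored.

```python
SOF, EOF = 0x7E, 0x7F  # start/end markers as they appear in the dumps above


def crc16_xmodem(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0 (assumed variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def classify_frame(hex_dump: str) -> str:
    """Classify a space-separated hex dump such as '7e 01 10 21 7f'."""
    frame = bytes.fromhex(hex_dump)
    if len(frame) < 4 or frame[0] != SOF:
        return "no start byte"
    if frame[-1] != EOF:
        return "truncated (no end byte)"
    payload = frame[1:-3]
    received = int.from_bytes(frame[-3:-1], "big")
    return "ok" if crc16_xmodem(payload) == received else "bad checksum"


print(classify_frame("7e 81 01 38 88 7f"))  # the pong above
```

Under these assumptions the 6-byte pong classifies as "ok", while both 32-byte replies end in 0x63 instead of 0x7f and are reported as truncated, which is consistent with the protocol error.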
m-kozlowski commented 5 years ago

I suspect this is caused by UDP's lack of transmission control. Receiving replies to commands issued a while back gave me a hint. I replaced esp8266-proxy with esp-link, changed a few lines in dpsctl.py to make it work over TCP, and everything started to work flawlessly.
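
The changed lines are not shown here. As a rough illustration only, a TCP exchange could look like the sketch below: it sends one frame and keeps reading until the 0x7f end byte seen in the dumps above arrives, so a reply delivered in several pieces is still returned whole. The function names, the chunk size, and the timeout are assumptions, and this is not the actual dpsctl.py change.

```python
import socket

EOF = 0x7F  # frame terminator, as seen in the dumps above


def read_frame(sock: socket.socket) -> bytes:
    """Read from a stream socket until the end-of-frame byte shows up."""
    reply = bytearray()
    while EOF not in reply:
        chunk = sock.recv(64)
        if not chunk:  # peer closed the connection before the frame ended
            raise ConnectionError("connection closed before end of frame")
        reply += chunk
    return bytes(reply)


def tcp_exchange(host: str, port: int, frame: bytes, timeout: float = 2.0) -> bytes:
    """Send one frame over TCP and return the reply (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(frame)
        return read_frame(sock)
```

A call such as `tcp_exchange(host, port, bytes.fromhex("7e 01 10 21 7f"))` would send the ping frame from the dumps above; the host and port depend on how the serial bridge is configured.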

I would be happy to investigate the exact cause of the problem, but I'm out of ideas for now:(

BTW, I'm wondering why you chose UDP over TCP to bridge a serial connection (which is stateful and stream-based, just like TCP).

kanflo commented 5 years ago

That's odd, I have never had any issues with UDP. Someday retransmission should be implemented to take care of this. I chose UDP because it resembled the connectionless serial interface (IIRC)

m-kozlowski commented 5 years ago

I think it's more about ensuring correct datagram order than retransmission, but I'm still too lazy to disassemble my box, reprogram the ESP with the esp8266-proxy firmware and capture network traffic;) IMHO TCP would be a better pick for this application, as we need reliable transfers more than high bandwidth or plenty of simultaneous clients. Implementing any form of data integrity for UDP transfers would be a bit like reinventing the wheel.

However, since it looks like I'm the only person having problems with this so far, I'm not going to push too much;)

If anyone encounters a similar issue, here's my workaround, which doesn't require any modifications to the original code:

  1. The ESP is programmed with vanilla esp-link. The correct baud rate can be selected via the web interface -> uC console (the selection persists across reboots).
  2. A bash wrapper launches socat to simulate a serial device and passes its arguments to the original dpsctl.py in serial mode:
    #!/bin/bash
    # make sure socat is running (single instance!) before launching dpsctl.py
    # dps-01 is the hostname of esp-link
    run-one /usr/bin/socat pty,link=/dev/vdps0,raw,echo=0,ignoreeof  tcp:dps-01:23,keepalive,pf=ip4,connect-timeout=1  &
    # quote "$@" so arguments containing spaces are passed through intact
    DPSIF=/dev/vdps0 /home/manager/utils/dpsctl/dpsctl.py "$@"
kanflo commented 5 years ago

Even if you are the only person in the world seeing this failure it sure means there is something lurking here that needs fixing ;) I will update the issue title to reflect a change from UDP to TCP.

Edit: the '8266 proxy has TFTP FOTA enabled so you should not need to open your box.

m-kozlowski commented 5 years ago
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 -q
Warning: sent command 04, response was 0c.

00:22:10.379850 IP (tos 0x0, ttl 64, id 23997, offset 0, flags [DF], proto UDP (17), length 33)
    nas.57075 > dps-01.waw.empeka.pl.5005: [udp sum ok] UDP, length 5
        0x0000:  4500 0021 5dbd 4000 4011 b007 0a01 0105  E..!].@.@.......
        0x0010:  0a01 1801 def3 138d 000d a2c2 7e04 4084  ............~.@.
        0x0020:  7f                                       .
00:22:10.402073 IP (tos 0x0, ttl 255, id 42, offset 0, flags [none], proto UDP (17), length 34)
    dps-01.waw.empeka.pl.5005 > nas.57075: [udp sum ok] UDP, length 6
        0x0000:  4500 0022 002a 0000 ff11 8e99 0a01 1801  E..".*..........
        0x0010:  0a01 0105 138d def3 000e 8bef 7e8c 014e  ............~..N
        0x0020:  d47f 0000 0000 0000 0000 6da6 3c5d       ..........m.<]
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 -q
Error: protocol error (-2).

00:23:41.411575 IP (tos 0x0, ttl 64, id 24167, offset 0, flags [DF], proto UDP (17), length 33)
    nas.35489 > dps-01.waw.empeka.pl.5005: [udp sum ok] UDP, length 5
        0x0000:  4500 0021 5e67 4000 4011 af5d 0a01 0105  E..!^g@.@..]....
        0x0010:  0a01 1801 8aa1 138d 000d f714 7e04 4084  ............~.@.
        0x0020:  7f                                       .
00:23:41.434425 IP (tos 0x0, ttl 255, id 44, offset 0, flags [none], proto UDP (17), length 60)
    dps-01.waw.empeka.pl.5005 > nas.35489: [udp sum ok] UDP, length 32
        0x0000:  4500 003c 002c 0000 ff11 8e7d 0a01 1801  E..<.,.....}....
        0x0010:  0a01 0105 138d 8aa1 0028 1de7 7e84 018a  .........(..~...
        0x0020:  0500 0c00 0000 ffff ffff 0063 7600 766f  ...........cv.vo
        0x0030:  6c74 6167 6500 3530 3030 0063            ltage.5000.c
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 -q
Error: timeout talking to device 10.1.24.1.

00:24:24.396241 IP (tos 0x0, ttl 64, id 33714, offset 0, flags [DF], proto UDP (17), length 33)
    nas.35440 > dps-01.waw.empeka.pl.5005: [udp sum ok] UDP, length 5
        0x0000:  4500 0021 83b2 4000 4011 8a12 0a01 0105  E..!..@.@.......
        0x0010:  0a01 1801 8a70 138d 000d f745 7e04 4084  .....p.....E~.@.
        0x0020:  7f                                       .

this one is even more interesting

root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 -V
BootDPS GIT Hash: 4f22
OpenDPS GIT Hash: 4f22-dirty
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 -q
Warning: sent command 04, response was 11.
BootDPS GIT Hash: 4f22
OpenDPS GIT Hash: 4f22-dirty

00:38:40.260233 IP (tos 0x0, ttl 64, id 26818, offset 0, flags [DF], proto UDP (17), length 33)
    nas.35701 > dps-01.waw.empeka.pl.5005: [udp sum ok] UDP, length 5
        0x0000:  4500 0021 68c2 4000 4011 a502 0a01 0105  E..!h.@.@.......
        0x0010:  0a01 1801 8b75 138d 000d 34a8 7e11 0210  .....u....4.~...
        0x0020:  7f                                       .
00:38:40.280990 IP (tos 0x0, ttl 255, id 83, offset 0, flags [none], proto UDP (17), length 51)
    dps-01.waw.empeka.pl.5005 > nas.35701: [udp sum ok] UDP, length 23
        0x0000:  4500 0033 0053 0000 ff11 8e5f 0a01 1801  E..3.S....._....
        0x0010:  0a01 0105 138d 8b75 001f 6dae 7e91 0134  .......u..m.~..4
        0x0020:  6632 3202 0034 6632 322d 6469 7274 7900  f22..4f22-dirty.
        0x0030:  468b 7f                                  F..
00:38:43.360511 IP (tos 0x0, ttl 64, id 27154, offset 0, flags [DF], proto UDP (17), length 33)
    nas.57535 > dps-01.waw.empeka.pl.5005: [udp sum ok] UDP, length 5
        0x0000:  4500 0021 6a12 4000 4011 a3b2 0a01 0105  E..!j.@.@.......
        0x0010:  0a01 1801 e0bf 138d 000d a0f6 7e04 4084  ............~.@.
        0x0020:  7f                                       .
00:38:43.379855 IP (tos 0x0, ttl 255, id 84, offset 0, flags [none], proto UDP (17), length 51)
    dps-01.waw.empeka.pl.5005 > nas.57535: [udp sum ok] UDP, length 23
        0x0000:  4500 0033 0054 0000 ff11 8e5e 0a01 1801  E..3.T.....^....
        0x0010:  0a01 0105 138d e0bf 001f 1864 7e91 0134  ...........d~..4
        0x0020:  6632 3202 0034 6632 322d 6469 7274 7900  f22..4f22-dirty.
        0x0030:  468b 7f                                  F..
m-kozlowski commented 5 years ago

next:

root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 -q
Error: protocol error (-2).
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 --ping
Error: timeout talking to device 10.1.24.1.
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 --ping
Error: protocol error (-2).
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 --ping
Error: timeout talking to device 10.1.24.1.
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 --ping
Error: protocol error (-2).
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 --ping
Error: protocol error (-2).
root@nas:/usr/src/opendps# ./dpsctl/dpsctl.py -d 10.1.24.1 --ping
Got pong from device

So it looks like replies longer than 60 bytes (the IP packet length in the capture above, i.e. 32 bytes of UDP payload) are truncated and somehow mess up subsequent communication. After an ESP reset, commands with short replies (ping, version) work just fine.
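
One way a client could tolerate replies that arrive in pieces, or stale bytes left over from an earlier reply, is to buffer incoming data and hand over only complete frames. Below is a minimal sketch, assuming the 0x7e/0x7f framing visible in the captures above and assuming that the bytes missing from a 32-byte reply arrive in a later datagram (the captures do not confirm this). The class name is hypothetical and byte escaping is ignored.

```python
class FrameAssembler:
    """Collects received chunks and yields complete 0x7e ... 0x7f frames."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add one received chunk; return any frames completed by it."""
        self._buf += chunk
        frames = []
        while True:
            start = self._buf.find(b"\x7e")
            if start < 0:
                self._buf.clear()  # only stale bytes, nothing worth keeping
                break
            end = self._buf.find(b"\x7f", start + 1)
            if end < 0:
                del self._buf[:start]  # keep the partial frame for later
                break
            frames.append(bytes(self._buf[start:end + 1]))
            del self._buf[:end + 1]
        return frames


assembler = FrameAssembler()
print(assembler.feed(bytes.fromhex("7e 81 01")))  # incomplete, nothing yet
print(assembler.feed(bytes.fromhex("38 88 7f")))  # completes the pong frame
```

With such a buffer in front of the parser, a reply split across two reads is reassembled, and leftovers that precede the next start byte are dropped instead of being parsed as the answer to a later command.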

// 2019-04-17 I've improvised a simple UDP send/receive test between two hosts placed in the same network configuration as esp-proxy<->management PC (2.4G wlan<>AP<>2.4G wifi bridge<>AP<>wired eth<>pc) to check whether my unusual network configuration somehow messes up UDP transmission, but it doesn't. Everything worked as expected there.

iondulgheru commented 5 years ago

I also receive "Error: timeout talking to device" or "Error: protocol error (-2)", but only on ESP-01 and ESP-02. In this case ping, version, and setting the voltage still work.

On NodeMCU, with the same build, everything seems to work. I also wanted to try an ESP-07, but mine does not connect to WiFi, although flashing succeeds. For all of them I used a separate 3.3V power supply rather than the supply from the UART header, as it doesn't seem to provide enough current.

[root@localhost opendps]# python dpsctl/dpsctl.py -d 192.168.83.104 --version
BootDPS GIT Hash: a0ff
OpenDPS GIT Hash: a0ff
[root@localhost opendps]# cat /etc/redhat-release
Fedora release 29 (Twenty Nine)
[root@localhost opendps]# python --version
Python 2.7.15
[root@localhost opendps]# gcc --version
gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)