Koenkk / Z-Stack-firmware

Compilation instructions and hex files for Z-Stack firmwares
MIT License

Issue with ZZH on ESXi 6.7 with USB passthrough #220

Closed Dinth closed 3 years ago

Dinth commented 4 years ago

I cannot get the ZZH to work on ESXi 6.7 with USB passthrough. After inserting the stick, which I flashed on another computer with the latest firmware (znp_CC26X2R1_LAUNCHXL_tirtos_ccs), dmesg shows:

[ 316.939929] usb 2-2.2: ch341-uart converter now attached to ttyUSB0

That looks fine, but already at this point the znp-uart-test.py script fails:

./znp-uart-test.py
Got 0 bytes in response to PING command: b''
FAIL|Expected length mismatch

Things get worse when I try to run Zigbee2MQTT on this machine: Z2M fails to connect to the adapter and dmesg shows a lot of -110 errors on that interface. I also tried connecting the stick through a powered USB hub; after even more -110 errors the hub simply switched off the port (apparently some kind of failsafe). I have several other USB devices connected to that server (including an RFLink, which uses the same ch341-uart driver) and USB passthrough works perfectly fine with all of them.

Zuse2k commented 3 years ago

Same here with CC26X2R1_20201026.hex. I flashed it a couple of times, and the blinking firmware also seems OK and working, so I assume the stick is fine.

I haven't tried a dedicated bare-metal machine or a Pi so far, but I can't get it working on ESXi with passthrough, neither to a HassOS image nor to a standalone zigbee2mqtt installation on an Ubuntu Server VM.

[  865.554333] ch341-uart ttyUSB0: ch341-uart converter now disconnected from ttyUSB0
[  865.554370] ch341 2-2.1:1.0: device disconnected
[  880.225341] usb 2-2.1: new full-speed USB device number 5 using uhci_hcd
[  880.700928] usb 2-2.1: New USB device found, idVendor=1a86, idProduct=7523, bcdDevice= 2.64
[  880.700930] usb 2-2.1: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[  880.700931] usb 2-2.1: Product: USB Serial
[  880.725821] ch341 2-2.1:1.0: ch341-uart converter detected
[  880.744151] usb 2-2.1: ch341-uart converter now attached to ttyUSB0
[  899.825187] usb 2-2.1: failed to send control message: -110
[  901.829741] usb 2-2.1: failed to send control message: -110
[  973.532112] usb 2-2.1: failed to send control message: -110
[  995.576454] usb 2-2.1: failed to send control message: -110
[  997.581115] usb 2-2.1: failed to send control message: -110
[  999.584096] usb 2-2.1: failed to receive control message: -110
[  999.584120] ch341-uart ttyUSB0: failed to read modem status: -110
[ 1247.837307] usb 2-2.1: failed to send control message: -110
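
For readers comparing logs: on Linux, error -110 is ETIMEDOUT, meaning the USB control transfer to the CH340 timed out. The following is a minimal sketch (not part of the original report) for summarizing such lines from saved dmesg output. The function name `summarize_usb_errors` and the small errno table are illustrative assumptions.

```python
import re
from collections import Counter

# Linux errno values seen in this thread (kernel logs print them negated).
LINUX_ERRNO_NAMES = {5: "EIO", 110: "ETIMEDOUT"}

# Matches lines such as:
#   [  899.825187] usb 2-2.1: failed to send control message: -110
USB_ERROR = re.compile(
    r"usb (?P<port>[\d.-]+): failed to (?P<op>send|receive) "
    r"control message: -(?P<errno>\d+)"
)


def summarize_usb_errors(dmesg_text):
    """Count control-message failures per (port, operation, errno name)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = USB_ERROR.search(line)
        if match is None:
            continue  # not a USB control-message failure
        code = int(match.group("errno"))
        name = LINUX_ERRNO_NAMES.get(code, str(code))
        counts[(match.group("port"), match.group("op"), name)] += 1
    return counts


sample = """\
[  899.825187] usb 2-2.1: failed to send control message: -110
[  901.829741] usb 2-2.1: failed to send control message: -110
[  999.584096] usb 2-2.1: failed to receive control message: -110
[  999.584120] ch341-uart ttyUSB0: failed to read modem status: -110
"""

for key, count in sorted(summarize_usb_errors(sample).items()):
    print(key, count)
```

Feeding it the output of `dmesg` from several affected guests would make it easier to see whether the failures are all timeouts on the same port.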

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

Dinth commented 3 years ago

@Zuse2k do you remember what you did? I'm still struggling with this; I reflashed the stick multiple times and went back to using the ConBee.

marviins87 commented 3 years ago

I've exactly the same issue with my new ZZH stick flashed with CC26X2R1_20201026.hex.

juslex commented 3 years ago

I use zigbee2mqtt in an LXC (Linux container) without USB passthrough.

I don't have this issue anymore.

For Proxmox, use this guide:

https://www.martinellis.me/posts/2020/03/zig-a-zig-ah/

christophebonnay commented 3 years ago

Same issue here with ESXi 7.0.1 (Intel NUC). HA is running in a VM using Docker. Has anyone tried passing through the whole USB controller?

[Wed Feb  3 00:14:26 2021] ch341-uart ttyUSB0: failed to read modem status: -110
[Wed Feb  3 00:15:39 2021] usb 2-2.1: failed to send control message: -110

marviins87 commented 3 years ago

> Same issue here with ESXi 7.0.1 (Intel NUC). Has anyone tried passing through the whole USB controller?
>
> [Wed Feb  3 00:14:26 2021] ch341-uart ttyUSB0: failed to read modem status: -110
> [Wed Feb  3 00:15:39 2021] usb 2-2.1: failed to send control message: -110

In my case this error was solved by disabling zigbee2mqtt in HA. As far as I understand, you cannot have both zigbee2mqtt and ZHA use the same USB controller.

Dinth commented 3 years ago

A new firmware version was released today. Could you guys test whether it helps? (I have lent my stick to a friend since I couldn't use it myself.)

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

Dinth commented 3 years ago

Bumping up to avoid bot

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

Dinth commented 3 years ago

Bump

davidusken commented 3 years ago

After many hours of searching I finally stumbled upon this issue. I have exactly the same problem and have done quite a bit of troubleshooting. Up until now I have been using the CC2531, which has served me well, but with the continuous Zigbee expansion in my home it was time to upgrade to the CC2652R. ESXi recognized it as "QinHeng USB Serial" and it shows up in the guest OS:

ls /dev/serial/by-id/
usb-1a86_USB_Serial-if00-port0

However, the system cannot access it, neither for flashing (Python script) nor from Z2M. I also ran the UART test script, and it fails as well:

python3 znp-uart-test.py
Got 0 bytes in response to PING command: b''
FAIL|Expected length mismatch

Environment:

Troubleshooting:

Error messages same as mentioned above:

[ 1067.071130] usb 2-2.1: failed to receive control message: -110
[ 1870.349380] usb 2-2.1: failed to send control message: -110
[ 1872.351184] usb 2-2.1: failed to send control message: -110
[ 1874.356063] usb 2-2.1: failed to receive control message: -110

Flashed with the latest version using Flash Programmer 2, no errors. FYI @Dinth

If you would like me to test any further please let me know, otherwise I will migrate back to the CC2531 as a temporary solution.

omerk commented 3 years ago

This is most probably not a firmware issue because of the way these sticks are put together. Borrowing this block diagram from the zzh quick start guide:

[image: zzh block diagram]

As far as the wireless MCU (WMCU) and the firmware in question are concerned, they just speak UART: the CC1352/CC2652 series of chips do not have USB (unlike the CC2531) and rely on an external chip for the conversion. In the case of zzh, this is the CH340.

Given that the firmware has no direct control over or interaction with USB at all, and that testing outside the virtualised environment works just fine, this points the finger directly at ESXi, specifically the drivers in use or perhaps the interaction between the host USB controller and the ESXi bypass mechanism. Since some people have managed to get their setup going on ESXi, I am wondering whether we are seeing some sort of ESXi host controller bypass bug here.

Dinth commented 3 years ago

> Given that the firmware has no direct control over or interaction with USB at all, and that testing outside the virtualised environment works just fine, this points the finger directly at ESXi, specifically the drivers in use or perhaps the interaction between the host USB controller and the ESXi bypass mechanism. Since some people have managed to get their setup going on ESXi, I am wondering whether we are seeing some sort of ESXi host controller bypass bug here.

Fair point, but what intrigues me is that the other devices I have that use a CH340/CH341 chip work fine under ESXi. I have also googled for issues with CH340 support on ESXi and haven't found any other reports of CH340 devices not working under ESXi.

omerk commented 3 years ago

That is indeed curious and makes me think that maybe there are different versions of CH340 that work differently with the drivers ESXi provides. It's not uncommon that these chips have different revisions. Then again, given that there are working installations out there this is not a strong theory.

A bit of a tall order but wondering if there is any chance you can try and replicate this on another system that has different USB host controllers. This is essentially the USB on the motherboard of the host. Alternatively, if folks having this issue can report what they have here we can see if there is a pattern.

davidusken commented 3 years ago

Definitely sounds like an ESXi 6.7 issue yes. I think we should do some more testing on 7.0, do you still have this issue @christophebonnay? Also I found this, might be somewhat relevant:

I'm aware ESXi is not based on the Linux kernel anymore, but it might be worth being aware of here. I have been able to confirm that people use passthrough on Proxmox (kernel 5.3 atm) with no issues at all.

For testing on other systems, I have 2x Dell r210ii servers (ESXi 6.7) and will try to replicate the issue on them as well. If this does not work I might be able to get my hands on ESXi 7.0.

Edit: Also somewhat relevant https://github.com/vmware/photon/issues/1027

omerk commented 3 years ago

Another much simpler variable to check is the OS or more specifically the kernel version you are running for the guest/VM as well. This might actually be the easiest starting point.

Dinth commented 3 years ago

Debian 10 (buster), kernel 4.19.0-14-amd64. Unfortunately, I don't have another host I could use to test it on a different motherboard.

davidusken commented 3 years ago

The different kernels I tried before were on HA OS (kernel 5.4.99) and Debian 10 (buster) 4.19.0-8-amd64.

Test on different host (Dell R210ii, ESXI 6.7) VM: Debian 10 (buster) 4.19.0-8-amd64

python3 -m zigpy_znp.tools.network_backup /dev/serial/by-id/usb-1a86_USB_Serial-if00-port0 -o network_backup.json
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial/serialposix.py", line 322, in open
    self.fd = os.open(self.portstr, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
OSError: [Errno 5] Input/output error: '/dev/serial/by-id/usb-1a86_USB_Serial-if00-port0'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/zigpy_znp/tools/network_backup.py", line 107, in <module>
    asyncio.run(main(sys.argv[1:]))  # pragma: no cover
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/asyncio/base_events.py", line 612, in run_until_complete
    return future.result()
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/zigpy_znp/tools/network_backup.py", line 99, in main
    backup_obj = await backup_network(
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/zigpy_znp/tools/network_backup.py", line 23, in backup_network
    await znp.connect()
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/zigpy_znp/api.py", line 219, in connect
    self._uart = await uart.connect(self._config[conf.CONF_DEVICE], self)
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/zigpy_znp/uart.py", line 158, in connect
    transport, protocol = await serial_asyncio.create_serial_connection(
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial_asyncio/__init__.py", line 445, in create_serial_connection
    serial_instance = serial.serial_for_url(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial/__init__.py", line 90, in serial_for_url
    instance.open()
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial/serialposix.py", line 325, in open
    raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
serial.serialutil.SerialException: [Errno 5] could not open port /dev/serial/by-id/usb-1a86_USB_Serial-if00-port0: [Errno 5] Input/output error: '/dev/serial/by-id/usb-1a86_USB_Serial-if00-port0'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "znp-uart-test.py", line 28, in <module>
    ser = serial.Serial("/dev/ttyUSB0", 115200, timeout=2)
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial/serialutil.py", line 244, in __init__
    self.open()
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial/serialposix.py", line 325, in open
    raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
serial.serialutil.SerialException: [Errno 5] could not open port /dev/ttyUSB0: [Errno 5] Input/output error: '/dev/ttyUSB0'

- Rebooting the host to try again, same thing happens:

First it passes:

python znp-uart-test.py
Got 7 bytes in response to PING command: b'\xfe\x02a\x01Y\x06='
PASS|OK
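
The 7-byte reply above can be checked by hand. The sketch below assumes the usual ZNP UART frame layout as I understand it from TI's Monitor and Test API: a 0xFE start byte, a length byte, two command bytes, the payload, and a checksum that is the XOR of everything between the start byte and the checksum. The helper names `fcs` and `parse_frame` are illustrative and are not taken from znp-uart-test.py.

```python
SOF = 0xFE  # start-of-frame byte of a ZNP UART frame


def fcs(body):
    """XOR of all bytes between the start byte and the checksum byte."""
    value = 0
    for byte in body:
        value ^= byte
    return value


def parse_frame(raw):
    """Split a frame into (cmd0, cmd1, payload); raise ValueError if malformed."""
    if len(raw) < 5 or raw[0] != SOF:
        raise ValueError("missing start-of-frame byte or frame too short")
    length = raw[1]
    # SOF + length + cmd0 + cmd1 + checksum add 5 bytes around the payload.
    if len(raw) != length + 5:
        raise ValueError("length byte does not match frame size")
    if fcs(raw[1:-1]) != raw[-1]:
        raise ValueError("checksum mismatch")
    return raw[2], raw[3], raw[4:-1]


# The reply printed by the passing run above.
reply = b"\xfe\x02a\x01Y\x06="
cmd0, cmd1, payload = parse_frame(reply)
print(hex(cmd0), hex(cmd1), payload.hex())
```

The reply from the passing run parses cleanly under these assumptions, which suggests the stick and firmware answer correctly whenever the USB link itself is up.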


Then fails:

python znp-uart-test.py
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial/serialposix.py", line 322, in open
    self.fd = os.open(self.portstr, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
OSError: [Errno 5] Input/output error: '/dev/ttyUSB0'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "znp-uart-test.py", line 28, in <module>
    ser = serial.Serial("/dev/ttyUSB0", 115200, timeout=2)
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial/serialutil.py", line 244, in __init__
    self.open()
  File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/serial/serialposix.py", line 325, in open
    raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
serial.serialutil.SerialException: [Errno 5] could not open port /dev/ttyUSB0: [Errno 5] Input/output error: '/dev/ttyUSB0'


While this test ran the same error messages as before showed up:

[   39.670977] usb 1-2.1: failed to send control message: -110
[   41.674944] usb 1-2.1: failed to send control message: -110
[   53.801916] usb 1-2.1: failed to send control message: -110
[   55.805584] usb 1-2.1: failed to send control message: -110
[   57.810249] usb 1-2.1: failed to receive control message: -110
[   57.810274] ch341-uart ttyUSB0: failed to read modem status: -110

omerk commented 3 years ago

@davidusken Out of curiosity, any errors reported on the host logs?

davidusken commented 3 years ago

Surprisingly nothing, maybe @Dinth is able to see something?

Dinth commented 3 years ago

I haven't found anything in /var/log on the host either, but don't take my word for it, as I'm not very proficient in ESXi.

oschwieger commented 3 years ago

I had the same issue: ZZH on ESXi 7.0.1 with USB passthrough to a Debian-based ioBroker installation. Either the stick wasn't recognized, or if it was, it wasn't very stable and the communication stopped after some hours. I also got the "failed to read modem status" messages etc. I reverted to my old 2531 stick.

zen2 commented 3 years ago

I'm following this topic even though I'm not affected myself. Do you force the USB ports to be always on, on both the host and the guest side?

You can check it with: grep . /sys/bus/usb/devices/*/power/control

The value in control is either "on" (always on) or "auto" (autosuspend). It is possible that the USB port goes into the suspend state and so loses its power supply.
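
The same check can be scripted. This is a minimal sketch, assuming the standard Linux sysfs layout used by the grep command above; the function names `usb_power_modes` and `autosuspend_candidates` are illustrative.

```python
import glob
import os


def usb_power_modes(sysfs_root="/sys/bus/usb/devices"):
    """Map each USB device name to the content of its power/control file."""
    modes = {}
    pattern = os.path.join(sysfs_root, "*", "power", "control")
    for control in sorted(glob.glob(pattern)):
        # .../<device>/power/control -> <device>
        device = os.path.basename(os.path.dirname(os.path.dirname(control)))
        with open(control) as handle:
            modes[device] = handle.read().strip()
    return modes


def autosuspend_candidates(modes):
    """Return the devices whose power/control is set to "auto"."""
    return [device for device, mode in modes.items() if mode == "auto"]


if __name__ == "__main__":
    modes = usb_power_modes()
    for device, mode in modes.items():
        print(f"{device}: {mode}")
    print("may autosuspend:", autosuspend_candidates(modes))
```

Any device listed as "auto" can be pinned by writing "on" to its power/control file as root. Whether autosuspend plays any role in the ESXi problem is untested in this thread.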

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

Dinth commented 3 years ago

Ehh, bad bot!

jokejoke commented 3 years ago

I've exactly the same issue with my ZZH stick flashed with latest CC2652R_coordinator_20210708.hex. Any progress? I use ESXi 6.7.

Zuse2k commented 3 years ago

@jokejoke I didn't manage to solve the issue, but I built a workaround with a Raspberry Pi: I plugged the ZZH into the RPi and set up a ser2net socket to pipe the USB serial port over an Ethernet VLAN to the VM that uses the ZZH. So far it works stably and reliably.
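
To verify such a bridge from the VM before pointing Zigbee2MQTT at it, the ping test can be repeated over TCP. This is a sketch under two assumptions: ser2net exposes the port as a raw byte stream, and the request bytes are the standard ZNP SYS_PING frame. The function name `ping_over_tcp` and the address in the usage comment are hypothetical.

```python
import socket

# ZNP SYS_PING request: start byte, zero-length payload, command, XOR checksum.
PING_REQUEST = bytes([0xFE, 0x00, 0x21, 0x01, 0x20])
PING_REPLY_LENGTH = 7  # the passing test earlier in this thread got 7 bytes


def ping_over_tcp(host, port, timeout=2.0):
    """Send a ZNP ping to a raw TCP serial bridge and return the reply bytes."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(PING_REQUEST)
        reply = b""
        try:
            while len(reply) < PING_REPLY_LENGTH:
                chunk = conn.recv(PING_REPLY_LENGTH - len(reply))
                if not chunk:
                    break  # bridge closed the connection
                reply += chunk
        except socket.timeout:
            pass  # return whatever arrived before the timeout
        return reply


# Usage (hypothetical address and port of the Raspberry Pi running ser2net):
#   print(ping_over_tcp("192.168.1.50", 20108))
```

An empty result would correspond to the "Got 0 bytes in response to PING command" failure seen with direct passthrough.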

jokejoke commented 3 years ago

On bare metal with Debian and Home Assistant I have no problem. It looks like the problem only occurs with the virtualized USB 2.0 controller in ESXi (6.7 and 7.0.2). I have older hardware (HP ProDesk 400 G3 Mini).

luckybruce commented 3 years ago

For those on ESXi: I found that changing the VM's USB controller to the USB 3.1 controller helps with the CH340.

aztazt commented 1 year ago

Thank you so much @luckybruce your solution worked for me on esxi 7.0 !