rand256 / valetudo

Valetudo RE - experimental vacuum software, cloud free
Apache License 2.0

Unable to reach vacuum, no response for message #330

Closed signorecello closed 2 years ago

signorecello commented 3 years ago

Description

I get this message spammed on valetudo.log after a few hours:

2020-11-25T10:13:26.327Z MQTT Error : update_attributes_topic : Unable to reach vacuum, no response for message
2020-11-25T10:13:40.381Z Failed to get response for message: get_status [] { retries: 25, retriesHS: 29 }

Also on messages (at /var/log) I get this being spammed:

Nov 25 10:17:06 sun8i user.warn kernel: [58380.890079] RTW: traffic_status_watchdog(wlan0) acqiure wake_lock for 4500 ms(tx:26,rx_unicast:23)
Nov 25 10:17:08 sun8i user.warn kernel: [58382.891545] RTW: hw_rate_to_m_rate(): Non supported Rate [ff]!!!

I can still ssh into it. Rebooting doesn't fix the problem; the only thing that helps is resetting the vacuum by pressing the reset button under the lid and setting it up again.

How to Reproduce

  1. Set up valetudo (latest version, but also happened on previous versions)
  2. Set up SSH and tail the logs
  3. Wait for a few hours
  4. Check the logs

    Expected the vacuum to work normally

    Vacuum Model: S50

    Valetudo Version: 0.9.9
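For anyone who wants to see how often these failures occur, here is a minimal Python sketch that pulls the timestamp, message name and retry counters out of "Failed to get response for message" lines. The line format is assumed from the single excerpt quoted above, and the function name parse_failed_responses is made up for illustration; other Valetudo versions may log differently.

```python
import re

# Pattern assumed from the one log excerpt in this issue, e.g.:
# 2020-11-25T10:13:40.381Z Failed to get response for message: get_status [] { retries: 25, retriesHS: 29 }
FAILED = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z) "
    r"Failed to get response for message: (?P<method>\S+) .*?"
    r"retries: (?P<retries>\d+), retriesHS: (?P<retries_hs>\d+)"
)


def parse_failed_responses(text):
    """Return (timestamp, method, retries, retriesHS) for every failure entry in text."""
    return [
        (m["ts"], m["method"], int(m["retries"]), int(m["retries_hs"]))
        for m in FAILED.finditer(text)
    ]


# Possible usage on the robot or on a copied log file (path taken from this thread):
# with open("/var/log/upstart/valetudo.log", errors="replace") as log:
#     for entry in parse_failed_responses(log.read()):
#         print(entry)
```

Steadily growing retry counters would suggest the failures are continuous rather than occasional.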

pidator commented 3 years ago

Nov 25 10:17:06 sun8i user.warn kernel: [58380.890079] RTW: traffic_status_watchdog(wlan0) acqiure wake_lock for 4500 ms(tx:26,rx_unicast:23)
Nov 25 10:17:08 sun8i user.warn kernel: [58382.891545] RTW: hw_rate_to_m_rate(): Non supported Rate [ff]!!!

Something went wrong with either the wifi driver on the robot or your access point. This syslog message indicates a problem with the wifi transmission rate. I suppose you'll have to fix this problem first and your other connection problems will be solved, too. But this isn't related to Valetudo, it's kind of a hardware/driver issue of the linux firmware and/or your access point.

rand256 commented 3 years ago

@signorecello,

I get this message spammed on valetudo.log after a few hours

I personally don't experience this issue, but could you check whether this build would work better for you?

Also on messages (at /var/log) I get this being spammed

That is what latest stock roborock firmware does. If you have any ideas on how to fix that, please share.

pidator commented 3 years ago

That is what latest stock roborock firmware does.

😮 indeed ... ! Sorry, I checked the wrong log files, which led me to a wrong conclusion @signorecello

signorecello commented 3 years ago

@signorecello,

I get this message spammed on valetudo.log after a few hours

I personally don't experience this issue, but could you check whether this build would work better for you?

Also on messages (at /var/log) I get this being spammed

That is what latest stock roborock firmware does. If you have any ideas on how to fix that, please share.

Hello! I tried that build, but the robot became unresponsive and I get a timeout trying to ssh into it (or via webpage). I can ping/traceroute it, though... I can try resetting it but I'm afraid I'll lose the logs?

rand256 commented 3 years ago

I guess there will be no useful logs. I just can't imagine what could cause this issue. Have you perhaps set the attributesUpdateInterval parameter in the mqtt configuration to some ridiculously low value? If you run the device with mqtt disabled, will it work ok? How long have you been experiencing such issues, given that you wrote it also happened on previous valetudo versions?

Regarding current device state, if you shut down the vacuum (by removing it from the charging dock and long pressing power button) and restart it again, will it become unresponsive immediately or would it work for at least some time so you could connect to it via ssh?

signorecello commented 3 years ago

Hi! I haven't changed any mqtt settings... In the meantime I did restart the robot as you said, and it did connect to the wifi network. I'll just disable the MQTT and see what happens? Although when the robot is unresponsive, I can't access it via ssh, web, mqtt, nothing... It is definitely connected to the network though since I can ping it just fine.

This happened also on the previous version 0.9.8 and also on Hypfer's valetudo iirc... Let's say my wife acceptance factor is decaying since we can't just use the robot whenever this happens. It just stays responsive for a few hours, then mysteriously stops working
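Since ping only shows that the robot answers ICMP, a quick way to see which services are actually dead is to attempt a TCP connection to each one. This is a rough Python sketch, not anything from Valetudo itself; the function name probe_tcp_ports is made up, and ports 22 (ssh) and 80 (web interface) are assumed defaults that may differ on a given setup.

```python
import socket


def probe_tcp_ports(host, ports, timeout=3.0):
    """Return {port: bool}, True when a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection completes a full TCP handshake, unlike ping (ICMP)
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # covers refused connections, timeouts and unreachable hosts
            results[port] = False
    return results


# Possible usage from another machine on the same network
# (192.168.1.50 is a placeholder address, 22 and 80 are assumed ports):
# print(probe_tcp_ports("192.168.1.50", [22, 80]))
```

If every port times out while ping still works, the problem is probably below Valetudo, in the system or network stack.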

rand256 commented 3 years ago

I'll just disable the MQTT and see what happens?

I asked about mqtt since in the first post you provided a part of error logs that are generated specifically in mqtt section of code.

But now I think I'd suggest you to try flashing stock firmware to check whether it'll work there longer than a few hours. And if it fails there too, then you are likely facing some hardware issue.

signorecello commented 3 years ago

Oh, I see... Well I'll leave it with the mqtt disabled and see what happens. If it still breaks I'll go for the stock firmware... Crossing fingers since I really don't want to buy another robot...

danlink commented 3 years ago

It is definitely connected to the network though since I can ping it just fine.

I just got the same problem, I guess. It turned out the "solution" was to re-enter the wifi credentials (well, the key) in valetudo. Now everything works just fine again. The funny thing is, as you said, the robot was still connecting to the wifi just fine. It was reachable via http and ssh, but MQTT was not responsive and /var/log/messages was spammed with the warning you mentioned. What I did previously: I changed the network's IP range in my router yesterday, but I have no clue how this might be related. I'm pretty sure this issue has nothing to do with Valetudo directly but is rather a driver/wifi-stack issue.

rand256 commented 3 years ago

@danlink,

the "solution" was to re-enter the wifi credentials (well, the key) in valetudo. Now everything works just fine again

Do you mean you have no more endless warnings in /var/log/messages?

avierck commented 3 years ago

I can also no longer communicate via mqtt after upgrading to 0.9.9 (also on S50).

/var/log/upstart/valetudo.log is full of this message: MQTT Error : mqtt_general_error : {"code":5}

I've also tried to re-enter the wifi credentials as danlink suggested above, but that did not help. Is there a way to downgrade to 0.9.8 (which was working flawlessly before)?

pidator commented 3 years ago

Do you mean you have no more endless warnings in /var/log/messages?

Are you using the "auto-channel" feature of your AP too? I'm seeing some days without any warnings about the non-supported rate, and other days when the log is full of them. Atm I'm trying to collect more data to get some context...
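To compare quiet days with noisy ones, the warnings can be tallied per day. This is a small Python sketch that assumes the syslog line layout seen in the excerpts in this thread (month and day as the first two fields); the function name count_rate_warnings_per_day is made up for illustration.

```python
from collections import Counter

# Substring taken from the kernel message quoted in this issue
PATTERN = "hw_rate_to_m_rate(): Non supported Rate"


def count_rate_warnings_per_day(lines, pattern=PATTERN):
    """Count lines containing pattern, keyed by the syslog date prefix such as 'Nov 25'."""
    counts = Counter()
    for line in lines:
        if pattern not in line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            # first two whitespace-separated fields are assumed to be month and day
            counts[" ".join(parts[:2])] += 1
    return counts


# Possible usage on the robot or on a copied log file:
# with open("/var/log/messages", errors="replace") as log:
#     for day, total in count_rate_warnings_per_day(log).items():
#         print(day, total)
```

Lining these daily totals up with the channel the AP picked on each day might show whether there is a correlation.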

rand256 commented 3 years ago

@avierck, maybe what you're describing is related to the issue mentioned here (and definitely not to this exact topic). I've updated mqtt package and made a new build. Could you try using it to see whether it helps? See wiki on how to update.

@pidator, nope, I have a static channel 6 set on my wifi router.

avierck commented 3 years ago

@rand256 Yep, patching to this valetudo version fixed it for me. Thanks a lot!

danlink commented 3 years ago

Do you mean you have no more endless warnings in /var/log/messages?

Jep, it seems to be good for me, no more flooding. And the bot is reachable via MQTT again.

danlink commented 3 years ago

Are you using "auto-channel-feature" of your AP too?

Jep, auto-channel is activated in my AVM Fritzbox

signorecello commented 3 years ago

I'm gonna try setting a static channel on my wifi network and let you know if it worked somehow 👍

pidator commented 3 years ago

Also on messages (at /var/log) I get this being spammed

That is what latest stock roborock firmware does. If you have any ideas on how to fix that, please share.

@rand256 I was probably wrong about the wifi channel. After following this function in another wifi driver that has the exact same error message, I conclude that the kernel log message RTW: hw_rate_to_m_rate(): Non supported Rate [ff]!!! is related to the transmission rate in the robot's wifi config.

While looking at /opt/rockrobo/wlan/wifi_start.sh in line 82 echo "hw_mode=b" >> ${HOSTAPD_CONF} there's a setting for the wifi card to IEEE 802.11b

hw_mode=
# a = IEEE 802.11a
# b = IEEE 802.11b
# g = IEEE 802.11g

but I don't know if changing this setting will only affect the configuration while working in AP mode or also if connected to your wifi?!

I didn't find any channel or rate settings in the other script, /opt/rockrobo/miio/miio_client_helper_nomqtt.sh, which extracts the information from /mnt/data/miio/wifi.conf.
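To list every setting that wifi_start.sh writes into the hostapd config, rather than reading the script by eye, something like the following Python sketch could help. It only recognises lines shaped like the one quoted above (echo "key=value" >> ${HOSTAPD_CONF}); the function name hostapd_settings_written is made up, and the script may well write settings in other ways that this misses. Note that hostapd is the access point daemon, so these settings most likely apply only while the robot runs its own AP.

```python
import re

# Matches lines such as: echo "hw_mode=b" >> ${HOSTAPD_CONF}
ECHO_SETTING = re.compile(
    r'echo\s+"([A-Za-z0-9_]+)=([^"]*)"\s*>>\s*\$\{?HOSTAPD_CONF\}?'
)


def hostapd_settings_written(script_text):
    """Return (line_number, key, value) for each setting appended to HOSTAPD_CONF."""
    found = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        match = ECHO_SETTING.search(line)
        if match:
            found.append((number, match.group(1), match.group(2)))
    return found


# Possible usage (path taken from this thread):
# with open("/opt/rockrobo/wlan/wifi_start.sh", errors="replace") as script:
#     for entry in hostapd_settings_written(script.read()):
#         print(entry)
```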

signorecello commented 3 years ago

Hello, some developments on this... One of these days I was walking past my robot and noticed it was making a weird noise. Turns out the LDS was spinning, probably because I had tried to start it some minutes before.

It really seems some kind of hardware issue?