Closed TheAssassin closed 4 years ago
Valetudo doesn't read map files from the filesystem, nor does it copy them (except for the manual map store function). The current map is uploaded to Valetudo once the firmware has received, from dustcloud via miio, the "correct" URLs it should upload to.
You may want to check netstat -a from SSH, where you should see a line like this:
udp 0 0 <robot's internal IP>:<random port> 203.0.113.1:8053 ESTABLISHED
If you see this, with exactly 203.0.113.1:8053, and you have iptables rules in /etc/rc.local that DNAT 203.0.113.1 traffic back to 127.0.0.1 (and they were actually run), you should have the maps. Opening the map tab in the web interface will force the firmware to re-upload the map (if it is connected to dustcloud).
Sometimes the manufacturer's firmware doesn't reconnect to dustcloud immediately, so it may be connected to neither the Xiaomi cloud nor dustcloud. In this case there's no way to force it to reconnect; it may take up to half an hour for it to do that. And if the firmware did manage to connect to the real cloud (generally this should never happen), it will keep that connection as long as it can. Then the simplest way is to reboot, or to block internet access for the device.
Service hint: you may just call cat /proc/net/ip_conntrack, that's faster and more reliable than netstat. Also, I'd recommend using netstat -na, because it doesn't attempt pointless name lookups.
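For anyone who saves the netstat output and wants to check it off-device, the test described above can be sketched in Python. This is only an illustration: the function name is made up for this example, and it assumes the usual netstat column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State).

```python
from typing import List

# Address and port the firmware should be talking to when it uses dustcloud,
# as quoted in the netstat line above.
DUSTCLOUD_ENDPOINT = "203.0.113.1:8053"


def find_dustcloud_connections(netstat_output: str,
                               endpoint: str = DUSTCLOUD_ENDPOINT) -> List[str]:
    """Return netstat lines showing an ESTABLISHED UDP flow to `endpoint`.

    Hypothetical helper, not part of Valetudo. Assumes the standard netstat
    column layout: Proto Recv-Q Send-Q Local Foreign State [...].
    """
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Listening UDP sockets have no State column, so they are skipped here.
        if len(fields) < 6:
            continue
        proto, foreign, state = fields[0], fields[4], fields[5]
        if proto == "udp" and foreign == endpoint and state == "ESTABLISHED":
            matches.append(line.strip())
    return matches
```

An empty result means the miio client has not (yet) opened its UDP flow to the spoofed address, which matches the "wait or reboot" situation described above.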
I just reflashed 0.9.0 after breaking things while playing around. I can't see such a connection yet. iptables and the hosts file look fine, already checked that. Cleaning my test course just finished, I'll let it run for now to see whether it'll ever connect. If it won't connect within the next 30 minutes, I will consider rebooting.
The bot is in a separate VLAN which doesn't permit any access to the Internet.
By the way, is this widget intended to show those values?
Bot's log so far:
# tail -f /var/log/upstart/valetudo.log
Waiting for 30 sec after boot... done.
2020-02-25T08:13:21.743Z Loading configuration file: /mnt/data/valetudo/config.json
2020-02-25T08:13:21.893Z Dummycloud is spoofing 203.0.113.1:8053 on 127.0.0.1:8053
2020-02-25T08:13:21.895Z Webserver running on port 80
2020-02-25T08:13:26.844Z Got token from handshake: xyz
2020-02-25T08:13:26.870Z Probed last id = 1001 using get_status (2 retries)
2020-02-25T08:30:27.390Z Got token from handshake: xyz
I've just checked again, same logs, no change. The HTTP request to poll_map yields {"message": "ok"}, /api/map/latest yields an empty response. The bot hasn't connected to the dustcloud yet.
I'm going to reboot the device now.
Is the map the only thing that's being transferred over miio? If yes, it might make sense to try to reverse that format.
is this widget intended to show those values?
It shows exactly these values when roborock software isn't connected to the dummycloud.
Is the map the only thing that's being transferred over miio?
The map is transferred via HTTP PUT, and the map upload destination is set by dummycloud in its response to the miio gen_presigned_url request. For this to work the device must be connected to dummycloud, and it seems something prevents that on your side.
Which base firmware do you use?
I've been using the image you provide in your release section, so it's firmware 4004, or whatever 0.8.2 is built on. I can try building my own firmware, too.
I've rebooted through SSH by the way, and now for some reason current_status yields 500 responses after some timeout.
It shows exactly these values when roborock software isn't connected to the dummycloud.
I think we should start documenting these diagnostic hints. How about a troubleshooting wiki page?
Edit: a request to settings.html took about 90 seconds to be answered with a 200 response (which looks valid), current_status is yet to be replied to.
Edit 2:
> time curl http://bot/api/current_status
Unable to reach vacuum, no response for message0.01user 0.01system 0:24.76elapsed 0%CPU (0avgtext+0avgdata 10644maxresident)k
0inputs+0outputs (0major+559minor)pagefaults 0swaps
hm, @rand256, I think I've found a possible reason for that error:
I just flashed your 4004 image of 0.9.0 for Gen1 and for me, it seems the modified /etc/hosts is missing!! Here's the output of a newly flashed Gen1 with the vacuum_valetudo_re_4004.pkg:
Welcome to Ubuntu 14.04.3 LTS (GNU/Linux 3.4.39 armv7l)
* Documentation: https://help.ubuntu.com/
Last login: Tue Feb 25 19:36:50 2020 from ****
_ _ _____ ______ _ _ _ _ _____ _ _
| | | | / ___ \ / ____ \| | | || | | | / _ _ \ | \ |
| | | || / \ || / \/| | | || | | || /| |\ | |_/ |_
| | | || \___/ || | | | | || | | || || || | |\ |
| | | || ___ || | | | | || | | || || || | | | |_
\ \_/ / | | | || | | | | || | | || || || |
\ / | | | || \____/\| \___/ || \___/ || || || |
\_/ |_| |_| \______/ \_____/ \_____/ |_||_||_|
20200217
===============================================================
MODEL...........: rockrobo.vacuum.v1
SERIAL..........: ***
PRODUCTION DATE.: October 2010
FIRMWARE........: 3.5.4_004004
BUILD NUMBER....: 2019090500REL
REGION..........: de
IP..............: ***.***.***.***
MAC.............: ****
TOKEN...........: ****
DID.............: ***
KEY.............: ***
===============================================================
root@rockrobo:~# cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 rockrobo
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
So the traffic of the robot isn't directed to the dustcloud (203.0.113.1), and so you can't see a connection in netstat...
@pidator check /etc/rc.local, it sets up some "cloud dns" thingy that is most likely some dnsmasq that enforces the redirection.
This entire iptables hackery looks bogus to me. I can't even connect to port 8053 using netcat.
I'll try another approach, using a route back to lo for the required IP range.
This entire iptables hackery looks bogus to me. I can't even connect to port 8053 using netcat. I'll try another approach, using a route back to lo for the required IP range.
What are you trying to do?!
I'm proposing a better way to implement the "fake dustcloud" networking, which also allows for some debugging.
root@rockrobo:~# ip addr add 203.0.113.1/32 dev lo
root@rockrobo:~# ip addr show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet 203.0.113.1/32 scope global lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
root@rockrobo:~# ping -c4 203.0.113.1
PING 203.0.113.1 (203.0.113.1) 56(84) bytes of data.
64 bytes from 203.0.113.1: icmp_seq=1 ttl=64 time=0.217 ms
64 bytes from 203.0.113.1: icmp_seq=2 ttl=64 time=0.208 ms
64 bytes from 203.0.113.1: icmp_seq=3 ttl=64 time=0.168 ms
64 bytes from 203.0.113.1: icmp_seq=4 ttl=64 time=0.171 ms
--- 203.0.113.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2997ms
rtt min/avg/max/mdev = 0.168/0.191/0.217/0.021 ms
Now, we have the address bound to the loopback interface. The kernel will redirect all traffic for the IP 203.0.113.1 back to the device, that's ensured. In contrast to the previous solution, we don't need any complicated NAT stuff. A single command replaces three iptables commands.
I only have to change the valetudo code to listen on said IP instead of 127.0.0.1, and I can test the dustcloud service.
root@rockrobo:~# netstat -tulpen | grep valetudo
tcp6 0 0 :::80 :::* LISTEN 0 7397 753/valetudo
udp 0 0 0.0.0.0:42063 0.0.0.0:* 0 7399 753/valetudo
udp 0 0 127.0.0.1:8053 0.0.0.0:* 0 7398 753/valetudo
I only need to figure out how to run valetudo without having to rebuild it.
Edit: oh, nice, I can just change the listen IP in the config. I hate vi though (always do something wrong), I'll edit the file offline.
@pidator, do you have issues with a map on 4004 image?
It looks like the bot finally connected to the dustcloud UDP socket, even if just for a few seconds:
udp 17 17 src=203.0.113.1 dst=127.0.0.1 sport=8053 dport=1 [UNREPLIED] src=127.0.0.1 dst=203.0.113.1 sport=1 dport=8053 mark=0 use=2
Not entirely sure what happened, but I saw a greatly different map view, with a completely grey background. Then it went back to its original state again. To me that kind of shows that the issue is a networking one and probably related to the iptables rule set. I'll keep digging.
just checked again: at the beginning it seems to be the same errors @TheAssassin has (no map, status connecting and battery 0%) and additionally this error when opening the map tab:
then I've added the hosts entries from deployment section and did a reboot. Now after ~2h of waiting everything seems to work quite normal: now the robot is creating a new map while cleaning, the status block is up to date, no more problems with my Gen1. And the host file was the only thing I've changed.
So actually cloud-dnsmasq isn't working there.
@pidator I can try that later, too. But I'm using a self built image right now, based on the older version which comes with a complete /etc/hosts.
@pidator and @TheAssassin Can you check my firmware version ? I don't have the first generation.
Hm... now it seems DNS doesn't work, causing issues with the bot's miio client:
udp 17 19 src=127.0.0.1 dst=127.0.1.1 sport=56026 dport=53 [UNREPLIED] src=127.0.1.1 dst=127.0.0.1 sport=53 dport=56026 mark=0 use=2
udp 17 19 src=127.0.0.1 dst=127.0.1.1 sport=43278 dport=53 [UNREPLIED] src=127.0.1.1 dst=127.0.0.1 sport=53 dport=43278 mark=0 use=2
udp 17 19 src=127.0.0.1 dst=127.0.1.1 sport=52083 dport=53 [UNREPLIED] src=127.0.1.1 dst=127.0.0.1 sport=53 dport=52083 mark=0 use=2
udp 17 19 src=127.0.0.1 dst=127.0.1.1 sport=42932 dport=53 [UNREPLIED] src=127.0.1.1 dst=127.0.0.1 sport=53 dport=42932 mark=0 use=2
It is better to use drill to check dns.
Then you'll have to deal with it yourself.
@zvldz I'm watching active connections on the bot to see whether there's connections to the DNS service or the UDP socket valetudo opens. Your "tip" doesn't help at all. I do not want to check whether the DNS service works, because that is already known.
The problem here is that the bot's miio client is either not using these services or is unable to connect to them. I'm using the Linux kernel's connection tracking feature called conntrack.
udp 17 4 src=203.0.113.1 dst=127.0.0.1 sport=8053 dport=1 [UNREPLIED] src=127.0.0.1 dst=203.0.113.1 sport=1 dport=8053 mark=0 use=2
This basically means there's a connection request from the bot to the valetudo socket, but it hasn't received a reply yet. As this is a UDP connection, that doesn't have to mean that the connection isn't possible.
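To make entries like the one above easier to read, the fields can be split into the two tuples that conntrack prints: the original direction first, then the expected reply. The following is a rough sketch only; the function name and the returned dictionary layout are invented for illustration and are not part of Valetudo or any conntrack tooling.

```python
def parse_conntrack_line(line: str) -> dict:
    """Split one /proc/net/ip_conntrack line into its two address tuples.

    Hypothetical helper for illustration. The first src/dst/sport/dport group
    is the original direction, the second one is the expected reply.
    """
    tokens = line.split()
    entry = {
        "proto": tokens[0],
        # The kernel prints [UNREPLIED] while no packet has been seen
        # in the reply direction.
        "unreplied": "[UNREPLIED]" in tokens,
        "original": {},
        "reply": {},
    }
    direction = "original"
    for token in tokens:
        if "=" not in token:
            continue
        key, value = token.split("=", 1)
        if key not in ("src", "dst", "sport", "dport"):
            continue  # ignore mark=, use= and similar counters
        if key in entry[direction]:
            # A repeated key means the second (reply) tuple has started.
            direction = "reply"
        entry[direction][key] = value
    return entry
```

Applied to the line above, the original tuple comes out as source 203.0.113.1:8053 and destination 127.0.0.1:1, which makes the unusual destination port easy to spot.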
Good luck. Check if dnsmasq works.
Can you check my firmware version ? I don't have the first generation.
just finished a full cleaning with 4004 of valetudo re version and rebuilt all my zones and spots. But I think flashing your version should keep my data, right? I see there's a 4007 on your site too, do you have any change notes?
So actually cloud-dnsmasq isn't working there.
Is there an easy way to figure out if dnsmasq is working correctly without the host entries @rand256 ?
The command drill suggested by @zvldz isn't available on my Gen1...
I reset my environment to use the upstream 4004 image, as it contains all the valuable debug tools such as tcpdump. The map polling is sent to destination port 1. That's obviously wrong, nothing listens on that port. Therefore the poll request won't be answered, and the bot cannot display the map.
You can see the behavior in the line I posted earlier while still using my other image, see https://github.com/rand256/valetudo/issues/139#issuecomment-591072390. I'm attaching a pcap that contains those broken packets, so you can see it yourself (recorded with tcpdump on the official release image).
Port 1 is, for some reason, the default value Valetudo uses for communicating with the bot:
Looking a few lines later, this behavior can be explained as follows. The dustcloud API has never seen a connection from the bot and therefore hasn't set those values.
Now, this of course doesn't fix my issue, but it confirms @rand256's explanation in https://github.com/rand256/valetudo/issues/139#issuecomment-590861785.
Does anyone have an idea how to force the bot to connect to the dustcloud API? Can we restart single services perhaps?
Does anyone have an idea how to force the bot to connect to the dustcloud API? Can we restart single services perhaps?
You may call restart rrwarchdoge for that, it'll restart all the services.
As I understand, the only issue with my 4004 firmware image is that the so-called cloud-dnsmasq service works improperly. Its only task is to answer all requests for cloud domains (mi.com, xiaomi.com) with 203.0.113.1 and to forward all other DNS requests. Maybe it simply can't start for some reason. Because of that the bot never connects to dustcloud.
The dnsmasq is launched with the upstart script /etc/init/dnsmasq.conf and logs to /var/log/messages. I have no upstart on 2008 firmware and can't look at it now. Maybe start on should be changed to (started networking) or something.
Is there an easy way to figure out if dnsmasq is working correctly
That would be ps | grep cloud-dnsmasq for checking if it's running and nslookup asdf.mi.com for checking if it's actually doing its job.
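Where nslookup is missing, the same "is it doing its job" check can be approximated with Python's standard resolver call, assuming a Python interpreter is available on the device (that is an assumption, the thread does not confirm it). The function name and the injectable `resolve` parameter are made up for this sketch.

```python
import socket
from typing import Callable

# Address the cloud-dnsmasq workaround is supposed to return for cloud domains.
SPOOFED_IP = "203.0.113.1"


def is_redirected(hostname: str,
                  resolve: Callable[[str], str] = socket.gethostbyname,
                  expected_ip: str = SPOOFED_IP) -> bool:
    """Return True if `hostname` resolves to the spoofed dustcloud address.

    Hypothetical helper. `resolve` defaults to the system resolver and can be
    swapped out so the logic can be exercised without a network.
    """
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return False
```

Run on the robot itself, `is_redirected("asdf.mi.com")` should be True when the redirection works; run anywhere else it only tests that machine's DNS.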
The dnsmasq is launched with upstart script /etc/init/dnsmasq.conf and logs to /var/log/messages.
no messages file in /var/log on my Gen1. But this is the log of /var/log/upstart/dnsmasq.log
root@rockrobo:/var/log/upstart# more dnsmasq.log
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 5354: Address already in use
root@rockrobo:/var/log/upstart#
dnsmasq is not running:
root@rockrobo:/# ps | grep cloud-dnsmasq
root@rockrobo:/#
nslookup isn't available either:
root@rockrobo:/# nslookup asdf.mi.com
-bash: nslookup: command not found
just finished a full cleaning with 4004 of valetudo re version and rebuilt all my zones and spots. But I think flashing your version should keep my data, right? I see there's a 4007 on your site too, do you have any change notes?
No, unfortunately I don't have a list of changes for firmware 4007. The data should not be lost during the firmware update.
Is there an easy way to figure out if dnsmasq is working correctly without the host entries @rand256 ? The command drill suggested by @zvldz isn't available on my Gen1.
Drill is installed in my firmware or can be installed via apt install ldnsutils
@pidator Sorry for misleading you a bit; as you see, the firmware with the stripped-down Ubuntu is different from the older ones. Anyway, now we see the reason: something's using that port in those "full-featured" images of gen1.
So if you simply change 5354 to something else, e.g. 55354, in /etc/rc.local and /etc/init/dnsmasq.conf and reboot, it most likely will be fixed.
So if you simply change 5354 to something else, e.g. 55354, in /etc/rc.local and /etc/init/dnsmasq.conf and reboot, it most likely will be fixed.
unfortunately not:
root@rockrobo:~# tail -F /var/log/upstart/dnsmasq.log
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 55454: Address already in use
^C
root@rockrobo:~# ps | grep cloud-dnsmasq
root@rockrobo:~#
till now I've found no working port on a Gen1. Tried 5354, 55454, 53, and from @zvldz version 55553. The result is still the same, dnsmasq couldn't start:
root@rockrobo:~# tail -F /var/log/upstart/dnsmasq.log
dnsmasq: failed to create listening socket for port 5354: Address already in use
dnsmasq: failed to create listening socket for port 55454: Address already in use
dnsmasq: failed to create listening socket for port 55454: Address already in use
dnsmasq: failed to create listening socket for port 53: Address already in use
dnsmasq: failed to create listening socket for port 55553: Address already in use
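When the log grows across several attempts, it can help to tally the failures per port. The sketch below assumes the exact message wording shown in the log above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the dnsmasq message exactly as it appears in the log excerpts above.
BIND_FAILURE = re.compile(
    r"failed to create listening socket for port (\d+): Address already in use"
)


def count_bind_failures(log_text: str) -> Counter:
    """Count 'Address already in use' messages per port in a dnsmasq log.

    Hypothetical helper for illustration; it only reads the log text and
    says nothing about whether the daemon is actually running.
    """
    return Counter(int(m.group(1)) for m in BIND_FAILURE.finditer(log_text))
```

Note that the tally only reflects what was logged. As other comments in this thread report, the same message can appear while dnsmasq is in fact answering queries, so it should be read together with an actual name lookup.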
@pidator I can try that later, too. But I'm using a self built image right now, based on the older version which comes with a complete /etc/hosts.
@TheAssassin Could you verify a working version by adding only the host entries to the 4004 image of @rand256 on your Gen1?
But this is the log of /var/log/upstart/dnsmasq.log
It seems that dnsmasq is trying to run several times.
a new entry is only created after reboot!
It seems that dnsmasq is trying to run several times.
Yeah, but why then the first instance fails after taking the specified port? And doesn't release it quick enough. Weird all of this.
Could you verify a working version by adding only the host entries to the 4004 image
Most likely this will be enough for the 4004 image with the miio client downgraded to 3.3.9. But newer miio clients don't care about the hosts file at all, and the dnsmasq workaround was introduced exactly because of that. If only it ran correctly, as it does on 2008 firmware.
@pidator, could you please try a couple of other experiments? Edit /etc/init/dnsmasq.conf and either add --bind-dynamic to the end of the exec line, or change the start on value to net-device-up IFACE=lo, then reboot and check whether dnsmasq is running. But that's just a guess. Maybe @TheAssassin will come up with a proper solution.
Edit /etc/init/dnsmasq.conf and either add to the end of exec line --bind-dynamic
no cloud-dnsmasq process visible but also no new entry in dnsmasq.log! so it seems nothing happens.
or change start on value to net-device-up IFACE=lo, then reboot and check whether dnsmasq is running.
no cloud-dnsmasq process visible and again the new entry of failed to create listening socket for the configured port in dnsmasq.log.
On firmware 1898 I have messages 'Address already in use' too. But dnsmasq works well at the same time. Perhaps there is something wrong with the startup script.
Try to comment/delete the line 'expect fork' in the file /etc/init/dnsmasq.conf. And add -k to the exec line. Then reboot.
On firmware 1898 I have messages 'Address already in use' too. But dnsmasq works well at the same time.
that made me wonder. so here are the steps I've just done:
getent hosts asdf.xiaomi.com working.
root@rockrobo:~# cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 rockrobo
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
/etc/rc.local and /etc/init/dnsmasq.conf to now use port number 5355 and revert back all changes I made from the suggestions here so I'm now having the identical files like the robot is newly flashed.
/var/log/upstart/dnsmasq.log is like expected:
dnsmasq: failed to create listening socket for port 5355: Address already in use
root@rockrobo:~# ps | grep cloud-dnsmasq
root@rockrobo:~# ps
PID TTY TIME CMD
1573 pts/0 00:00:00 bash
15718 pts/0 00:00:00 ps
root@rockrobo:~# getent hosts asdf.xiaomi.com
203.0.113.1 asdf.xiaomi.com
root@rockrobo:~# getent hosts qwer.mi.com
203.0.113.1 qwer.mi.com
root@rockrobo:~#
So, to me it seems dnsmasq was running correctly on my Gen1 all of the time, but I hadn't noticed because ps | grep cloud-dnsmasq doesn't display anything.
I now can't explain my initial issue after flashing 4004...
Could you show the result of the commands
root@rockrobo:~# netstat -anop | grep -E "tcp|udp"
tcp 0 0 127.0.0.1:5037 0.0.0.0:* LISTEN 426/adbd off (0.00/0/0)
tcp 0 0 127.0.0.1:54322 0.0.0.0:* LISTEN 941/miio_client off (0.00/0/0)
tcp 0 0 127.0.0.1:54323 0.0.0.0:* LISTEN 941/miio_client off (0.00/0/0)
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1470/sshd off (0.00/0/0)
tcp 0 0 0.0.0.0:199 0.0.0.0:* LISTEN 402/snmpd off (0.00/0/0)
tcp 0 0 0.0.0.0:6665 0.0.0.0:* LISTEN 866/player off (0.00/0/0)
tcp 0 0 0.0.0.0:5355 0.0.0.0:* LISTEN 498/cloud-dnsmasq off (0.00/0/0)
tcp 1 0 127.0.0.1:38271 127.0.0.1:80 CLOSE_WAIT 933/AppProxy off (0.00/0/0)
tcp 0 0 127.0.0.1:54322 127.0.0.1:44335 ESTABLISHED 941/miio_client off (0.00/0/0)
tcp 0 36 <LOCAL ROBOT LAN IP>:22 <LOCAL PC LAN IP>:50797 ESTABLISHED 31356/0 on (0.50/0/0)
tcp 0 0 127.0.0.1:44581 127.0.0.1:54322 ESTABLISHED 25553/miio_recv_lin off (0.00/0/0)
tcp 0 0 127.0.0.1:54322 127.0.0.1:44581 ESTABLISHED 941/miio_client off (0.00/0/0)
tcp 0 0 127.0.0.1:44335 127.0.0.1:54322 ESTABLISHED 933/AppProxy off (0.00/0/0)
tcp6 0 0 :::80 :::* LISTEN 741/valetudo off (0.00/0/0)
tcp6 0 0 :::22 :::* LISTEN 1470/sshd off (0.00/0/0)
tcp6 0 0 :::5355 :::* LISTEN 498/cloud-dnsmasq off (0.00/0/0)
tcp6 0 0 127.0.0.1:80 127.0.0.1:38271 FIN_WAIT2 - timewait (43.08/0/0 )
tcp6 0 0 127.0.0.1:80 127.0.0.1:38269 TIME_WAIT - timewait (38.02/0/0 )
tcp6 0 0 127.0.0.1:80 127.0.0.1:38263 TIME_WAIT - timewait (27.88/0/0 )
udp 0 0 0.0.0.0:55130 0.0.0.0:* 741/valetudo off (0.00/0/0)
udp 0 0 127.0.0.1:8053 0.0.0.0:* 741/valetudo off (0.00/0/0)
udp 0 0 <LOCAL ROBOT LAN IP>:33667 203.0.113.1:8053 ESTABLISHED 941/miio_client off (0.00/0/0)
udp 0 0 0.0.0.0:161 0.0.0.0:* 402/snmpd off (0.00/0/0)
udp 0 0 0.0.0.0:5353 0.0.0.0:* 941/miio_client off (0.00/0/0)
udp 0 0 0.0.0.0:5355 0.0.0.0:* 498/cloud-dnsmasq off (0.00/0/0)
udp 0 0 0.0.0.0:6665 0.0.0.0:* 866/player off (0.00/0/0)
udp 0 0 0.0.0.0:54321 0.0.0.0:* 941/miio_client off (0.00/0/0)
udp 0 0 0.0.0.0:48186 0.0.0.0:* 1738/dhclient off (0.00/0/0)
udp 0 0 0.0.0.0:68 0.0.0.0:* 1738/dhclient off (0.00/0/0)
udp6 0 0 :::58531 :::* 1738/dhclient off (0.00/0/0)
udp6 0 0 :::5355 :::* 498/cloud-dnsmasq off (0.00/0/0)
root@rockrobo:~# iptables -S -t nat
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-A OUTPUT -d 203.0.113.1/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 127.0.0.1:8053
-A OUTPUT -d 203.0.113.1/32 -p udp -m udp --dport 8053 -j DNAT --to-destination 127.0.0.1:8053
-A OUTPUT -p udp -m owner ! --uid-owner 65534 -m udp --dport 53 -j DNAT --to-destination 127.0.0.1:5355
-A OUTPUT -p tcp -m owner ! --uid-owner 65534 -m tcp --dport 53 -j DNAT --to-destination 127.0.0.1:5355
root@rockrobo:~#
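To compare rule sets between images, the DNAT entries in output like the above can be reduced to a small table. This is a minimal sketch under the assumption that the input is `iptables -S -t nat` output in the format shown; the function names are hypothetical, and a missing match is reported as "any".

```python
import shlex
from typing import List, Tuple


def _value_after(tokens: List[str], flag: str, default: str = "any") -> str:
    """Return the token following `flag`, or `default` if the flag is absent."""
    if flag in tokens:
        index = tokens.index(flag)
        if index + 1 < len(tokens):
            return tokens[index + 1]
    return default


def parse_dnat_rules(rules_text: str) -> List[Tuple[str, str, str, str]]:
    """Extract (protocol, destination, dport, to_destination) per DNAT rule.

    Hypothetical helper for illustration; it ignores policy lines (-P) and
    any rule whose target is not DNAT.
    """
    rules = []
    for line in rules_text.splitlines():
        tokens = shlex.split(line)
        if "DNAT" not in tokens:
            continue
        rules.append((
            _value_after(tokens, "-p"),
            _value_after(tokens, "-d"),
            _value_after(tokens, "--dport"),
            _value_after(tokens, "--to-destination"),
        ))
    return rules
```

For the listing above this gives two rules sending 203.0.113.1 traffic (TCP 80 and UDP 8053) to 127.0.0.1:8053 and two rules sending port 53 traffic to 127.0.0.1:5355, which is the port cloud-dnsmasq listens on in the netstat output.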
cloud-dnsmasq working.
ps | grep cloud-dnsmasq - incorrect command
right command ps aux | grep cloud-dnsmasq
cloud-dnsmasq has been running fine, though it has been conflicting with the bot's regular dnsmasq. That's an issue, as the bot won't be able to provide DHCP if you reset the WiFi. There should be a separate init script, the init script of the regular dnsmasq should really not be misused for that.
@pidator adding the host entries makes no difference. I don't know why (yet).
It's really hard to debug these problems. Is there any chance we "guess" the port the bot will later use to connect to the API? I don't think we have to wait for it to connect, do we? Can I run valetudo directly from the scripts somehow? Or do I have to put a node runtime on the bot manually? That way I can add logging where needed and override some default ports and such stuff.
@rand256 I've rebooted using reboot, as restart rrwarchdoge showed me there's no service with that name. Now valetudo apparently can't connect to the bot any more, API calls to current_status time out, yielding the "unable to connect" error. I guess I have to re-flash now, unless someone knows a way how to reset the configs...
Regarding my changes in the networking of my image, they were working just fine and are a lot less complex to use. I'll provide details in a separate issue.
Regarding the init file, you can get rid of expect fork by simply running dnsmasq in the foreground (i.e., adding --no-daemon or -d). The management through the init script doesn't work at all right now, I always have to killall before I can restart the daemon properly through upstart. Running with -d fixes that problem. This way, we also see logs of the daemon in the file. But, as said before, this service should be managed by a separate script.
Regarding the init file, you can get rid of expect fork by simply running dnsmasq in the foreground (i.e., adding --no-daemon or -d).
This is debug mode. -k is more suitable:
-d, --no-daemon Debug mode: don't fork to the background, don't write a pid file, don't change user id, generate a complete cache dump on receipt on SIGUSR1, log to stderr as well as syslog, don't fork new processes to handle TCP queries. Note that this option is for use in debugging only, to stop dnsmasq daemonising in production, use -k.
though it has been conflicting with the bot's regular dnsmasq. That's an issue, as the bot won't be able to provide DHCP if you reset the WiFi. There should be a separate init script
It is already a separate init script, isn't it?
Regular dnsmasq is started from /opt/rockrobo/wlan/wifi_start.sh.
It's really hard to debug these problems. Is there any chance we "guess" the port the bot will later use to connect to the API? I don't think we have to wait for it to connect, do we?
I don't really know what's going on in these gen1, I've never had a single issue on gen2 with the miio client connecting to UDP 8053 at dustcloud. And yes, we do have to wait till it decides to connect. It never took that long anyway.
showed me there's no service with that name
Since the name is rrwatchdoge, that was an obvious typo. All init scripts can be easily found in /etc/init.
Regular dnsmasq is started from /opt/rockrobo/wlan/wifi_start.sh.
Thanks. That script is run when WiFi is reset, I assume. I don't see any calls to dnsmasq, but it'll work somehow, I guess. I think we should rename the init script, though. It put me on the wrong trail. I can send a PR for that, but I don't see where it's added in this repo. I'll check @zvldz's repo later.
Since the name is rrwatchdoge, that was an obvious typo. All init scripts can be easily found in /etc/init.
Right, my bad. Sorry for the noise. I'll try that and hopefully things work again. If not, I'll reflash once again...
Resetting worked fine, bot is up and running, but no map. I'll be running tcpdump now for more than 5 minutes, let's see if that will help.
Give it some time. Have you already started a new full clean?
I just have. It's been running for ~1 hour, says uptime (not sure if I or the watchdog timer or resetting the WiFi caused the reboot, though).
Edit: I think it still has a map, otherwise it would be moving differently. It's going much faster and doesn't collide as often.
@pidator it's been running for over 4 hours, having done two full runs. I doubt it's a timing problem, really. I've fetched the traffic dumps, and am going to look into them now.
@TheAssassin your issues at the moment are still the same:
??? (sorry for the review, but I just want to make sure I haven't missed anything during the tests in this topic)
Have you seen https://github.com/rand256/valetudo/issues/96 ? @Hacki1111 described a similar situation, the problem was the wlan SSID:
Yes, there must be a problem with special characters. The ssid contains a " \ " and a " / ". Try it! ;)
Another point to check:
What's the result of the command?
cat /mnt/data/valetudo/config.json | grep map_upload
On my Gen1, which I received recently, I cannot get the maps working. Controlling the robot works pretty much as intended (although I don't really understand the manual control mode), but no matter how often I let it clean, I never get any map. I've set up a small <1m² test course for the bot.
Due to the lack of logging, which is the only easy way to assess Valetudo's behavior, I cannot tell what's going wrong. I've tried the builds from the release page (0.8.2 and 0.9.0), a Dustbuilder image and even one with the original Valetudo (which didn't work at all).
To me it seems like there is map data (last_map for instance is not an empty file), but Valetudo won't receive it via the dustcloud fake service (at least I think Valetudo doesn't read the files directly but receives something via the socket on 808x). That's also the reason for the map files not being copied to userX. I'm not sure whether the map should've been copied to /mnt/data/valetudo/last_map, the code isn't completely clear on that.
In any case, I'd like to get into debugging this problem, really, but for someone like me who would rather NOT touch node.js, some logging would be nice to have. Perhaps you can tell me where to start hacking?
P.S.: Unfortunately I cannot give more information about what firmware it has after the factory reset, as mirobo ... info yields the infamous "vacuum not connected to cloud" error message.