Closed 1am closed 3 years ago
Hi,
Well, that looks like a network loop or a duplicate MAC address. Please post the output of the 'ifconfig' and 'ps' commands. Also describe your network topology: where eth0 and eth1 are connected, whether you use WiFi, etc. A tcpdump capture of the br-lan interface could be helpful too.
Hi,
The output is following:
# ifconfig
br-lan Link encap:Ethernet HWaddr C4:93:00:03:B0:99
inet addr:192.168.1.213 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::c693:ff:fe03:b099%2005042360/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:252 errors:0 dropped:0 overruns:0 frame:0
TX packets:264 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:14726 (14.3 KiB) TX bytes:16875 (16.4 KiB)
eth0 Link encap:Ethernet HWaddr C4:93:00:03:B0:99
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:5
eth1 Link encap:Ethernet HWaddr C4:93:00:03:B0:9A
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:252 errors:0 dropped:0 overruns:0 frame:0
TX packets:253 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:18254 (17.8 KiB) TX bytes:14442 (14.1 KiB)
Interrupt:4
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1%2005044952/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:572 errors:0 dropped:0 overruns:0 frame:0
TX packets:572 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:50992 (49.7 KiB) TX bytes:50992 (49.7 KiB)
# ps
PID USER VSZ STAT COMMAND
1 root 1536 S /sbin/procd
2 root 0 SW [kthreadd]
3 root 0 SW [ksoftirqd/0]
5 root 0 SW< [kworker/0:0H]
6 root 0 SW [kworker/u2:0]
7 root 0 SW< [khelper]
8 root 0 SW [kworker/u2:1]
29 root 0 SW< [writeback]
68 root 0 SW< [crypto]
69 root 0 SW< [bioset]
71 root 0 SW< [kblockd]
103 root 0 SW [kswapd0]
104 root 0 SW [kworker/0:1]
152 root 0 SW [fsnotify_mark]
187 root 0 SW [spi0]
290 root 0 SW< [ipv6_addrconf]
296 root 0 SW< [deferwq]
299 root 0 SW< [kworker/0:1H]
335 root 0 SW [kworker/0:2]
364 root 0 SWN [jffs2_gcd_mtd5]
453 root 1180 S /sbin/ubusd
456 root 1192 S /bin/ash --login
860 root 0 SW< [cfg80211]
965 root 1180 S /sbin/logd -S 16
974 root 1444 S /sbin/rpcd
1007 root 1628 S /sbin/netifd
1019 root 1284 S /usr/sbin/odhcpd
1060 root 1048 S /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p
1074 root 1288 S /usr/sbin/uhttpd -f -h /www -r g_v03 -x /cgi-b
1083 nobody 1540 S avahi-daemon: running [gv03.local]
1091 root 1408 S /usr/bin/rsync --daemon --no-detach
1170 root 1192 S /usr/sbin/ntpd -n -S /usr/sbin/ntpd-hotplug -p 0.ope
1208 nobody 1044 S /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k -x /va
1272 root 1192 R ps
My network topology is quite simple:
Carambola2 device > switch > router with internet connection.
I don't use WiFi on Carambola2
So you have eth1 port connected to switch and eth0 left unconnected when problem happens, right? Or eth0 is also connected to switch?
Only ETH0 is physically connected to anything, in my case a switch. ETH1 is not exposed outside of Carambola2 module.
The ifconfig log you sent shows that eth1 is the interface receiving packets, so what you call ETH0 is probably the eth1 interface in Linux. You can try this network config:
cat /etc/config/network
config interface 'loopback'
option ifname 'lo'
option proto 'static'
option ipaddr '127.0.0.1'
option netmask '255.0.0.0'
config globals 'globals'
option ula_prefix 'fda4:17a0:6a81::/48'
config interface 'lan'
option type 'bridge'
option proto 'static'
option netmask '255.255.255.0'
option ifname 'eth1'
option gateway '192.168.1.2'
option ipaddr '192.168.1.213'
list dns '192.168.1.1'
If this doesn't help, try connecting Carambola2 directly to PC and see if problem persists.
Another way to debug would be to capture packets between the switch and Carambola2. Since the problem happens during boot, the capture should be done externally, by placing a capture device (a Carambola2 dev board or a PC) with 2 bridged ETH ports between the switch and Carambola2.
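The check made above (which interface is actually receiving packets according to ifconfig) can be sketched as a small parser. This is only an illustration: the function names `rx_tx_by_interface` and `active_interfaces` are made up for this example, and the parser assumes BusyBox-style ifconfig output like the one pasted in this thread.

```python
import re

# Trimmed sample taken from the ifconfig output posted earlier in this thread.
SAMPLE = """\
eth0 Link encap:Ethernet HWaddr C4:93:00:03:B0:99
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
eth1 Link encap:Ethernet HWaddr C4:93:00:03:B0:9A
RX packets:252 errors:0 dropped:0 overruns:0 frame:0
TX packets:253 errors:0 dropped:0 overruns:0 carrier:0
"""


def rx_tx_by_interface(ifconfig_text):
    """Return {interface: (rx_packets, tx_packets)} from ifconfig text."""
    counters = {}
    current = None
    rx = None
    for line in ifconfig_text.splitlines():
        # A new interface block starts with "<name> Link encap:...".
        header = re.match(r"^(\S+)\s+Link encap:", line)
        if header:
            current = header.group(1)
            rx = None
            continue
        if current is None:
            continue
        m = re.search(r"RX packets:(\d+)", line)
        if m:
            rx = int(m.group(1))
            continue
        m = re.search(r"TX packets:(\d+)", line)
        if m:
            counters[current] = (rx, int(m.group(1)))
    return counters


def active_interfaces(ifconfig_text):
    """Return the names of interfaces that have received at least one packet."""
    return [name for name, (rx, _tx) in rx_tx_by_interface(ifconfig_text).items()
            if rx]


if __name__ == "__main__":
    print(active_interfaces(SAMPLE))
```

For the sample, only eth1 is reported as receiving traffic, which is what suggested that the externally wired port maps to eth1 in Linux.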
Hi Mantas-p
The thing is that this issue happens only once in a while: of the ~200 devices we've set up with the same hardware and flashed + configured with the same image, only one has had this issue so far.
This would be our second instance of the situation. It also isn't easy to reproduce, because we test the devices in two passes: the first pass checks whether the Carambola2 is accessible over ETH, and the second one, done after assembly, repeats this test along with some more. For this board the first test passed and some time later the second one failed. The same thing happened with the device I described in the original post: it was working fine until it stopped and never started working again. Flashing a fresh system doesn't help.
I've also tried with the network configuration you've pasted and the results are the same.
I'm also attaching a pcap file on which you can see some brief activity from the carambola device and router and pings with no response.
Hi,
From the packet capture: Carambola2 is able to send multicast packets but doesn't respond to ping requests. It would be interesting to also capture on Carambola2's eth1 port. For that you would need to have tcpdump installed in your image.
Have you tried flashing reference firmware (http://pkg.8devices.com/carambola2/v2.4/openwrt-ar71xx-generic-carambola2-squashfs-sysupgrade.bin) to defective device? Does it still print errors about receiving packets from own address? Does Ethernet work in bootloader on defective device?
We've tried flashing the reference firmware and it succeeded once from the bootloader over TFTP, but the same error persists on the new image. Sometimes we are able to connect to the Carambola device, but after a short while (around 30 seconds) it returns to the same error. We got lucky on that one flashing attempt but not on later ones. A colleague sent me a link to this OpenWRT change which could maybe be related? https://dev.openwrt.org/changeset/46821
Hi,
Please send a complete boot log from serial console (including bootloader messages). I need logs from both working and defective devices to compare what might be different. Please flash the same firmware on both devices before taking logs.
OpenWRT change seems to be related to Wifi radio, so not relevant in this case.
Hi,
I'm attaching the boot logs from two devices in the same network setup. Comparing the results I see no difference except that the broken one starts throwing br-lan: received packet on eth1 with own address as source address
after boot and doesn't get an IP address.
Hi,
can you try doing this, while connected via UART/serial console:
# ifconfig eth0 down
# ifconfig eth0 hw ether 70:12:00:00:00:00
# ifconfig eth0 up
# ifconfig eth1 down
# ifconfig eth1 hw ether 70:12:00:00:00:01
# ifconfig eth1 up
Result:
# ip link
...
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
link/ether 70:12:00:00:00:00 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 70:12:00:00:00:01 brd ff:ff:ff:ff:ff:ff
...
This will change MAC addresses on ethX interfaces. Does problem go away ?
Hi
Thank you. We've tried and ended up with the following results:
root@carambola_broken:/# /etc/init.d/network restart
[ 290.324718] device eth0 left promiscuous mode
[ 290.327690] br-lan: port 1(eth0) entered disabled state
[ 290.335903] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 290.341388] device eth1 left promiscuous mode
[ 290.344836] br-lan: port 2(eth1) entered disabled state
[ 290.367155] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
root@carambola_broken:/# [ 294.259364] device eth0 entered promiscuous mode
[ 294.281542] IPv6: ADDRCONF(NETDEV_UP): br-lan: link is not ready
[ 294.332809] device eth1 entered promiscuous mode
[ 295.775981] eth1: link up (100Mbps/Full duplex)
[ 295.779129] br-lan: port 2(eth1) entered forwarding state
[ 295.784557] br-lan: port 2(eth1) entered forwarding state
[ 295.790030] br-lan: received packet on eth1 with own address as source address
[ 295.799459] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
[ 295.824922] br-lan: received packet on eth1 with own address as source address
[ 296.030087] br-lan: received packet on eth1 with own address as source address
[ 296.374711] br-lan: received packet on eth1 with own address as source address
[ 296.380656] IPv6: br-lan: IPv6 duplicate address fe80::7212:ff:fe00:0 detected!
[ 296.404745] br-lan: received packet on eth1 with own address as source address
[ 297.324735] br-lan: received packet on eth1 with own address as source address
[ 297.784508] br-lan: port 2(eth1) entered forwarding state
[ 299.040387] br-lan: received packet on eth1 with own address as source address
[ 302.051075] br-lan: received packet on eth1 with own address as source address
root@carambola_broken:/# ifconfig
br-lan Link encap:Ethernet HWaddr 70:12:00:00:00:00
inet6 addr: fe80::7212:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1382 (1.3 KiB) TX bytes:1434 (1.4 KiB)
eth0 Link encap:Ethernet HWaddr 70:12:00:00:00:00
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:5
eth1 Link encap:Ethernet HWaddr 70:12:00:00:00:01
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:29 errors:0 dropped:0 overruns:0 frame:0
TX packets:30 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:6640 (6.4 KiB) TX bytes:6940 (6.7 KiB)
Interrupt:4
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:176 errors:0 dropped:0 overruns:0 frame:0
TX packets:176 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:12024 (11.7 KiB) TX bytes:12024 (11.7 KiB)
root@carambola_broken:/# [ 305.060730] br-lan: received packet on eth1 with own address as source address
root@carambola_broken:/# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel master br-lan state DOWN mode DEFAULT group default qlen 1000
link/ether 70:12:00:00:00:00 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br-lan state UP mode DEFAULT group default qlen 1000
link/ether 70:12:00:00:00:01 brd ff:ff:ff:ff:ff:ff
4: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether c4:93:00:04:e5:26 brd ff:ff:ff:ff:ff:ff
8: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 70:12:00:00:00:00 brd ff:ff:ff:ff:ff:ff
What is surprising is the [ 296.380656] IPv6: br-lan: IPv6 duplicate address fe80::7212:ff:fe00:0 detected!
message. We're using IPv4 and have no IP address conflicts (at the time of writing addresses are assigned via DHCP), so I wouldn't expect any conflicts in IPv6 either.
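To compare a working and a broken unit it can help to pull just the bridge warnings out of a console log. The following is a minimal sketch, not part of any OpenWrt tooling: the function name `own_address_events` is hypothetical, and the pattern assumes the kernel message format exactly as it appears in the log above.

```python
import re

# Matches lines such as:
# [  295.790030] br-lan: received packet on eth1 with own address as source address
PATTERN = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(\S+): received packet on (\S+) "
    r"with own address as source address"
)

# A few lines copied from the log posted above.
SAMPLE_LOG = """\
[ 295.784557] br-lan: port 2(eth1) entered forwarding state
[ 295.790030] br-lan: received packet on eth1 with own address as source address
[ 295.799459] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
[ 295.824922] br-lan: received packet on eth1 with own address as source address
[ 296.030087] br-lan: received packet on eth1 with own address as source address
root@carambola_broken:/# [ 305.060730] br-lan: received packet on eth1 with own address as source address
"""


def own_address_events(log_text):
    """Return a list of (timestamp, bridge, port) for each matching message."""
    return [(float(ts), bridge, port)
            for ts, bridge, port in PATTERN.findall(log_text)]


if __name__ == "__main__":
    for ts, bridge, port in own_address_events(SAMPLE_LOG):
        print(f"{ts:.6f} {bridge} {port}")
```

Running the same function over the boot logs of a good and a bad device gives a quick way to see when the warnings start and on which port they occur.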
@1am, I have seen similar problems on Qualcomm/Atheros based devices, with damaged built-in Ethernet controller/s and/or transformers (ex. after a storm).
Cheers, Piotr
Message:
[ 296.380656] IPv6: br-lan: IPv6 duplicate address fe80::7212:ff:fe00:0
fe80::7212:ff:fe00:0 is an automatically generated link-local address. Usually it is derived from the interface MAC address, with some randomness added if IPv6 privacy extensions are enabled. When the interface goes up, the kernel performs DAD (Duplicate Address Detection), hence this message ...
It seems that the kernel sends a DAD request on the br-lan interface. br-lan consists of two ports: eth0 (state=down) and eth1 (state=up). So the kernel's bridge code skips eth0 (does not send the packet) and should send on eth1 only ... and apparently the same packet is looped back on the eth1 interface ...
No wonder such messages are seen:
[ 295.824922] br-lan: received packet on eth1 with own address as source address
[ 296.030087] br-lan: received packet on eth1 with own address as source address
[ 296.374711] br-lan: received packet on eth1 with own address as source address
I don't understand how it can happen. It might be a faulty unit like @pepe2k said. I had never seen anything like this. Time to replace a faulty unit ?
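The relation between the MAC address and the link-local address mentioned above can be checked with a short sketch. It assumes the kernel uses the modified EUI-64 scheme (flip the universal/local bit of the first octet and insert ff:fe in the middle) and not privacy or stable-privacy addressing; the function name `link_local_from_mac` is made up for this example.

```python
import ipaddress


def link_local_from_mac(mac):
    """Derive the modified EUI-64 IPv6 link-local address for a MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a MAC address with 6 octets: %r" % mac)
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the two halves of the MAC address.
    interface_id = octets[:3] + [0xFF, 0xFE] + octets[3:]
    # fe80::/64 prefix: fe80 followed by 48 zero bits.
    prefix = bytes([0xFE, 0x80]) + bytes(6)
    return ipaddress.IPv6Address(prefix + bytes(interface_id))


if __name__ == "__main__":
    # MAC assigned to eth0 / br-lan in the test above.
    print(link_local_from_mac("70:12:00:00:00:00"))
    # Factory MAC of br-lan from the first ifconfig output.
    print(link_local_from_mac("C4:93:00:03:B0:99"))
```

Both results match the addresses in the logs (fe80::7212:ff:fe00:0 and fe80::c693:ff:fe03:b099), which is consistent with the duplicate-address message being triggered by the device's own DAD probe coming back in on eth1.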
We've seen such problems on a few of our boards, too (though not sure it had always been exactly the same symptoms). In a few cases, it was sufficient to replace the Ethernet transformer, in one case the Carambola2 board seemed to not be 100% planar: resoldering it solved the problem.
@pepe2k, @bome I've tried replacing the transformer for my ETH switch (luckily I don't have ETH + transformer in one) but the results are the same. The Carambola2 modules are also lying flat and are mounted properly.
@valinskas The unit was not faulty: it worked normally and nothing bad (like storms) happened to it. In the first occurrence the device was working and after a while it suddenly stopped with this error. With the latest instance, the device had been off for around 2-3 days after initial setup before the error occurred. Carambola2 is neither easy nor cheap to replace, and I'd really need to understand the source of the problem in case it happens to other devices from the current batch. So far it has been observed on 2 of a few hundred Carambola2-based devices.
Hardware wise I'd say everything looks ok but for some reason the problem persists over system flashing. Is it possible that there's something funny happening in bootloader?
@1am, please contact 8devices support at: support@8devices.com to arrange replacement of defective modules. When contacting support, please send link to this ticket. Also please leave a note in shipment, saying "For Mantas P."
Ensure that your 3.3V power supply startup starts with no droop. I noticed the same error, and when I fixed my supply startup the PHY worked again.
Closing issue, about 2 years since last activity. Assuming the issue was resolved.
Hi
I've built a few devices using Carambola2 modules and they have been functioning correctly for around 6 months. A few days ago one of the boards went offline and started reporting a strange error in dmesg and over the UART connection. All of this happens after booting:
I've inspected the board and found no hardware issues so far. Other devices with the same configuration (including the same IP address) are working. I'm sure no two devices are trying to use the same IP address, as I've connected only one of them at a time. Searching the internet I haven't found many leads, except that this can happen when there's a loop on the local network... but that would not really be the case if, of two devices with identical configuration, one works fine and the other doesn't. My network configuration for all devices is the following:
And just one of them stopped working while no changes were made in the network. Has anyone experienced such issues before?