Closed ThomasKaiser closed 7 years ago
@ThomasKaiser TK, I agree with you.
Slowing support & stopping later, yes sure. I am waiting for the Turris Omnia (+NAS box) to arrive & to replace the R1 (I will be keeping R1 as a "smart" lan switch).
Idea:
legacy version: ?
next version: forcing a step back to kernel 4.4.x and letting the R1 die slowly.
dev version: mainline with a disclaimer for swconfig
This device can still be used as anything other than a router, and b53 switch driver is now a part of mainline kernel. So we don't have to drop updates for the device completely, instead building beta images once a month should be enough, and if users are interested in this device, they will have to test the beta image, and if it is reported as working, stable update should be made based on tested version.
next version: forcing a step back to kernel 4.4.x and letting the R1 die slowly.
This will turn our build script into a mess since sunxi-next kernel branch and sunxi-next set of patches are used for many other devices, and we can't split this without breaking backwards compatibility.
I am waiting for the Turris Omnia
@golfromeo-fr I think it's not about you or your plans with R1 or any successor. It's about
But if there are no people willing to test prior to releases or unforeseen events like Dirty COW now then it's simply a matter of focusing on devices worth the effort and removing Lamobo R1 from list of automated builds.
@ThomasKaiser @zador-blood-stained I understand and agree. Thanks. I've added some tasks for XU4 (incl. Zador remarks). I will try to help here also.
I would not change the current schema because of last events.
IMO we need at least one person 'feeling' responsible for devices like R1 that are
So someone caring about this device could've taken notice that there's a change with 4.8 and started early to develop a fix that can be included in Armbian long before the kernel update (in our case now needed due to Dirty COW) is rolled out.
I believe we've a lot of users capable of doing this amongst us (see https://github.com/igorpecovnik/lib/issues/514 for example) but IMO it's necessary to improve testing/release cycles and get them on board.
I would not like to see support for this device dropped. Basically, it is just an A20 board with a b53 and some crappy rtl8192cu on the USB (which any other board could have, as well). Although I feel pretty busy on my other projects, if this is the price I have to pay to keep this device supported, then I volunteer to do some testing (hopefully just every now and then).
I was referring to not dropping the board.
IMO we need at least one person 'feeling' responsible for devices like R1 that are
Yes, that's the idea. Before we start next testing period, we have to deal with roles. I would propose to start with a simple check list. We can expand it later, on the way, when we got the system working. For now, only basic parameters. If we need something extra, we could add it to armbianmonitor. When we build images, they must (only) boot and be able to connect to network. The rest is usually fixable with update.
I was referring to not dropping the board.
I know but this can only work with a changed release/test policy since otherwise we don't get support by users testing stuff prior to release.
As for automated tests I already thought about that in the past and maybe it can work like this:
- firstrun (called with a trailing &)
- /var/log/armbian-initial-tests.log
- ping -c10 $(ip route show default | awk '/default/ {print $3}')
- ping -c1 armbian-test.local >/dev/null 2>&1 and based on result then a bunch of other tests could be fired up

Idea behind: When we want to do automated testing after image creation then all that's needed is another host in the network that registers itself through avahid as armbian-test.local and has iperf3 -s running. Also the pure existence of an armbian-test.local can be used to fire up a few more non-network tests on the board (like e.g. running sysbench).
In case firstrun detects full test mode based on existence of this other host we could then use show_motd_warning functionality to inform user about test results. So in the end our testers have to download a new image, start it unattended, log in 10 minutes later and see whether everything's ok or not.
The list of tests can easily be extended/updated/adjusted as needed and even with boards like Orange Pi Lite or NanoPi Air we can do a fully automated test based on the assumption that we (or experienced testers) simply use another SBC (called armbian-test.local ;) ) with active AP providing a Wi-Fi network called ARMBIAN-$some-garbage-here where $some-garbage-here contains the passphrase in an obfuscated form. So this is just 2 lines using nmcli to establish an unattended Wi-Fi test setup.
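The test flow proposed above could be sketched roughly like this. It is a minimal sketch and not actual Armbian firstrun code: the function names and the `run_initial_tests` entry point are hypothetical, and writing all results to the log file is an assumption. Only the two ping probes, the log path, and running iperf3 against armbian-test.local come from the proposal.

```bash
#!/bin/sh
# Hypothetical helper for the proposed firstrun self-test (not actual Armbian code).

LOG="${LOG:-/var/log/armbian-initial-tests.log}"   # log path from the proposal
TEST_HOST="${TEST_HOST:-armbian-test.local}"       # peer announced via avahi

# Extract the gateway address from `ip route show default` output on stdin.
parse_default_gateway() {
    awk '/^default/ {print $3; exit}'
}

# Return 0 when the dedicated test host answers a single ping (full test mode).
test_host_present() {
    ping -c1 "$TEST_HOST" >/dev/null 2>&1
}

# Basic check first, extended tests only if the test host exists.
run_initial_tests() {
    gw=$(ip route show default | parse_default_gateway)
    {
        echo "gateway: ${gw:-none}"
        if [ -n "$gw" ] && ping -c10 "$gw" >/dev/null 2>&1; then
            echo "basic network: OK"
        else
            echo "basic network: FAILED"
        fi
        if test_host_present; then
            echo "full test mode: $TEST_HOST found"
            iperf3 -c "$TEST_HOST"   # the peer is expected to run `iperf3 -s`
        fi
    } >>"$LOG" 2>&1
}

# firstrun would start this in the background:  run_initial_tests &
```

Keeping the gateway parsing in its own function makes that step checkable on any machine without touching the network.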
And no, no idea about A33-OlinuXino (but I fear I have to admit that I've not the slightest idea what to do with it anyway -- zero connectivity to the outside isn't compatible with my use cases)
I did another attempt to upgrade to 5.23 (this time from 5.20), made sure to have swconfig installed, too. But that didn't work out. After realizing, that no b53 driver was loaded, I modprobe'd b53_common and the corresponding mdio driver. That was the result in dmesg:
    [ 411.023547] libphy: mdio_driver_register: bcm53xx
    [ 411.024377] b53_common: found switch: BCM53125, rev 4
    [ 411.024607] DSA: switch 0 0 parsed
    [ 411.024620] DSA: tree 0 parsed
    [ 411.169920] libphy: dsa slave smi: probed
    [ 411.266978] Generic PHY dsa-0.0:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:00, irq=-1)
    [ 411.366953] Generic PHY dsa-0.0:01: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:01, irq=-1)
    [ 411.466962] Generic PHY dsa-0.0:02: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:02, irq=-1)
    [ 411.567005] Generic PHY dsa-0.0:03: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:03, irq=-1)
    [ 411.667091] Generic PHY dsa-0.0:04: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:04, irq=-1)
    [ 411.675727] bcm53xx stmmac-0:1e: Configured port 8 for rgmii
    [ 413.310845] Link is Up - 1000/Full
As a result, all switch ports got separated. swconfig list however did not find any switch. Digging further into the topic and DSA, it appears that swconfig won't be necessary any more (so no need to have such dependency on lamobo kernel images), and configuration should be done with ip and brctl.
As the dmesg output shows, PHY devices are created for each external ethernet port, with aliases lan1 to lan4 and wan for convenience. Furthermore, I was able to set links up or down using ifconfig wan up and ifconfig wan down (same for other ports). There was also a dependency between those ports and eth0.101/eth0.102 in the sense that the corresponding eth0.10x had to be up before those ports could be configured. This is the dmesg log when trial-and-erroring:
    [ 2687.703430] IPv6: ADDRCONF(NETDEV_UP): wan: link is not ready
    [ 2689.328289] bcm53xx stmmac-0:1e wan: Link is Down
    [ 2691.408503] bcm53xx stmmac-0:1e wan: Link is Up - 100Mbps/Full - flow control off
    [ 2691.408567] IPv6: ADDRCONF(NETDEV_CHANGE): wan: link becomes ready
    [ 2693.488812] bcm53xx stmmac-0:1e wan: Link is Down
    [ 2695.568792] bcm53xx stmmac-0:1e wan: Link is Up - 100Mbps/Full - flow control off
    [ 2697.649028] bcm53xx stmmac-0:1e wan: Link is Down
    [ 2698.689012] bcm53xx stmmac-0:1e wan: Link is Up - 100Mbps/Full - flow control off
    [ 2700.769130] bcm53xx stmmac-0:1e wan: Link is Down
    [ 2702.849329] bcm53xx stmmac-0:1e wan: Link is Up - 100Mbps/Full - flow control off
    [ 2849.529885] IPv6: ADDRCONF(NETDEV_UP): lan1: link is not ready
    [ 2850.619736] bcm53xx stmmac-0:1e lan1: Link is Down
    [ 2851.659851] bcm53xx stmmac-0:1e lan1: Link is Up - 100Mbps/Full - flow control rx/tx
    [ 2851.659915] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready
    [ 2892.222626] bcm53xx stmmac-0:1e lan1: Link is Down
    [ 2905.854285] IPv6: ADDRCONF(NETDEV_UP): lan2: link is not ready
    [ 2907.423733] bcm53xx stmmac-0:1e lan2: Link is Down
    [ 2908.463875] bcm53xx stmmac-0:1e lan2: Link is Up - 100Mbps/Full - flow control off
    [ 2908.463940] IPv6: ADDRCONF(NETDEV_CHANGE): lan2: link becomes ready
    [ 2921.984511] bcm53xx stmmac-0:1e lan2: Link is Down
    [ 2926.780949] IPv6: ADDRCONF(NETDEV_UP): lan3: link is not ready
    [ 2928.385226] bcm53xx stmmac-0:1e lan3: Link is Down
    [ 2931.505480] bcm53xx stmmac-0:1e lan3: Link is Up - 1Gbps/Full - flow control rx/tx
    [ 2931.505547] IPv6: ADDRCONF(NETDEV_CHANGE): lan3: link becomes ready
    [ 2936.705536] bcm53xx stmmac-0:1e lan3: Link is Down
    [ 2941.078079] IPv6: ADDRCONF(NETDEV_UP): lan4: link is not ready
    [ 2943.026239] bcm53xx stmmac-0:1e lan4: Link is Down
    [ 2944.066393] bcm53xx stmmac-0:1e lan4: Link is Up - 100Mbps/Full - flow control off
    [ 2944.066460] IPv6: ADDRCONF(NETDEV_CHANGE): lan4: link becomes ready
    [ 2989.829301] bcm53xx stmmac-0:1e lan4: Link is Down
    [ 3012.533573] br0: port 2(wlan0) entered disabled state
    [ 3012.533741] br0: port 1(eth0.102) entered disabled state
    [ 3035.469970] IPv6: ADDRCONF(NETDEV_UP): lan1: link is not ready
    [ 3036.792867] bcm53xx stmmac-0:1e lan1: Link is Down
    [ 3038.873087] bcm53xx stmmac-0:1e lan1: Link is Up - 100Mbps/Full - flow control rx/tx
    [ 3038.873154] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready
    [ 3199.044075] bcm53xx stmmac-0:1e lan1: Link is Down
    [ 3210.702451] IPv6: ADDRCONF(NETDEV_UP): lan2: link is not ready
    [ 3212.165292] bcm53xx stmmac-0:1e lan2: Link is Down
    [ 3214.245499] bcm53xx stmmac-0:1e lan2: Link is Up - 100Mbps/Full - flow control off
    [ 3214.245563] IPv6: ADDRCONF(NETDEV_CHANGE): lan2: link becomes ready
    [ 3264.168700] bcm53xx stmmac-0:1e lan2: Link is Down
    [ 3268.699627] IPv6: ADDRCONF(NETDEV_UP): lan3: link is not ready
    [ 3272.649638] bcm53xx stmmac-0:1e lan3: Link is Up - 1Gbps/Full - flow control rx/tx
    [ 3272.649705] IPv6: ADDRCONF(NETDEV_CHANGE): lan3: link becomes ready
    [ 3304.891576] bcm53xx stmmac-0:1e lan3: Link is Down
    [ 3309.532806] IPv6: ADDRCONF(NETDEV_UP): lan4: link is not ready
    [ 3311.212363] bcm53xx stmmac-0:1e lan4: Link is Up - 100Mbps/Full - flow control off
    [ 3311.212428] IPv6: ADDRCONF(NETDEV_CHANGE): lan4: link becomes ready
I assigned valid IPs to the wan and lan aliases, but every ping to my devices on the corresponding ports failed. After that, I went back to 5.20 and tried to gather further information. One good explanation of the DSA concept can be found at http://trac.gateworks.com/wiki/linux/vlan#LinuxDistributedSwitchArchitecture. Other than that, I will look into the documentation of ip and brctl to see what useful features they hide under the hood. Until we have all the issues figured out, would there be a problem in continuing to build the old openwrt driver for the b53 with swconfig? That would make testing less painful, since it would just involve loading and unloading kernel modules to test some configurations.
Unless there is a testing branch and appropriate mechanisms established and volunteers are known who care for this weird switchboard (it isn't and never will be a routerboard) I still think the best thing is to drop support (implement checks in package upgrading scripts and skip the device so it remains at 4.7 until someone cares about the switchboard again).
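A check of the kind suggested here, run from a package upgrade script, might look roughly like the following. This is a hedged sketch only: the release file layout (a `BOARD=` line), the board identifier `lamobo-r1`, the kernel package name patterns, and all function names are assumptions for illustration, not taken from the Armbian build scripts.

```bash
#!/bin/sh
# Hypothetical sketch: keep a "frozen" board on its current kernel during upgrades.
# Board identifier, release file layout and package patterns are assumptions.

FROZEN_BOARDS="lamobo-r1"

# Print the BOARD= value from an Armbian-style release file (path passed as $1).
board_from_release() {
    sed -n 's/^BOARD=//p' "$1" | tr -d '"' | head -n 1
}

# Succeed when the given board name is on the frozen list.
board_is_frozen() {
    for b in $FROZEN_BOARDS; do
        [ "$b" = "$1" ] && return 0
    done
    return 1
}

# Put every kernel/dtb package known to dpkg (assumed name patterns) on hold.
hold_kernel_packages() {
    dpkg-query -W -f '${Package}\n' 'linux-image-*' 'linux-dtb-*' 2>/dev/null |
        xargs -r apt-mark hold
}
```

A maintainer script could then call something like `board_is_frozen "$(board_from_release /etc/armbian-release)" && hold_kernel_packages`, where the release file path is again an assumption.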
probably swconfig is no longer needed anymore, maybe "bridge" instead http://lwn.net/Articles/634787/ Just an idea, we have experts here
i've tried to tackle this issue too, but without much success. theoretically it should 'just work', practically it doesnt. script i've used:
    ip link set eth0 up
    brctl addbr lan
    for i in lan1 lan2 lan3 lan4; do
        ip addr flush $i
        ip link set $i up
        brctl addif lan $i
    done
    ip addr add 192.168.1.1/24 brd 192.168.1.255 dev lan
    ip link set lan up
and while apparently switch ports regained forwarding capability, i cant connect to/from the lamobo. something makes the packets get dropped when they are originated from the device (they show on eth0, but not on any of the lanX interfaces or lan bridge). let's see if the driver author/maintainer responds.
more funnies, leave switch unconfigured, assign local lan ip to eth0, set /proc/sys/net/ipv4/conf/eth0/{forwarding,proxy_arp} to 1, and voila! switch acts like a dumb hub where everything sees everything (even with downed lanX ports o.o). while this might not be the wanted case, if one only needs this device for local lan functionality with mainline, its nice, in a dumb way of nice
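The "dumb hub" setup described in this comment can be written down as a short script. This is an untested sketch under assumptions: the address is an example value, the `run` dry-run wrapper and the function name are hypothetical, and the second sysctl uses `proxy_arp_pvlan`, which is the setting name the author gives in a later correction in this thread.

```bash
#!/bin/sh
# Sketch of the "dumb hub" setup; prints the commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# $1 = address/prefix to assign to eth0 (example value, an assumption)
setup_dumb_hub() {
    addr="$1"
    run ip link set eth0 up
    run ip addr add "$addr" dev eth0
    # Equivalent to writing 1 into /proc/sys/net/ipv4/conf/eth0/...
    run sysctl -w net.ipv4.conf.eth0.forwarding=1
    run sysctl -w net.ipv4.conf.eth0.proxy_arp_pvlan=1
}
```

Calling `setup_dumb_hub 192.168.1.2/24` prints the four commands for review. Running with `DRY_RUN=0` as root would apply them.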
switch acts like a dumb hub where everything sees everything
This is the only mode this device should be operated in since U20 (the EEPROM meant to save switch state, so the dumb switch could be brought up in a way where not each and every port is interconnected at layer 2) isn't populated on this crappy device.
Still: best idea would be to drop support entirely since currently we help users actively fooling themselves since people don't want to understand that this is not a routerboard but just a dumb switchboard.
BTW: When using the switchboard as such is GbE performance still crappy or normal A20 level (exceeding 300 Mbits/sec)?
@ThomasKaiser otoh, would be interesting if adding this eeprom could make it remember the setting
as for the speed, serving file from /tmp:
    /dev/null 100%[====================================>] 230.00M 52.9MB/s in 4.5s
    2016-11-02 17:40:37 (51.1 MB/s) - '/dev/null' saved [241172480/241172480]
(receiver is bpi-m1)
Still: best idea would be to drop support entirely since currently we help users actively fooling themselves since people don't want to understand that this is not a routerboard but just a dumb switchboard.
Still, the best idea IMO is to write on the download page that network features are supported only on legacy kernel due to changes to mainline. At least until we can provide at least basic network connectivity out of the box with 4.8+ kernels. Or temporarily add the old swconfig-compatible driver to the next branch and leave dev branch for DSA tests.
previous result was with bpi-m1 as a receiver, this one is using thinkpad t500:
    /dev/null 100%[==================================>] 230.00M 62.4MB/s in 3.8s
    2016-11-02 17:43:54 (59.9 MB/s) - '/dev/null' saved [241172480/241172480]
and this one is when serving the file from bpi-m1 to thinkpad t500:
    /dev/null 100%[==================================>] 240.00M 25.5MB/s in 9.6s
    2016-11-02 17:49:49 (25.1 MB/s) - '/dev/null' saved [251658240/251658240]
Still, the best idea IMO is to write on the download page that network features are supported only on legacy kernel due to changes to mainline.
Sure, it's simply adjusting https://github.com/igorpecovnik/lib.docs/blob/master/docs/boards/lamobo-r1.md so please feel free to add this in big red letters.
Still people will run into upgrade troubles and in case no one with a R1 around and feeling responsible for the device will join development/testing efforts the next time a kernel upgrade happens, this will repeat.
@kotc: Thanks for the numbers (though I would've preferred real measurements using iperf3 instead), I rebooted the one R1 I unfortunately bought last year at a customer yesterday (still running 3.4.108, uptime 199 days) and might switch to 4.8 immediately (since only switch mode is needed and with 4.8 currently performance doesn't seem to suck that much as with this b53 stuff with both legacy/vanilla kernels before)
@ThomasKaiser but remember, this config is somewhat hacky/invalid, also, why not 4.9? also, it's proxy_arp_pvlan, not proxy_arp
@kotc New DSA based driver was merged in 4.8, before that we had the old one from OpenWRT (swconfig-compatible)
@zador-blood-stained just wanted to know why sticking to 4.8 when 4.9 is almost baked. @ThomasKaiser if you provide me the command lines i can do them (both hosts running armbian so should be no problem)
@kotc -rc3 means it's only around half-way through, usually there will be 6-8 release candidates. And since Armbian's "next" branch points at stable releases, linux-sunxi-next in the repository will be 4.8.x for a while.
Or temporarily add the old swconfig-compatible driver to the next branch and leave dev branch for DSA tests.
Let's do this if it's not too complicated? Adding the patch back in? Anything else? Even if we write things on the download page ... people usually ask first rather than read when things fail.
Even if we write things on the download page
Why should people look at the download page if they do an apt upgrade? Seriously: the only way to prevent such upgrade hassles is to either drop support for the board or change release/testing policies and search for at least one person able/willing to test through new releases prior to pushing them out.
OK, if / when we choose to stop dealing with a board we should at least leave it in a working state. That means our last update should remove sources list or similar?
Even if I started this here (playing my provocative role as usual) I would prefer to be able to announce stuff like that since this might be the event getting users/testers/volunteers on board. But still this won't work if we don't improve regarding #512
If I understood correctly @hknaack is both willing and able to test through stuff (me unfortunately not, the R1 is serving in productive use) so if it's possible to use the old drivers with 4.8 then this might be the best short term solution as @zador-blood-stained suggested.
bpi-r1 as server, -m1 as client, both set to fixed 912MHz.
    [ ID] Interval         Transfer    Bandwidth      Retr
    [  4] 0.00-10.01 sec   792 MBytes  664 Mbits/sec  0     sender
    [  4] 0.00-10.01 sec   792 MBytes  664 Mbits/sec        receiver
reverse:
    [  4] 0.00-10.00 sec   321 MBytes  269 Mbits/sec  6194  sender
    [  4] 0.00-10.00 sec   321 MBytes  269 Mbits/sec        receiver
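For anyone wanting to repeat this measurement, the two directions reported above correspond to a normal iperf3 run and one with `-R` (reverse). A minimal sketch follows. The server address is a placeholder, the `run` wrapper and function name are hypothetical, and the exact options used for the numbers above are not stated in the thread.

```bash
#!/bin/sh
# Sketch: measure throughput in both directions against a host running `iperf3 -s`.
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# $1 = address of the iperf3 server (here: the R1), a placeholder value
measure_throughput() {
    server="$1"
    run iperf3 -c "$server" -t 10      # client sends, server receives
    run iperf3 -c "$server" -t 10 -R   # reverse mode: server sends, client receives
}
```

On the server side a plain `iperf3 -s` is enough. `measure_throughput 192.168.1.1` on the client prints the two commands, and `DRY_RUN=0` runs them.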
Let's do this if it's not too complicated? Adding the patch back in?
Done: 077a3dd. At least compilation still works.
OK, if / when we choose to stop dealing with a board we should at least leave it in a working state. That means our last update should remove sources list or similar?
We can also put up a one time MOTD warning (in board support package) like we do with slow SD cards and docs.armbian.com link at first boot. Won't work for very old releases where board support packages don't update properly, but for newer ones it's a simple enough option.
@hknaack @kotc (possibly fixed) Lamobo R1 image NEXT will be among daily test builds - available in a few hours from now, build 161102.
@zador-blood-stained Tnx for updating patch. Perhaps we can define a date when support ends in a board config?
@ThomasKaiser I'll move on that discussion asap.
Perhaps we can define a date when support ends in a board config?
I don't think this is needed yet,
If you want you can modify download pages for boards that are not tested properly to include clearly visible "Limited support" notice and "Testers wanted" link to forum thread or section related to beta images or test requests/reports.
@zador-blood-stained have you checked if it works? i'm either missing some patches, dts definitions or 4.9 is incompatible with that patch. it's getting compiled but swconfig can't find the switch
@kotc
Don't have this board to test, sorry. Looks like I missed DT patch, will try to add it ASAP.
@zador-blood-stained if you drop me kernel+dts+dtb+modules tgz i can check if it boots/works here
@kotc
https://www.dropbox.com/sh/md7cfrdow6xmlxq/AACG8LAowlQNhoA817v8kC5wa?dl=0
You should be able to extract necessary files from the packages or just test them on top of an Armbian image for R1 with mainline kernel.
noope. swconfig doesnt work, might it be something changed regarding swconfig interface?
hmm. either my switch went bananas from all those experiments or something is fishy. i've just tried kernel/dtb/modules from armbian 5.17 (kernel 4.6.2-sunxi) and legacy kernel 3.4.112-sun7i. same output. i hope i didnt kill it ;)
What swconfig did you use? Please try this one: http://apt.armbian.com/pool/utils/s/swconfig/swconfig_15.04-2~armbian5.23%2B1_armhf.deb
ahm. i'm stupid. i was missing ifconfig eth0 up ;) it showed on 4.6.8 armbian kernel. not on yours though. anyway, bed time, gotta check it tomorrow in the morning
I rebuilt the kernel to switch b53 from module to built-in since old driver doesn't have DT compatible property for auto probing.
I contacted Florian Fainelli, the submitter of the new b53 driver, on how to configure a router (bridged lan1-4 with wlan, separate wan), this is his response:
Sure, so there are a few caveats since we implement DSA without a tagging protocol, but basically, what you would want to do is this:
    # Create a bridge which is mandatory for Bridge VLAN filtering to work
    brctl addbr br-lan
    brctl addif br-lan lan0
    brctl addif br-lan lan1
    brctl addif br-lan lan2
    brctl addif br-lan lan3
    brctl addif br-lan wan

Once there, you can configure different VLANs on the LAN interfaces, the default VLAN ID is 1, unless configured otherwise:

    for lan in $(seq 0 3)
    do
        bridge vlan add vid 2 dev lan$lan
        bridge vlan del vid 1 dev lan$lan pvid untagged
    done

    vconfig add eth0.2

and you should have now eth0 receive packets from "wan" by default, and eth0.2 receive all the LAN traffic
There are currently two limitations with DSA and B53 that I plan on addressing:
- the bridge master device: br-lan is actually our view of the switch's CPU port, but it cannot be configured in a way that the CPU would receive only tagged traffic (thus requiring eth0.1 and eth0.2)
- since we do not support Broadcom tags on b53, we cannot segregate traffic from "wan" and "lan0-3" other than by putting them in separate VLANs, but once Broadcom tags are in place "wan" alone can be used, and ports would be properly separated
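Read as a whole, the sequence in the quoted reply could be restated with iproute2 tools only, roughly as below. This is an untested reinterpretation, not the driver author's script. Enabling `vlan_filtering` when creating the bridge, attaching `pvid untagged` to the `add` command, bringing every interface up, and the `run` wrapper and function name are all assumptions. The port names default to the lan0..lan3 used in the reply, while the dmesg output earlier in this thread shows lan1..lan4, so they are configurable.

```bash
#!/bin/sh
# Untested sketch of the quoted DSA/VLAN setup using iproute2 only.
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

setup_dsa_vlans() {
    # Port names as used in the quoted reply; override via LAN_PORTS if they differ.
    lan_ports="${LAN_PORTS:-lan0 lan1 lan2 lan3}"

    # Bridge with VLAN filtering enabled (assumption: the reply only says it is needed).
    run ip link add name br-lan type bridge vlan_filtering 1
    for port in $lan_ports wan; do
        run ip link set "$port" master br-lan
        run ip link set "$port" up
    done

    # Move the LAN ports from the default VLAN 1 to VLAN 2.
    for port in $lan_ports; do
        run bridge vlan add vid 2 dev "$port" pvid untagged
        run bridge vlan del vid 1 dev "$port"
    done

    # VLAN 2 sub-interface on the CPU port for the LAN traffic.
    run ip link add link eth0 name eth0.2 type vlan id 2
    run ip link set eth0 up
    run ip link set eth0.2 up
    run ip link set br-lan up
}
```

With the default dry-run mode, `LAN_PORTS="lan1 lan2 lan3 lan4" setup_dsa_vlans` only prints the commands, which makes it easy to compare them against the quoted sequence before trying anything on the board.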
@hknaack nice. but this sequence of commands doesnt do the trick. it might be missing something trivial (as setting some interface up with ip etc). but at least he is willing to communicate, thanks!
@hknaack also, thank you very much for that dts patch! now my banana works with mainline (4.9.0-rc3) and as a pseudo router! good riddance to that murky 3.4 kernel, thanks again :)
uh, of course in the last comment thanks go to @zador-blood-stained :)
@kotc Thanks for testing 😄
@igorpecovnik I tested this build <1> (extracted the filesystem image and put it on a separate partition on my SD card, then issued the appropriate boot commands in uboot console) and it was missing swconfig in the first place, so the switch didn't work out of the box. After copying swconfig and my /etc/network/interfaces from my main partition, basic networking worked (apt-get, ping). Anything else you like to get checked?
<1>http://image.armbian.com/betaimages/Armbian_5.24.161104_Lamobo-r1_Ubuntu_xenial_4.8.6.7z
This device suffers from a few fundamental problems, the most severe claiming to be useable as a router which is not the case.
For R1 users to be able to fool themselves it's necessary that an external piece of software called swconfig works to configure the dumb Broadcom switch which caused problems with 5.20 update and maybe now also with 5.23 (wouldn't call this a bug report since zero information has been provided to be able to even understand what might be happening).
Maybe switching to kernel 4.8 with sunxi-next branch as part of the 5.22 update to fix Dirty COW caused again an incompatibility with this swconfig tool, maybe it's something else. R1 users do not care about security that much so they don't need security updates that urgently or at all.
So let's drop further support for this device and stop providing updates. Unhappy users can switch to Bananian for example since less upgrades are considered a feature and not a problem for sure.