armbian / build

Armbian Linux build framework generates custom Debian or Ubuntu image for x86, aarch64, riscv64 & armhf
https://www.armbian.com
GNU General Public License v2.0

[RfE] Drop support for Lamobo R1 #511

Closed ThomasKaiser closed 7 years ago

ThomasKaiser commented 7 years ago

This device suffers from a few fundamental problems, the most severe being that it claims to be usable as a router, which is not the case.

For R1 users to be able to fool themselves, an external piece of software called swconfig has to work to configure the dumb Broadcom switch. This caused problems with the 5.20 update and maybe now also with 5.23 (I wouldn't call this a bug report, since zero information has been provided that would let anyone even understand what might be happening).

Maybe switching to kernel 4.8 with the sunxi-next branch as part of the 5.22 update (to fix Dirty COW) again caused an incompatibility with this swconfig tool, maybe it's something else. R1 users do not care about security that much, so they don't need security updates that urgently or at all.

So let's drop further support for this device and stop providing updates. Unhappy users can switch to Bananian, for example, since fewer upgrades are considered a feature and not a problem there, for sure.

golfromeo-fr commented 7 years ago

@ThomasKaiser TK, I agree with you.

Slowing support & stopping later, yes sure. I am waiting for the Turris Omnia (+NAS box) to arrive & to replace the R1 (I will be keeping R1 as a "smart" lan switch).

Idea:

  • legacy version: ?
  • next version: forcing a step back to kernel 4.4.x and letting the R1 die slowly.
  • dev version: mainline with a disclaimer for swconfig

zador-blood-stained commented 7 years ago

This device can still be used as anything other than a router, and the b53 switch driver is now part of the mainline kernel. So we don't have to drop updates for the device completely. Instead, building beta images once a month should be enough. If users are interested in this device, they will have to test the beta image, and if it is reported as working, a stable update should be made based on the tested version.

zador-blood-stained commented 7 years ago

next version: forcing a step back to kernel 4.4.x and letting the R1 die slowly.

This will turn our build script into a mess since sunxi-next kernel branch and sunxi-next set of patches are used for many other devices, and we can't split this without breaking backwards compatibility.

ThomasKaiser commented 7 years ago

I am waiting for the Turris Omnia

@golfromeo-fr I think it's not about you or your plans with R1 or any successor. It's about

But if there are no people willing to test prior to releases or unforeseen events like Dirty COW now then it's simply a matter of focusing on devices worth the effort and removing Lamobo R1 from list of automated builds.

golfromeo-fr commented 7 years ago

@ThomasKaiser @zador-blood-stained I understand and agree. Thanks. I've added some tasks for XU4 (incl. Zador remarks). I will try to help here also.

igorpecovnik commented 7 years ago

I would not change the current schema because of recent events.

ThomasKaiser commented 7 years ago

IMO we need at least one person 'feeling' responsible for devices like R1 that are

So someone caring about this device could've taken notice that there's a change with 4.8 and started early to develop a fix that can be included in Armbian long before the kernel update (in our case now needed due to Dirty COW) is rolled out.

I believe we've a lot of users capable of doing this amongst us (see https://github.com/igorpecovnik/lib/issues/514 for example) but IMO it's necessary to improve testing/release cycles and get them on board.

hknaack commented 7 years ago

I would not like to see support for this device dropped. Basically, it is just an A20 board with a b53 and some crappy rtl8192cu on the USB (which any other board could have, as well). Although I feel pretty busy on my other projects, if this is the price I have to pay to keep this device supported, then I volunteer to do some testing (hopefully just every now and then).

igorpecovnik commented 7 years ago

I was referring to not dropping the board.

IMO we need at least one person 'feeling' responsible for devices like R1 that are

Yes, that's the idea. Before we start the next testing period, we have to deal with roles. I would propose starting with a simple checklist. We can expand it later, along the way, once we have the system working. For now, only basic parameters. If we need something extra, we could add it to armbianmonitor. When we build images, they must (only) boot and be able to connect to the network. The rest is usually fixable with an update.

ThomasKaiser commented 7 years ago

I was referring to not dropping the board.

I know but this can only work with a changed release/test policy since otherwise we don't get support by users testing stuff prior to release.

As for automated tests I already thought about that in the past and maybe it can work like this:

Idea behind: When we want to do automated testing after image creation, all that's needed is another host in the network that registers itself through avahi-daemon as armbian-test.local and has iperf3 -s running. The mere existence of an armbian-test.local host can also be used to fire up a few more non-network tests on the board (e.g. running sysbench).

In case firstrun detects full test mode based on existence of this other host we could then use show_motd_warning functionality to inform user about test results. So in the end our testers have to download a new image, start it unattended, log in 10 minutes later and see whether everything's ok or not.
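The detection step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and the report format are made up here and are not existing Armbian code, and it presumes that .local names resolve on the freshly booted image (for example through nss-mdns). How the result would be handed to show_motd_warning is left out, since its interface isn't described in this thread.

```shell
#!/bin/sh
# Hypothetical sketch of a "full test mode" check for firstrun.
# TEST_HOST default and all function names are assumptions for illustration.

TEST_HOST="${TEST_HOST:-armbian-test.local}"

# Succeeds if the test host resolves. getent goes through NSS, so .local
# names are only found when an mDNS resolver such as nss-mdns is set up.
test_host_present() {
    getent hosts "$1" >/dev/null 2>&1
}

# Turn a test name and an exit code into a one-line report entry.
format_result() {
    if [ "$2" -eq 0 ]; then
        printf '%s: OK\n' "$1"
    else
        printf '%s: FAILED\n' "$1"
    fi
}

# Run the network test only when the test host exists, then print the report.
run_full_test() {
    if ! test_host_present "$TEST_HOST"; then
        echo "test mode: skipped (no $TEST_HOST)"
        return 0
    fi
    iperf3 -c "$TEST_HOST" -t 10 >/dev/null 2>&1
    format_result "iperf3 to $TEST_HOST" "$?"
}
```

Calling run_full_test at the end of firstrun would then either skip silently on a normal network or leave a one-line result that a tester can read after logging in.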

The list of tests can easily be extended/updated/adjusted as needed and even with boards like Orange Pi Lite or NanoPi Air we can do a fully automated test based on the assumption that we (or experienced testers) simply use another SBC (called armbian-test.local ;) ) with active AP providing a Wi-Fi network called ARMBIAN-$some-garbage-here where $some-garbage-here contains the passphrase in an obfuscated form. So this is just 2 lines using nmcli to establish an unattended Wi-Fi test setup.
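As a rough sketch of the unattended Wi-Fi setup: the obfuscation scheme isn't specified above, so plain string reversal is assumed here purely for illustration, and both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: join the test AP whose SSID is ARMBIAN-<obfuscated passphrase>.
# The obfuscation (plain string reversal) is an assumption, not an agreed scheme.

# Recover the passphrase from an SSID of the form ARMBIAN-<reversed passphrase>.
passphrase_from_ssid() {
    printf '%s\n' "${1#ARMBIAN-}" | rev
}

# Find the first visible ARMBIAN-* network and connect to it.
connect_test_wifi() {
    ssid=$(nmcli -t -f SSID device wifi list | grep -m1 '^ARMBIAN-') || return 1
    nmcli device wifi connect "$ssid" password "$(passphrase_from_ssid "$ssid")"
}
```

With a test AP announcing ARMBIAN-terces, connect_test_wifi would try the passphrase "secret". The two nmcli calls are the "2 lines" mentioned above; everything else is glue.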

And no, no idea about A33-OlinuXino (but I fear I have to admit that I've not the slightest idea what to do with it anyway -- zero connectivity to the outside isn't compatible with my use cases)

hknaack commented 7 years ago

I made another attempt to upgrade to 5.23 (this time from 5.20) and made sure to have swconfig installed, too. But that didn't work out. After realizing that no b53 driver was loaded, I modprobe'd b53_common and the corresponding mdio driver. This was the result in dmesg:

[ 411.023547] libphy: mdio_driver_register: bcm53xx
[ 411.024377] b53_common: found switch: BCM53125, rev 4
[ 411.024607] DSA: switch 0 0 parsed
[ 411.024620] DSA: tree 0 parsed
[ 411.169920] libphy: dsa slave smi: probed
[ 411.266978] Generic PHY dsa-0.0:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:00, irq=-1)
[ 411.366953] Generic PHY dsa-0.0:01: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:01, irq=-1)
[ 411.466962] Generic PHY dsa-0.0:02: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:02, irq=-1)
[ 411.567005] Generic PHY dsa-0.0:03: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:03, irq=-1)
[ 411.667091] Generic PHY dsa-0.0:04: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:04, irq=-1)
[ 411.675727] bcm53xx stmmac-0:1e: Configured port 8 for rgmii
[ 413.310845] Link is Up - 1000/Full

As a result, all switch ports got separated. swconfig list, however, did not find any switch. Digging further into the topic and DSA, it appears that swconfig won't be necessary any more (so there is no need to have such a dependency on lamobo kernel images), and configuration should be done with ip and brctl. As the dmesg output shows, PHY devices are created for each external ethernet port, with aliases lan1 to lan4 and wan for convenience. Furthermore, I was able to set links up or down using ifconfig wan up and ifconfig wan down (same for the other ports). There was also a dependency between those ports and eth0.101/eth0.102 in the sense that the corresponding eth0.10x had to be up before those ports could be configured. This is the dmesg log from my trial and error:

[ 2687.703430] IPv6: ADDRCONF(NETDEV_UP): wan: link is not ready
[ 2689.328289] bcm53xx stmmac-0:1e wan: Link is Down
[ 2691.408503] bcm53xx stmmac-0:1e wan: Link is Up - 100Mbps/Full - flow control off
[ 2691.408567] IPv6: ADDRCONF(NETDEV_CHANGE): wan: link becomes ready
[ 2693.488812] bcm53xx stmmac-0:1e wan: Link is Down
[ 2695.568792] bcm53xx stmmac-0:1e wan: Link is Up - 100Mbps/Full - flow control off
[ 2697.649028] bcm53xx stmmac-0:1e wan: Link is Down
[ 2698.689012] bcm53xx stmmac-0:1e wan: Link is Up - 100Mbps/Full - flow control off
[ 2700.769130] bcm53xx stmmac-0:1e wan: Link is Down
[ 2702.849329] bcm53xx stmmac-0:1e wan: Link is Up - 100Mbps/Full - flow control off
[ 2849.529885] IPv6: ADDRCONF(NETDEV_UP): lan1: link is not ready
[ 2850.619736] bcm53xx stmmac-0:1e lan1: Link is Down
[ 2851.659851] bcm53xx stmmac-0:1e lan1: Link is Up - 100Mbps/Full - flow control rx/tx
[ 2851.659915] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready
[ 2892.222626] bcm53xx stmmac-0:1e lan1: Link is Down
[ 2905.854285] IPv6: ADDRCONF(NETDEV_UP): lan2: link is not ready
[ 2907.423733] bcm53xx stmmac-0:1e lan2: Link is Down
[ 2908.463875] bcm53xx stmmac-0:1e lan2: Link is Up - 100Mbps/Full - flow control off
[ 2908.463940] IPv6: ADDRCONF(NETDEV_CHANGE): lan2: link becomes ready
[ 2921.984511] bcm53xx stmmac-0:1e lan2: Link is Down
[ 2926.780949] IPv6: ADDRCONF(NETDEV_UP): lan3: link is not ready
[ 2928.385226] bcm53xx stmmac-0:1e lan3: Link is Down
[ 2931.505480] bcm53xx stmmac-0:1e lan3: Link is Up - 1Gbps/Full - flow control rx/tx
[ 2931.505547] IPv6: ADDRCONF(NETDEV_CHANGE): lan3: link becomes ready
[ 2936.705536] bcm53xx stmmac-0:1e lan3: Link is Down
[ 2941.078079] IPv6: ADDRCONF(NETDEV_UP): lan4: link is not ready
[ 2943.026239] bcm53xx stmmac-0:1e lan4: Link is Down
[ 2944.066393] bcm53xx stmmac-0:1e lan4: Link is Up - 100Mbps/Full - flow control off
[ 2944.066460] IPv6: ADDRCONF(NETDEV_CHANGE): lan4: link becomes ready
[ 2989.829301] bcm53xx stmmac-0:1e lan4: Link is Down
[ 3012.533573] br0: port 2(wlan0) entered disabled state
[ 3012.533741] br0: port 1(eth0.102) entered disabled state
[ 3035.469970] IPv6: ADDRCONF(NETDEV_UP): lan1: link is not ready
[ 3036.792867] bcm53xx stmmac-0:1e lan1: Link is Down
[ 3038.873087] bcm53xx stmmac-0:1e lan1: Link is Up - 100Mbps/Full - flow control rx/tx
[ 3038.873154] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready
[ 3199.044075] bcm53xx stmmac-0:1e lan1: Link is Down
[ 3210.702451] IPv6: ADDRCONF(NETDEV_UP): lan2: link is not ready
[ 3212.165292] bcm53xx stmmac-0:1e lan2: Link is Down
[ 3214.245499] bcm53xx stmmac-0:1e lan2: Link is Up - 100Mbps/Full - flow control off
[ 3214.245563] IPv6: ADDRCONF(NETDEV_CHANGE): lan2: link becomes ready
[ 3264.168700] bcm53xx stmmac-0:1e lan2: Link is Down
[ 3268.699627] IPv6: ADDRCONF(NETDEV_UP): lan3: link is not ready
[ 3272.649638] bcm53xx stmmac-0:1e lan3: Link is Up - 1Gbps/Full - flow control rx/tx
[ 3272.649705] IPv6: ADDRCONF(NETDEV_CHANGE): lan3: link becomes ready
[ 3304.891576] bcm53xx stmmac-0:1e lan3: Link is Down
[ 3309.532806] IPv6: ADDRCONF(NETDEV_UP): lan4: link is not ready
[ 3311.212363] bcm53xx stmmac-0:1e lan4: Link is Up - 100Mbps/Full - flow control off
[ 3311.212428] IPv6: ADDRCONF(NETDEV_CHANGE): lan4: link becomes ready

However, although I assigned the wan and lan aliases a valid IP, every ping to my devices on the corresponding port failed. After that, I went back to 5.20 and tried to gather further information. One good explanation of the DSA concept can be found at http://trac.gateworks.com/wiki/linux/vlan#LinuxDistributedSwitchArchitecture. Other than that, I will look into the documentation of ip and brctl to see what useful features they hide under the hood. Until we have all the issues figured out, would there be a problem with continuing to build the old openwrt driver for the b53 with swconfig? That would make testing less painful, since it would just involve loading and unloading kernel modules to test some configurations.

ThomasKaiser commented 7 years ago

Unless there is a testing branch and appropriate mechanisms established and volunteers are known who care for this weird switchboard (it isn't and never will be a routerboard) I still think the best thing is to drop support (implement checks in package upgrading scripts and skip the device so it remains at 4.7 until someone cares about the switchboard again).
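The check suggested above (skipping the device in the package upgrade scripts) might look something like this. It is a minimal sketch, not existing Armbian code: it assumes the board name can be read from a BOARD= line in a release file such as /etc/armbian-release, and the function name and hold list are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-upgrade guard. The release file location and the BOARD=
# key are assumptions about how the installed image identifies its board.

# Succeeds (returns 0) when the release file names a board on the hold list.
# Usage: board_is_held RELEASE_FILE BOARD [BOARD...]
board_is_held() {
    release_file="$1"; shift
    board=$(sed -n 's/^BOARD=//p' "$release_file" 2>/dev/null | tr -d '"')
    for held in "$@"; do
        [ "$board" = "$held" ] && return 0
    done
    return 1
}
```

A kernel package's pre-install script could then call `board_is_held /etc/armbian-release lamobo-r1` and bail out early, leaving the device on its current kernel until someone takes care of it again.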

golfromeo-fr commented 7 years ago

swconfig is probably no longer needed, maybe "bridge" instead: http://lwn.net/Articles/634787/ Just an idea, we have experts here

kotc commented 7 years ago

i've tried to tackle this issue too, but without much success. theoretically it should 'just work', practically it doesnt. script i've used:

    ip link set eth0 up
    brctl addbr lan
    for i in lan1 lan2 lan3 lan4; do
        ip addr flush $i
        ip link set $i up
        brctl addif lan $i
    done
    ip addr add 192.168.1.1/24 brd 192.168.1.255 dev lan
    ip link set lan up

and while apparently switch ports regained forwarding capability, i cant connect to/from the lamobo. something makes the packets get dropped when they are originated from the device (they show on eth0, but not on any of the lanX interfaces or lan bridge). let's see if the driver author/maintainer responds.

kotc commented 7 years ago

more funnies, leave switch unconfigured, assign local lan ip to eth0, set /proc/sys/net/ipv4/conf/eth0/{forwarding,proxy_arp} to 1, and voila! switch acts like a dumb hub where everything sees everything (even with downed lanX ports o.o). while this might not be the wanted case, if one only needs this device for local lan functionality with mainline, its nice, in a dumb way of nice

ThomasKaiser commented 7 years ago

switch acts like a dumb hub where everything sees everything

This is the only mode this device should be operated in, since U20 (an EEPROM to save switch state, so the dumb switch could be brought up in a way where not each and every port is interconnected at layer 2) isn't populated on this crappy device.

Still: best idea would be to drop support entirely since currently we help users actively fooling themselves since people don't want to understand that this is not a routerboard but just a dumb switchboard.

BTW: When using the switchboard as such is GbE performance still crappy or normal A20 level (exceeding 300 Mbits/sec)?

kotc commented 7 years ago

@ThomasKaiser otoh, would be interesting if adding this eeprom could make it remember the setting

kotc commented 7 years ago

as for the speed, serving file from /tmp:

/dev/null 100%[====================================>] 230.00M 52.9MB/s in 4.5s
2016-11-02 17:40:37 (51.1 MB/s) - '/dev/null' saved [241172480/241172480]

(receiver is bpi-m1)

zador-blood-stained commented 7 years ago

Still: best idea would be to drop support entirely since currently we help users actively fooling themselves since people don't want to understand that this is not a routerboard but just a dumb switchboard.

Still, the best idea IMO is to write on the download page that network features are supported only on the legacy kernel due to changes in mainline, at least until we can provide basic network connectivity out of the box with 4.8+ kernels. Or temporarily add the old swconfig-compatible driver to the next branch and leave the dev branch for DSA tests.

kotc commented 7 years ago

previous result was with bpi-m1 as a receiver, this one is using thinkpad t500:

/dev/null 100%[==================================>] 230.00M 62.4MB/s in 3.8s
2016-11-02 17:43:54 (59.9 MB/s) - '/dev/null' saved [241172480/241172480]

kotc commented 7 years ago

and this one is when serving the file from bpi-m1 to thinkpad t500:

/dev/null 100%[==================================>] 240.00M 25.5MB/s in 9.6s
2016-11-02 17:49:49 (25.1 MB/s) - '/dev/null' saved [251658240/251658240]
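To compare the three wget averages with the 300 Mbits/sec figure asked about earlier, they can be converted as below. This assumes wget's usual 1024-based units, which matches the numbers above (230.00M corresponds to 241172480 bytes); the function name is made up for this sketch.

```shell
#!/bin/sh
# Convert a wget rate given in MB/s (wget counts 1 MB as 1048576 bytes)
# into Mbit/s (10^6 bits per second), the unit iperf3 reports.
wget_rate_to_mbit() {
    LC_ALL=C awk -v r="$1" 'BEGIN { printf "%.1f\n", r * 1048576 * 8 / 1000000 }'
}

# The three average rates reported above.
for rate in 51.1 59.9 25.1; do
    printf '%s MB/s = %s Mbit/s\n' "$rate" "$(wget_rate_to_mbit "$rate")"
done
```

That gives roughly 428.7, 502.5 and 210.6 Mbit/s respectively. These are HTTP download rates, so they are only a rough stand-in for a proper iperf3 measurement.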

ThomasKaiser commented 7 years ago

Still, the best idea IMO is to write on the download page that network features are supported only on legacy kernel due to changes to mainline.

Sure, it's simply adjusting https://github.com/igorpecovnik/lib.docs/blob/master/docs/boards/lamobo-r1.md so please feel free to add this in big red letters.

Still, people will run into upgrade troubles, and unless someone with an R1 who feels responsible for the device joins the development/testing efforts, this will repeat the next time a kernel upgrade happens.

@kotc: Thanks for the numbers (though I would've preferred real measurements using iperf3 instead). Yesterday, at a customer, I rebooted the one R1 I unfortunately bought last year (still running 3.4.108, uptime 199 days) and might switch to 4.8 immediately (since only switch mode is needed, and with 4.8 performance currently doesn't seem to suck as much as it did with this b53 stuff on both legacy and vanilla kernels before)

kotc commented 7 years ago

@ThomasKaiser but remember, this config is somewhat hacky/invalid, also, why not 4.9? also, it's proxy_arp_pvlan, not proxy_arp
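Putting the earlier description together with this correction, the "dumb hub" workaround could be sketched as follows. The function names are hypothetical, the switch itself is deliberately left unconfigured, and the apply step needs root; treat it as a hack to experiment with rather than a supported setup.

```shell
#!/bin/sh
# Hypothetical sketch of the "dumb hub" workaround described in this thread.

# List the /proc settings for one interface as "path value" lines.
hub_mode_settings() {
    for knob in forwarding proxy_arp_pvlan; do
        printf '/proc/sys/net/ipv4/conf/%s/%s 1\n' "$1" "$knob"
    done
}

# Write each value to its path (requires root).
apply_hub_mode() {
    hub_mode_settings "$1" | while read -r path value; do
        echo "$value" > "$path"
    done
}
```

After assigning the local LAN IP to eth0, `apply_hub_mode eth0` would set both knobs; `hub_mode_settings eth0` on its own just prints what would be written.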

zador-blood-stained commented 7 years ago

@kotc New DSA based driver was merged in 4.8, before that we had the old one from OpenWRT (swconfig-compatible)

kotc commented 7 years ago

@zador-blood-stained just wanted to know why sticking to 4.8 when 4.9 is almost baked. @ThomasKaiser if you provide me the command lines i can do them (both hosts running armbian so should be no problem)
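For reference, a typical pair of iperf3 measurements might look like the sketch below. The host name, the 10 second duration and the DRY_RUN switch are assumptions made for illustration, not something agreed in this thread.

```shell
#!/bin/sh
# On the server side (e.g. the R1), start a listener first:
#   iperf3 -s
# Then measure both directions from the client.

# Print (DRY_RUN=1) or run the two client-side measurements against host $1.
measure_both_directions() {
    # -R reverses the direction, so the server sends and the client receives.
    for args in "-c $1 -t 10" "-c $1 -t 10 -R"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "iperf3 $args"
        else
            # Unquoted on purpose so the argument string is split into words.
            iperf3 $args
        fi
    done
}
```

Running `DRY_RUN=1 measure_both_directions bpi-r1` only prints the two commands, which is handy for pasting them into a reply.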

zador-blood-stained commented 7 years ago

@kotc -rc3 means it's only around half-way through, usually there will be 6-8 release candidates. And since Armbian's "next" branch points at stable releases, linux-sunxi-next in the repository will be 4.8.x for a while.

igorpecovnik commented 7 years ago

Or temporarily add the old swconfig-compatible driver to the next branch and leave the dev branch for DSA tests.

Let's do this if it's not too complicated? Adding the patch back in? Anything else? Even if we write things on the download page ... people usually ask first rather than read when things fail.

ThomasKaiser commented 7 years ago

Even we write things on download page

Why should people look at the download page if they do an apt upgrade? Seriously: the only way to prevent such upgrade hassles is to either drop support for the board or change release/testing policies and search for at least one person able/willing to test through new releases prior to pushing them out.

igorpecovnik commented 7 years ago

OK, if / when we choose to stop dealing with a board we should at least leave it in a working state. That means our last update should remove sources list or similar?

ThomasKaiser commented 7 years ago

Even if I started this here (playing my provocative role as usual) I would prefer to be able to announce stuff like that since this might be the event getting users/testers/volunteers on board. But still this won't work if we don't improve regarding #512

If I understood correctly, @hknaack is both willing and able to test through stuff (me unfortunately not, the R1 is serving in production use), so if it's possible to use the old drivers with 4.8 then this might be the best short-term solution, as @zador-blood-stained suggested.

kotc commented 7 years ago

bpi-r1 as server, -m1 as client, both set to fixed 912MHz.

[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.01 sec 792 MBytes 664 Mbits/sec 0 sender
[ 4] 0.00-10.01 sec 792 MBytes 664 Mbits/sec receiver

reverse:

[ 4] 0.00-10.00 sec 321 MBytes 269 Mbits/sec 6194 sender
[ 4] 0.00-10.00 sec 321 MBytes 269 Mbits/sec receiver

zador-blood-stained commented 7 years ago

Let's do this if it's not too complicated? Addin patch back in?

Done: 077a3dd. At least compilation still works.

zador-blood-stained commented 7 years ago

OK, if / when we choose to stop dealing with a board we should at least leave it in a working state. That means our last update should remove sources list or similar?

We can also put up a one time MOTD warning (in board support package) like we do with slow SD cards and docs.armbian.com link at first boot. Won't work for very old releases where board support packages don't update properly, but for newer ones it's a simple enough option.
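A one-time notice of the kind described above could work along these lines. This is a minimal sketch only: the flag file path and the wording are assumptions, and it does not reflect how the existing slow SD card warning is actually implemented.

```shell
#!/bin/sh
# Hypothetical one-time notice helper. Prints the message on the first call
# and stays silent once the flag file exists.
# Usage: show_once FLAG_FILE MESSAGE...
show_once() {
    flag="$1"; shift
    [ -e "$flag" ] && return 0
    printf '%s\n' "$*"
    : > "$flag"
}
```

For example, a login script shipped in the board support package could call `show_once /var/lib/armbian-eol-notice-shown "Support for this board has ended, see docs.armbian.com"`, where the flag path is again just a placeholder.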

igorpecovnik commented 7 years ago

@hknaack @kotc (possibly fixed) Lamobo R1 image NEXT will be among the daily test builds - available in a few hours from now, build 161102.

@zador-blood-stained Tnx for updating patch. Perhaps we can define a date when support ends in a board config?

@ThomasKaiser I'll move on that discussion asap.

zador-blood-stained commented 7 years ago

Perhaps we can define a date when support ends in a board config?

I don't think this is needed yet.

If you want you can modify download pages for boards that are not tested properly to include clearly visible "Limited support" notice and "Testers wanted" link to forum thread or section related to beta images or test requests/reports.

kotc commented 7 years ago

@zador-blood-stained have you checked if it works? i'm either missing some patches, dts definitions or 4.9 is incompatible with that patch. it's getting compiled but swconfig can't find the switch

zador-blood-stained commented 7 years ago

@kotc

Don't have this board to test, sorry. Looks like I missed DT patch, will try to add it ASAP.

kotc commented 7 years ago

@zador-blood-stained if you drop me kernel+dts+dtb+modules tgz i can check if it boots/works here

zador-blood-stained commented 7 years ago

@kotc

https://www.dropbox.com/sh/md7cfrdow6xmlxq/AACG8LAowlQNhoA817v8kC5wa?dl=0

You should be able to extract necessary files from the packages or just test them on top of an Armbian image for R1 with mainline kernel.

kotc commented 7 years ago

noope. swconfig doesnt work, might it be something changed regarding swconfig interface?

kotc commented 7 years ago

hmm. either my switch went bananas from all those experiments or something is fishy. i've just tried kernel/dtb/modules from armbian 5.17 (kernel 4.6.2-sunxi) and legacy kernel 3.4.112-sun7i. same output. i hope i didnt kill it ;)

zador-blood-stained commented 7 years ago

What swconfig did you use? Please try this one: http://apt.armbian.com/pool/utils/s/swconfig/swconfig_15.04-2~armbian5.23%2B1_armhf.deb

kotc commented 7 years ago

ahm. i'm stupid. i was missing ifconfig eth0 up ;) it showed on 4.6.8 armbian kernel. not on yours though. anyway, bed time, gotta check it tomorrow in the morning

zador-blood-stained commented 7 years ago

I rebuilt the kernel to switch b53 from module to built-in, since the old driver doesn't have a DT compatible property for auto probing.

hknaack commented 7 years ago

I contacted Florian Fainelli, the submitter of the new b53 driver, about how to configure a router (bridged lan1-4 with wlan, separate wan). This is his response:

Sure, so there are a few caveats since we implement DSA without a tagging protocol, but basically, what you would want to do is this:

    # Create a bridge which is mandatory for Bridge VLAN filtering to work
    brctl addbr br-lan
    brctl addif br-lan lan0
    brctl addif br-lan lan1
    brctl addif br-lan lan2
    brctl addif br-lan lan3
    brctl addif br-lan wan

Once there, you can configure different VLANs on the LAN interfaces, the default VLAN ID is 1, unless configured otherwise:

    for lan in $(seq 0 3)
    do
        bridge vlan add vid 2 dev lan$lan
        bridge vlan del vid 1 dev lan$lan pvid untagged
    done

    vconfig add eth0 2

and you should have now eth0 receive packets from "wan" by default, and eth0.2 receive all the LAN traffic

There are currently two limitations with DSA and B53 that I plan on addressing:

  • the bridge master device: br-lan is actually our view of the switch's CPU port, but it cannot be configured in a way that the CPU would receive only tagged traffic (thus requiring eth0.1 and eth0.2)
  • since we do not support Broadcom tags on b53, we cannot segregate traffic from "wan" and "lan0-3" other than by putting them in separate VLANs, but once Broadcom tags are in place "wan" alone can be used, and ports would be properly separated

kotc commented 7 years ago

@hknaack nice. but this sequence of commands doesnt do the trick. it might be missing something trivial (as setting some interface up with ip etc). but at least he is willing to communicate, thanks!

kotc commented 7 years ago

@hknaack also, thank you very much for that dts patch! now my banana works with mainline (4.9.0-rc3) and as a pseudo router! good riddance to that murky 3.4 kernel, thanks again :)

kotc commented 7 years ago

uh, of course in the last comment thanks go to @zador-blood-stained :)

zador-blood-stained commented 7 years ago

@kotc Thanks for testing 😄

hknaack commented 7 years ago

@igorpecovnik I tested this build <1> (extracted the filesystem image and put it on a separate partition on my SD card, then issued the appropriate boot commands in the uboot console). It was missing swconfig in the first place, so the switch didn't work out of the box. After copying swconfig and my /etc/network/interfaces from my main partition, basic networking worked (apt-get, ping). Is there anything else you would like me to check?

<1>http://image.armbian.com/betaimages/Armbian_5.24.161104_Lamobo-r1_Ubuntu_xenial_4.8.6.7z