Quallenauge / Easybox-904-XDSL

Fork of openwrt with vendor specific changes from open sourced firmware 3.10.
GNU General Public License v2.0

Consolidate target image building #12

Closed arnysch closed 5 years ago

arnysch commented 5 years ago

With only small changes, the standard target/linux/lantiq/image/Makefile is sufficient. No eb904.mk is needed.

Also abandon building an "ubinized" image ("squashfs-rootfs-ubinized.bin"), because sysupgrade knows how to create ubi volumes from the kernel and rootfs parts within a squashfs-sysupgrade.bin file.

Quallenauge commented 5 years ago

Cool! Are you able to re-create a fullimage.img with this, which can be used for recovery?

arnysch commented 5 years ago

Your question makes me ask:

BTW I just realized that fullimage.img as created by code from this pull request still tries to use a UBI rootfs, and it fails for me. This is corrected in my uploaded images, which use mtd1 ("rootfs") instead of UBI and do not handle bootnum/bootid.


Quallenauge commented 5 years ago

I mean, the first option for most users is to flash a recovery image, which lets you flash the real openwrt image. This first fullimage.img has to be of one specific format to be accepted by the stock uboot. I created a special tag ( https://github.com/Quallenauge/Easybox-904-XDSL/releases/tag/Easybox-904-XDSL-Recovery-1-0-1 ) which includes some changes to produce this kind of image. Aside from that, I don't have any preferences regarding the format of the image; if it can be read by openwrt installations I'm fine with it. -> If we find a way to handle both cases, that would be cool!

I think the existing UBI rootfs is more streamlined, isn't it?

arnysch commented 5 years ago

Adjusted settings so partition mtd12 is used for UBI and for a rootfs ubi volume instead of mtd1.

A working fullimage.img == initramfs is placed here. (of course built using the new way, i.e. without using target/linux/lantiq/image/eb904.mk)

fullimage.img / initramfs content: LuCI stuff is included in order to provide a sysupgrade frontend; USB storage, VFAT and F2FS are included. DSL/ppp/vpe stuff has been omitted in order to keep the size smaller than 0x500000 bytes.

If the eb904 contains the original vendor U-Boot or an older uboot-lantiq-easybox904xdsl, the initramfs can be flashed to mtd2 via the reset-press-during-power-on recovery procedure, and be booted afterwards. This works because the initramfs now always contains a small dummy rootfs at the end.

If the eb904 contains my recent uboot-lantiq-easybox904xdsl, the initramfs can be loaded and directly started by the reset-press-during-power-on recovery procedure. Nothing is flashed.

NAND sub-pages are enabled because, according to my tests and according to nandsubpagetest from the mtd utils, this works ok. We should switch to sub-pages sooner rather than later.

If an old UBI partition exists (VID header offset 2048 instead of 512), the kernel will log ugly messages when trying to attach it. This is no problem for an initramfs system, i.e. when the current system does not use a rootfs on the UBI partition.
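The two VID header offsets mentioned above follow from how UBI places its headers. A minimal sketch of that arithmetic, assuming a 128 KiB erase block and a 2048 byte page for this NAND, and 64 byte UBI headers as in mainline UBI; the function name `ubi_layout` is made up for illustration:

```python
UBI_HDR_SIZE = 64  # assumption: size of the EC and VID headers in mainline UBI


def align_up(value, unit):
    """Round value up to the next multiple of unit."""
    return -(-value // unit) * unit


def ubi_layout(peb_size, page_size, subpage_size):
    """Return (vid_hdr_offset, data_offset, leb_size) for UBI's default layout."""
    # The EC header occupies the first write unit; the VID header follows it.
    vid_hdr_offset = align_up(UBI_HDR_SIZE, subpage_size)
    # Payload data starts at the next full page boundary after the VID header.
    data_offset = align_up(vid_hdr_offset + UBI_HDR_SIZE, page_size)
    return vid_hdr_offset, data_offset, peb_size - data_offset


PEB_SIZE = 128 * 1024  # assumption: 128 KiB erase blocks
PAGE_SIZE = 2048       # assumption: 2048 byte NAND page

print(ubi_layout(PEB_SIZE, PAGE_SIZE, 2048))  # old layout, no sub-pages
print(ubi_layout(PEB_SIZE, PAGE_SIZE, 512))   # new layout, 512 byte sub-pages
```

Under these assumptions, sub-pages move the VID header from offset 2048 to 512 and leave one more page per erase block for data, which is why an old partition cannot be attached with the new settings without reformatting.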

When sysupgrading and there exists a partition called 'ubi', sysupgrade will install a rootfs volume in the ubi partition. If necessary, sysupgrade will reformat the ubi partition for a sub-page size of 512 bytes.

On my eb904, I have tested the sysupgrade process, with both a pre-existing old (non sub-page) UBI partition and also a new (512 byte sub-page) UBI partition.

Quallenauge commented 5 years ago

Wow! That's impressive. Thanks for developing and testing the whole thing and also the corner cases! The changes look good, so I think if either I or someone else can test this, I would like to merge it, if you don't mind.

arnysch commented 5 years ago

Yes, getting merged in your repository is my intention, as indicated by a pull request.

But please still wait for a moment. We need to think again, because: All my tests were done with small systems (ca 5MB). Yesterday afternoon I incorporated a lot of other stuff (DSL/ppp, LCD, tools), so size increased to 10MB. Sysupgrading seemed to work flawlessly. But: I get NAND errors when reading certain blocks of the flashed rootfs.

Running the initramfs-version of this bigger system works perfectly. I used this ram-only version for further tests, described below. Thanks to my U-Boot changes it can easily be started via reset-button-press-during-power-on :-).

Retried flashing -> Still problems. ubiformat-reformatted mtd12, manually created rootfs volume -> Still problems

Is it the 512B sub-page which does not work?

I changed back to 2048 bytes (changed boot param, /lib/upgrade/platform.sh), rebuilt. ubiformat-ted mtd12 with -s 2048, manually created rootfs -> Still problems.

Erased mtd12 with U-Boot (well, I know, UBI erase counters get lost when I do this), ubiformat-ted mtd12 with -s 2048, manually created rootfs -> Still problems.

Aaargh! Big frustration

Observations:

It looks like writing is not safe. A verify-after-write seems to be necessary. Isn't UBI doing this? Looks like it isn't.

Error is reported when:

Typical error msg from UBI in syslog when running dd if=/dev/ubi0_0 of=/dev/null

ubi0 error: ubi_io_read: error -77 (ECC error) while reading 61440 bytes from PEB 521:69632, read 61440 bytes

Typical error message written to stderr by nanddump /dev/mtd12 >/dev/null

ECC: 6 uncorrectable bitflip(s) at offset 0x04131000
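The two messages above appear to point at the same spot in flash. A small sketch of the arithmetic, assuming 128 KiB erase blocks on mtd12 and that UBI's `PEB n:offset` notation counts bytes from the start of the PEB; the helper name is hypothetical:

```python
PEB_SIZE = 128 * 1024  # assumption: 128 KiB erase blocks on mtd12


def peb_to_mtd_offset(peb, offset_in_peb, peb_size=PEB_SIZE):
    """Translate UBI's 'PEB n:offset' into a byte offset within the MTD partition."""
    return peb * peb_size + offset_in_peb


# Values taken from the UBI error message: PEB 521:69632, 61440 bytes read.
start = peb_to_mtd_offset(521, 69632)
print(hex(start))                  # 0x4131000, the offset nanddump complains about
print(69632 + 61440 == PEB_SIZE)   # True: the failed read runs to the end of the PEB
```

So the UBI read error and the nanddump ECC error would describe the same page, which suggests the corruption is in the flash contents themselves rather than in the UBI layer.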

How to handle this? Don't know yet. More questions:

kovz commented 5 years ago

@arnysch Based on the datasheet for the K9F4G08U0D, I think there is no sub-page support in hardware, because the minimal read I/O size is a page. Then I found this discussion, which confirmed my assumption. Maybe there are different HW versions with different NAND chips, or the UBI test for sub-pages doesn't work correctly?

arnysch commented 5 years ago

Hi kovz, the data sheet you refer to is about a "D-die" chip and says in sections 2.8 and 5.2: "it does allow multiple partial page programming". Up to 4 partial writes (which must be consecutive, i.e. starting with lower addresses) are allowed. I assume that this is the feature used for sub-paging, as suggested here. The discussion you refer to is about the K9F1G08U0E ("Samsung E-die"), which indeed allows only one partial program cycle, as revealed in section 2.8.

Anyway, just to be sure, for my later tests I disabled subpages. Erased the partition and re-ubiformatted. Still get these stupid ECC errors.

Either my flash is worn out (but I don't think it was written very often) or kaputt right from the beginning. Or might it be s.th. else?

Would be good to know if anyone else has/has not the same problems with mtd12 on the eb904.

Quallenauge commented 5 years ago

I had some problems with the ubi some months ago. I thought it was the result of some random reboots I had in the past. Booting was not possible anymore due to weird ubi errors. After that I played around with nandtest, flash_erase and so on (sadly I'm not able to remember correctly what I did :-/ ). Only after that, it seems, I was able to reflash the firmware, which re-creates the ubi partition.

Are there some bad blocks visible?

kovz commented 5 years ago

Ok. Got it. I will try to do the same tests. Probably will understand more about UBI and somehow help you.

arnysch commented 5 years ago

@Qauge: "Are there some bad blocks visible?": within mtd12 there is one PEB identified as bad. It is listed in the BBT. This is PEB no. 1462 (address 0xB6C0000 = 1462 x 1024 x 128). Looks like this is not a problem; the PEB is simply ignored/skipped/not used. After erasing, nanddump finds no errors. Only after ubiformatting does nanddump report errors. They do not relate to the bad block mentioned above, as seen in my log. Repeated usage of nanddump reports the same error locations. When erasing and re-ubiformatting, there are errors again, but in other places. So I think the problem must occur while writing to the flash.

@kovz: Would be great to hear about your test results.
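For checking whether a reported error offset falls into the known bad block, the reverse mapping from offset to PEB number is handy. A rough sketch, assuming the 128 KiB PEB size implied by the address calculation above and using the nanddump offset quoted earlier in this thread as sample input; the names are hypothetical:

```python
PEB_SIZE = 128 * 1024   # 1024 x 128 bytes per PEB, as in the address calculation
BAD_PEBS = {1462}       # the one PEB listed in the BBT for mtd12


def mtd_offset_to_peb(offset, peb_size=PEB_SIZE):
    """Split a byte offset within the partition into (PEB number, offset inside PEB)."""
    return divmod(offset, peb_size)


print(hex(1462 * PEB_SIZE))  # 0xb6c0000, start address of the bad PEB

peb, inside = mtd_offset_to_peb(0x04131000)
print(peb, inside, peb in BAD_PEBS)  # 521 69632 False
```

The sample offset lands in PEB 521, far away from PEB 1462, which fits the observation that the ECC errors are unrelated to the bad block.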

arnysch commented 5 years ago

FYI: I ran many more tests. I took control at the failsafe point during startup, removed USB related drivers, ensured "ifconfig eth0 down". Could not ssh to the box, used only the console this time. Transferred a rootfs image via z-modem to test UBI flashing. Result: all NAND and UBI operations work ok; after ubiformat and ubiupdatevol, "nanddump /dev/mtd12 >/dev/null" does not show any errors! (I had to create UBI device nodes manually; I guess because some sys process was not running. "ubinfo -a" shows which dev nodes are needed, so this is no problem.) So my NAND is ok, there is s.th. else which is interfering...

arnysch commented 5 years ago

FYI: the culprit is ... Tada! Most probably it is PACKAGE_kmod-fbtft-eb904, the LCD kernel driver. It uses the EBU bus without proper locking. It is not some network driver stuff (my 1st guess), nor SMP (2nd assumption). When using the LCD without SMP, all works ok. I guess this is because EBU usage is then serial: either NAND or LCD, but not both at the same time. Going to have a look at fb_ili9341_eb904.c. Maybe using spinlocking helps...
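To illustrate the idea of serializing bus access, here is a user-space toy model, not driver code. It assumes a bus that can serve only one device per transfer and two users ("nand" and "lcd") running concurrently; the class and all names are invented for this sketch and do not come from fb_ili9341_eb904.c or the kernel:

```python
import threading


class SharedBus:
    """Toy model of a bus that must serve exactly one device per transfer."""

    def __init__(self):
        self.lock = threading.Lock()  # plays the role of the missing bus lock
        self.selected = None          # device currently driving the bus
        self.transfers = 0
        self.corrupted = 0            # words seen while another device was selected

    def transfer(self, device, words):
        # Holding the lock for the whole transfer keeps the other user off the bus.
        with self.lock:
            self.selected = device
            for _ in range(words):
                if self.selected != device:
                    self.corrupted += 1
            self.transfers += 1


def worker(bus, device, rounds):
    for _ in range(rounds):
        bus.transfer(device, words=64)


bus = SharedBus()
threads = [threading.Thread(target=worker, args=(bus, name, 200))
           for name in ("nand", "lcd")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(bus.transfers, bus.corrupted)  # 400 0
```

With the lock held across each transfer, no word is ever observed while the other device is selected. In a real driver, the equivalent would be a lock shared by every user of the EBU, which is the direction the spinlocking remark points to.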

Quallenauge commented 5 years ago

Damn.... :-/ I never thought about that.... Most of the code comes from the uboot source code drop. The display data is constantly written to a magic memory address I kanged from the uboot sources. I have no idea what to lock at this point; I'm curious about possible solutions. I also have to admit that I don't have much knowledge in this area :-/

Quallenauge commented 5 years ago

Another/Additional idea: Maybe the initialization process isn't correct either, and the newest patches from kovz do the job much better?!

arnysch commented 5 years ago

Unfortunately I don't know details about addressing and mapping and ebu stuff. I am only reasoning:

kovz commented 5 years ago

Some time ago, on the old openwrt forum, I posted a link to a datasheet with the EBU functional and register description. Will try to find this document in the evening.

kovz commented 5 years ago

"newest patches from kovz": I understand you mean his pull request #14. These contain config/dts changes, no additional code, don't they?

This new functionality requires changes in both repositories. There are dts and config changes in this repository, and additional changes in touchkeypad. But these changes are only about the gpio extender; nothing was done with EBU.

Quallenauge commented 5 years ago

Ah my fault... I mixed up things :-/

Quallenauge commented 5 years ago

@arnysch : Any objections to merging this one?

arnysch commented 5 years ago

Heck, not at all, of course. (luckily no liquor here at the moment, otherwise I would get drunk celebrating these merges)