kempniu / yafut

Yet Another File UTility
GNU General Public License v2.0

ath79 NOR yaffs cannot mount #4

Closed john-tho closed 1 year ago

john-tho commented 1 year ago

Low / no priority. Ticket more for my notes than a request to change anything. I will dig in later and see what I need to change to fix it.

In the spirit of expanding use, I tried yafut on an ath79 NOR device. I noticed two issues:

yafut -v -d /dev/mtd8 -r -i bootimage -o /tmp/bootimage
mtd.c:359: init_yaffs_dev: /dev/mtd8: type=03, flags=00000c00, size=002c0000, erasesize=00010000, writesize=00000001, oobsize=00000000
mtd.c:497: mtd_mount: unable to mount Yaffs filesystem: error -12 (Out of memory)
copy.c:269: copy_file: unable to mount MTD: error -12 (Out of memory)

YAFFS_TRACE_MASK=0xffffffff
yaffs: yaffs: yaffs_ll_init()
yaffs: NAND geometry problems: chunk size 1, type is yaffs, inband_tags 0

root@OpenWrt:~# hexdump -C -n $((0x30)) /dev/mtd8
00000000  00 00 00 01 00 00 00 01  ff ff 62 6f 6f 74 69 6d  |..........bootim|
00000010  61 67 65 00 00 00 00 00  00 00 00 00 00 00 00 00  |age.............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030
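The first bytes of the partition look like a Yaffs object header stored in big-endian order. Here is a minimal sketch that decodes them, assuming the standard `yaffs_obj_hdr` layout (32-bit object type, 32-bit parent object ID, 16-bit unused checksum, then a NUL-terminated name). The helper name `decode_obj_header` is made up for illustration and is not part of yafut.

```python
import struct

# First 32 bytes of /dev/mtd8, copied from the hexdump above.
HEADER = bytes.fromhex(
    "00000001"              # object type
    "00000001"              # parent object ID
    "ffff"                  # unused checksum field
    "626f6f74696d61676500"  # "bootimage" plus NUL terminator
) + bytes(12)


def decode_obj_header(raw: bytes) -> dict:
    """Decode the leading fields of a big-endian Yaffs object header (assumed layout)."""
    obj_type, parent_id, _checksum = struct.unpack_from(">IIH", raw, 0)
    # The name starts right after the 10 fixed bytes and ends at the first NUL.
    name = raw[10:].split(b"\x00", 1)[0].decode("ascii")
    return {"type": obj_type, "parent_obj_id": parent_id, "name": name}


header = decode_obj_header(HEADER)
print(header)
```

In upstream Yaffs, object type 1 is a regular file and object ID 1 is the root directory, so this reads as a file named `bootimage` in the root directory, which is consistent with the `-i bootimage` argument used above.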

root@OpenWrt:~# mtd_debug info /dev/mtd8
mtd.type = MTD_NORFLASH
mtd.flags = MTD_CAP_NORFLASH
mtd.size = 2883584 (2M)
mtd.erasesize = 65536 (64K)
mtd.writesize = 1 
mtd.oobsize = 0 
regions = 0

cat /sys/class/mtd/mtd8/erasesize_minor 
4096

free
              total        used        free      shared  buff/cache   available
Mem:          56144       17972       22352       14760       15820       10596
Swap:             0           0           0

kempniu commented 1 year ago

AFAICT, supporting NOR flash should be possible, but it would require some work because MEMREAD/MEMWRITE ioctls do not support MTDs without OOB regions. These ioctls would have to be replaced with pread()/pwrite() calls - which is simple enough, but the tricky part here is figuring out what Yaffs file system parameters to use as they can be pretty arbitrary given that NOR flash is byte-addressable.
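To illustrate the pread()/pwrite() idea, here is a minimal Python sketch (not yafut's code, which is C) that reads and writes at absolute byte offsets with no OOB area involved. The helper names `read_at` and `write_at` are hypothetical, and a temporary file stands in for the MTD character device.

```python
import os
import tempfile


def read_at(fd: int, offset: int, length: int) -> bytes:
    """Read exactly `length` bytes at `offset` without moving the file position."""
    data = os.pread(fd, length, offset)
    if len(data) != length:
        raise OSError(f"short read at offset {offset}: got {len(data)} of {length} bytes")
    return data


def write_at(fd: int, offset: int, data: bytes) -> None:
    """Write `data` at `offset` without moving the file position."""
    written = os.pwrite(fd, data, offset)
    if written != len(data):
        raise OSError(f"short write at offset {offset}: wrote {written} of {len(data)} bytes")


# Demo on a regular file standing in for /dev/mtdN (an assumption for testing).
with tempfile.TemporaryFile() as image:
    image.write(b"\xff" * 4096)  # erased flash reads back as 0xff
    image.flush()
    fd = image.fileno()
    write_at(fd, 1040, b"chunk")
    demo = read_at(fd, 1040, 5)

print(demo)
```

On a real MTD device the target area would have to be erased before writing; the sketch leaves that out.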

I don't have an immediate idea for heuristics that would reliably detect the parameters used for an existing Yaffs file system living on NOR flash. I guess some sort of search for reasonable-looking Yaffs tags at typically used offsets could be a start, but I prefer to keep things simple, and adding such logic to a file copy utility feels like overkill.

Instead, I would rather allow the user to specify Yaffs parameters (chunk size, erase size) via command-line options, falling back to the defaults used by mkyaffs2image.c.

This would enable "external" brute-force testing if the user felt like giving that a shot, e.g. try a set of parameters, check if a file expected to exist in the file system was successfully read; if not, try a different set of parameters; rinse and repeat. At the same time, it would not complicate Yaffs source code much, which I like.
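The "external" brute-force loop described above could look roughly like the sketch below. It uses the `-C` (chunk size) and `-B` (block size) switches that appear later in this thread. The candidate values, the `128k` spelling, and the assumption that yafut exits with a non-zero status on failure are all guesses made for illustration, not documented behaviour.

```python
import itertools
import subprocess

# Hypothetical candidates; real file systems may use other values.
CHUNK_SIZES = [512, 1024, 1040, 2048]
BLOCK_SIZES = ["64k", "128k"]


def build_command(device, name, out, chunk, block, extra_args=()):
    """Build a yafut read command for one geometry candidate."""
    return ["yafut", "-d", device, "-r", "-i", name, "-o", out,
            "-C", str(chunk), "-B", block, *extra_args]


def run_yafut(cmd):
    """Run yafut and report success; assumes exit status 0 means the file was read."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def find_geometry(device, name, out, run=run_yafut, extra_args=()):
    """Return the first (chunk, block) pair for which the read succeeds, else None."""
    for chunk, block in itertools.product(CHUNK_SIZES, BLOCK_SIZES):
        if run(build_command(device, name, out, chunk, block, extra_args)):
            return chunk, block
    return None

# Example call (needs a real device and yafut on PATH):
# find_geometry("/dev/mtd8", "bootimage", "/tmp/bootimage")
```

A zero exit status alone does not prove the geometry is right, so the output file should still be checked against a known size or checksum.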

john-tho commented 1 year ago

> AFAICT, supporting NOR flash should be possible, but it would require some work because MEMREAD/MEMWRITE ioctls do not support MTDs without OOB regions. These ioctls would have to be replaced with pread()/pwrite() calls

I had not dug into how yafut works enough to understand the plumbing to be able to even consider this. Thanks.

> Instead, I would rather allow the user to specify Yaffs parameters (chunk size, erase size) via command-line options

Yes, that was going to be my first step: NOR Yaffs settings are too arbitrary, so use user-supplied Yaffs geometry.

f00b4r0 commented 1 year ago

Quick (barely) related note: if we ever get to the point where we no longer need kernel2minor, it would be interesting to revisit this comment from mtdsplit_minor.c. We may be able to squeeze a little more space from the flash on NOR devices then :)

kempniu commented 1 year ago

@john-tho, could you by any chance share a dump of the NOR flash? dd if=/dev/mtd8 of=mtd8.bin should do, if I am not mistaken.

john-tho commented 1 year ago

OpenWrt kernel partition for an ath79 RouterBOARD wAP G-5HacT2HnD: wapg_kernel_bootimage.zip. It uses the RouterBOOT v6/v7 NPK bootimage: https://github.com/openwrt/openwrt/compare/master...john-tho:openwrt:routerboot-v7


Historic boot uses an ELF at kernel: ramips 760igs kernel: mipsel_hexs_kernel.img.zip


There is an ipq40xx OEM NOR dump here: https://forum.openwrt.org/t/support-for-mikrotik-hap-ac2/23333/5 YAFFS partition starts after at 0x100000

Cheers

kempniu commented 1 year ago

Thanks, @john-tho. Unfortunately, the last link above, which would have been the most interesting one for me, no longer seems to work, but I still managed to prepare something that I'm cautiously optimistic about. Could you perhaps give the support-nor-flash branch a spin (it is currently at 26980d416eef31178bee8d6c9901c7aaefc21372)?

I would start with checking whether this invocation produces something that looks sane in /tmp/bootimage:

yafut -v -d /dev/mtd8 -r -i bootimage -o /tmp/bootimage -C 1040 -B 64k -E

(The last three switches should enable the tool to follow the slightly funky Yaffs layout used by MikroTik devices with NOR flash.)
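As a rough illustration of what `-C 1040 -B 64k` implies for this partition, the sketch below derives the per-block chunk count and the block range from the `mtd.size = 2883584` value reported earlier. The function name is hypothetical, and the rules used (chunks never straddle a block boundary, blocks are numbered from 0) are assumptions rather than a description of yafut's code.

```python
def yaffs_geometry(partition_size: int, chunk_size: int, block_size: int) -> dict:
    """Derive basic Yaffs geometry figures from user-supplied sizes (assumed rules)."""
    chunks_per_block = block_size // chunk_size           # whole chunks only
    slack = block_size - chunks_per_block * chunk_size    # unused bytes per block
    blocks = partition_size // block_size
    return {
        "chunks_per_block": chunks_per_block,
        "slack_bytes_per_block": slack,
        "blocks": blocks,
        "end_block": blocks - 1,                          # assumes start_block is 0
    }


geometry = yaffs_geometry(2883584, 1040, 64 * 1024)
print(geometry)
```

The resulting figures of 63 chunks per block and a last block index of 43 agree with the `chunks_per_block=63` and `end_block=43` values that the tool logs later in this thread.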

Is it by any chance this file?

$ stat -c %s bootimage 
2827540
$ sha256sum bootimage 
d16f4e851498cb93066d8fc208a8293010dc1ed67cec4a47058a94e46bf268d9  bootimage

If it is, and assuming that /dev/mtd8 is a minimally-sized OpenWrt kernel partition, you won't be able to write bootimage back to that MTD using -w; attempts to do so will break the bootimage file. I hope it goes without saying at this point, but please be prepared to reflash the device in case of trouble.

For the time being, write tests would need to be performed on a larger MTD partition. I could explain why, but let's not get ahead of ourselves ;)

john-tho commented 1 year ago

> Unfortunately, the last link above, which would have been the most interesting one for me, no longer seems to work

Sorry, should have checked. I emailed you that hapac2 OEM firmware partition NOR dump, plus a pair of mine. Interesting side note there, not relevant for yafut, but I think Mikrotik may intentionally not put the kernel at start of partition. On my rb5009, I netinstalled, then netboot OpenWrt before RouterOS booted. yafut read (it matched the checksum of kernel extracted from NPK), wrote, then, reread to confirm good write of kernel. First boot of RouterOS worked fine, then reboot from within RouterOS. Next boot failed. Looking at the dump, it looked like RouterOS tried to write some files at the start of that YAFFS part, overwriting my early yafut installed kernel.

> I still managed to prepare something that I'm cautiously optimistic about.

Nice work. Good read.

It appears to segfault when displaying data_len

root@OpenWrt:~# opkg install yafut_2023-04-05-26980d41-1_mips_24kc.ipk 
Upgrading yafut on root from 2023-03-15-5f901d68-1 to 2023-04-05-26980d41-1...
Configuring yafut.
root@OpenWrt:~# yafut -v -d /dev/mtd8 -r -i bootimage -o /tmp/bootimage -C 1040 -B 64k -E
mtd.c:162: discover_mtd_parameters: /dev/mtd8: type=3, flags=0x00000c00, size=2883584, erasesize=65536, writesize=1, oobsize=0, oobavail=0
mtd.c:197: init_yaffs_geometry: /dev/mtd8: NOR flash detected
mtd.c:198: init_yaffs_geometry: /dev/mtd8: using default chunk size of 2048 bytes
mtd.c:200: init_yaffs_geometry: /dev/mtd8: using default block size of 131072 bytes
mtd.c:213: init_yaffs_geometry: /dev/mtd8: overriding chunk size to 1040 bytes
mtd.c:219: init_yaffs_geometry: /dev/mtd8: overriding block size to 65536 bytes
mtd.c:278: init_yaffs_dev: /dev/mtd8: total_bytes_per_chunk=1040, chunks_per_block=63, spare_bytes_per_chunk=0, end_block=43, is_yaffs2=1, inband_tags=1, no_tags_ecc=1
ydrv.c:299: ydrv_read_chunk_nor: pread, chunk=0, offset=0 (0x00000000), data=0 (0Segmentation fault

Matches your expected values without the -v flag:

root@OpenWrt:~# yafut -d /dev/mtd8 -r -i bootimage -o /tmp/bootimage -C 1040 -B 64k -E
root@OpenWrt:~# echo $?
0
root@OpenWrt:~# ls /tmp/bo
board.json  bootimage
root@OpenWrt:~# ls /tmp/bo
board.json  bootimage
root@OpenWrt:~# ls /tmp/bootimage 
/tmp/bootimage
root@OpenWrt:~# stat -c %s /tmp/bootimage 
-ash: stat: not found
root@OpenWrt:~# ls -altr /tmp/bootimage 
-rw-r--r--    1 root     root       2827540 Apr  6 01:06 /tmp/bootimage
root@OpenWrt:~# sha256sum /tmp/bootimage 
d16f4e851498cb93066d8fc208a8293010dc1ed67cec4a47058a94e46bf268d9  /tmp/bootimage
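Since `stat` is missing on this system, the same size and checksum comparison can also be scripted on any host with Python. This is a minimal sketch using only the standard library; the expected values are the ones quoted in this thread, and the function name `file_matches` is made up for illustration.

```python
import hashlib
import os
import tempfile

# Values quoted earlier in this thread for the extracted bootimage.
EXPECTED_SIZE = 2827540
EXPECTED_SHA256 = "d16f4e851498cb93066d8fc208a8293010dc1ed67cec4a47058a94e46bf268d9"


def file_matches(path: str, size: int, sha256: str) -> bool:
    """Return True if the file has the given size and SHA-256 digest."""
    if os.path.getsize(path) != size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in 64 KiB pieces so large files do not need to fit in memory.
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest() == sha256


# Self-check on a throwaway file, so the sketch runs without the real image.
with tempfile.NamedTemporaryFile() as sample:
    sample.write(b"abc")
    sample.flush()
    reference = hashlib.sha256(b"abc").hexdigest()
    good = file_matches(sample.name, 3, reference)
    wrong_size = file_matches(sample.name, 4, reference)

print(good, wrong_size)
```

For the real file, the call would be `file_matches("/tmp/bootimage", EXPECTED_SIZE, EXPECTED_SHA256)`.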

yafut-2698-trace.zip

> I hope it goes without saying at this point, but please be prepared to reflash the device in case of trouble.

Yes, as expected, no worries here.

I had applied Mikrotik's RouterOS GPL patch to Linux. I had not looked to see how the yaffs code might be different to standard YAFFS: https://github.com/john-tho/linux/tree/5.6.3-routeros/fs/yaffs2

The patch is from: https://forum.openwrt.org/t/add-support-for-mikrotik-rb5009ug/104391/108 or https://box.mikrotik.com/d/81912835977544a291c9/

Cheers,

f00b4r0 commented 1 year ago

> > Unfortunately, the last link above, which would have been the most interesting one for me, no longer seems to work

> Sorry, should have checked. I emailed you that hapac2 OEM firmware partition NOR dump, plus a pair of mine. Interesting side note there, not relevant for yafut, but I think Mikrotik may intentionally not put the kernel at start of partition.

At least on ath79, that's indeed never the case. That's the reason why we have to completely wipe the entire firmware partition when installing from initramfs, otherwise routerboot will first find the old kernel signature and try to boot that, resulting in the dreaded boot loop.

kempniu commented 1 year ago

> Sorry, should have checked. I emailed you that hapac2 OEM firmware partition NOR dump, plus a pair of mine. Interesting side note there, not relevant for yafut, but I think Mikrotik may intentionally not put the kernel at start of partition. On my rb5009, I netinstalled, then netboot OpenWrt before RouterOS booted. yafut read (it matched the checksum of kernel extracted from NPK), wrote, then, reread to confirm good write of kernel. First boot of RouterOS worked fine, then reboot from within RouterOS. Next boot failed. Looking at the dump, it looked like RouterOS tried to write some files at the start of that YAFFS part, overwriting my early yafut installed kernel.

Thanks, @john-tho, this is all very interesting information. Purely in the interest of preventing scope creep for this issue, I am focusing on this part for now:

> yafut read (...), wrote, then, reread to confirm good write of kernel. First boot of RouterOS worked fine

:-)

> It appears to segfault when displaying data_len

Oops, this was a data type issue for offset variables that caused logging functions processing variadic arguments to run amok on some platforms. Much fun, very C.
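This class of bug can be simulated without C. The sketch below is not yafut's code; it only models the general effect of passing a 64-bit value to a variadic function whose format string expects 32-bit integers on a big-endian 32-bit platform. The values are made up and ABI alignment padding is ignored.

```python
import struct

# Caller side: a 64-bit offset followed by a 32-bit length, laid out
# big-endian as on ath79 (MIPS). Alignment padding is ignored here.
offset, length = 0x10000, 1040
pushed = struct.pack(">qi", offset, length)

# Callee side: a format string with two 32-bit integer conversions reads
# two 32-bit slots, so it sees the two halves of the offset instead.
seen_first, seen_second = struct.unpack_from(">ii", pushed)

print(seen_first, seen_second)  # the real length is never reached
```

Every later conversion is shifted by one slot in the same way. If one of them expects a pointer (for example a string argument), an integer ends up being dereferenced, which is one plausible route to a segfault like the one above.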

Could you please give 5124b7085e2287bdc14898c3146717dcd35a164c a shot with -v to make sure it does not segfault any more? It should also give you the same SHA-256 checksum for bootimage, of course :-)

If this works, I will chop up the support-nor-flash branch into smaller chunks, merge it, close this issue, and only then look at the intricacies of the way MikroTik devices handle the firmware partition.

john-tho commented 1 year ago

> Could you please give 5124b70 a shot with -v to make sure it does not segfault any more? It should also give you the same SHA-256 checksum for bootimage, of course :-)

-v worked fine, and output as expected with 5124b

yafut-5124-log-ath79.zip

I also compiled and ran a yafut read on my 6.1 testing mt7621 (little endian) device. No errors shown, and the resulting bootimage output passes basic sanity checks.

Cheers

kempniu commented 1 year ago

Awesome, thanks again for all the testing, @john-tho :heart:

main now supports reading from and writing to NOR flash.

john-tho commented 1 year ago

Hi Michał, Another not-an-issue / no priority. Feel free to (re)move this as you wish. Looking to document what we want to be able to do to use yafut as part of a NOR sysupgrade process.

I did some testing manually replacing kernel on a NOR device after you merged those NOR changes. Some time ago, so hopefully not misremembering too much.

I don't think I tried erasing the kernel first and then using yafut to write the new kernel to the parent firmware partition (without checkpointing). I should check where Yaffs writes in that situation.

Cheers

kempniu commented 1 year ago

Hi @john-tho,

I need to do some NOR experiments before moving on with this work, so that I have a better understanding of all the moving parts. I hope to have some time for this in the September-October time frame.

Given that there is a new objective:

> Looking to document what we want to be able to do to use yafut as part of a NOR sysupgrade process.

I would definitely prefer to at least go with a separate GitHub issue as this one is already closed. However, since this feels a bit like an OpenWrt-specific topic, I might create an issue in the OpenWrt project when the time comes to discuss design ideas.

Thanks for sharing your experiences above.

kempniu commented 1 year ago

@john-tho: https://github.com/openwrt/openwrt/pull/13453 sounds like a reasonable place to discuss this further.