amouiche / qnap_mtd_resize_for_bullseye

Script for resizing MTD partitions on a QNAP device in order to be able to upgrade from buster to bullseye
GNU General Public License v2.0

Jumping straight from QTS to bookworm #51

Open graemev opened 7 months ago

graemev commented 7 months ago

I migrated my TS412 to stretch, then to bullseye (using the saboteur layout), then switched to the Mouiche layout on bullseye (using https://github.com/graemev/qnap_mtd_resize_for_bullseye_saboteur_mod [pull request exists]), and finally to bookworm (using the Mouiche layout) ... when BANG, the motherboard died. Reset + Power does not beep, and there is nothing on the serial console ( https://forum.qnap.com/viewtopic.php?p=853963#p853963 ), which had been working fine.

But miracle of miracles, I was offered a new board. It is installed and running QTS. I now have various HDDs with bookworm installed, and a number of PiXE images that can be installed [ https://forum.qnap.com/viewtopic.php?t=171411 ].

So I set:

setenv   bootcmd uart1 0x68\;cp.l 0xf8100000 0x800000 0xc0000\;cp.l 0xf8400000 0xb00000 0x300000\;bootm 0x800000\;echo Kernel_legacy layout fallback\;bootm 0x900000

setenv   bootargs console=ttyS0,115200 root=/dev/ram initrd=0xb00000,0xc00000 ramdisk=34816 cmdlinepart.mtdparts=spi0.0:512k@0(uboot)ro,3M@0x100000(Kernel),12M@0x400000(RootFS1),2M@0x200000(Kernel_legacy),256k@0x80000(U-Boot_Config),256k@0xc0000(NAS_Config) mtdparts=spi0.0:512k@0(uboot)ro,3M@0x100000(Kernel),12M@0x400000(RootFS1),2M@0x200000(Kernel_legacy),256k@0x80000(U-Boot_Config),256k@0xc0000(NAS_Config)

saveenv

I need to remove JP1 before reset + power on works (so I lose the console). It seems JP1 selects an "engineering mode", and the reset does not work in that mode?

Then I do reset + power on: long beep ... I watch via wireshark as it installs via TFTP, then a double beep ... but no system. Running "boot" from the u-boot console (having reinstalled JP1):

It fails with

Marvell>> boot
Unknown command 'uart1' - try 'help'
## Booting image at 00800000 ...
Bad Magic Number
Kernel_legacy layout fallback
## Booting image at 00900000 ...
Bad Magic Number

Looking at 00800000, 00900000 and also f8100000, they are all zeros. Checking the same locations in the PiXE image, they look good (2705 1956), so the data is not getting into the flash memory via TFTP. I was using this mechanism with my old board just fine.

So I'm wondering: by going directly from QTS to bookworm, with no Debian install, have I missed something, e.g. turning off some flash protection, so that the data is not getting written?
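
The "2705 1956" check above can also be run against the image file on the TFTP server. A minimal sketch, assuming both kernel and initrd carry a legacy uImage header (magic bytes 27 05 19 56) and that the image follows the Mouiche layout (kernel at 0x100000, initrd at 0x400000); `has_uimage_magic` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Sketch: report whether a legacy uImage header (magic 27 05 19 56)
# sits at a given byte offset of a file.
# has_uimage_magic is a hypothetical helper, not part of any package.
has_uimage_magic() {
    file=$1
    offset=$(( $2 ))          # accepts decimal or 0x-prefixed hex
    # Dump 4 bytes at the offset as hex and strip the whitespace.
    magic=$(od -An -tx1 -j "$offset" -N 4 "$file" | tr -d ' \n')
    [ "$magic" = "27051956" ]
}

# Example (offsets are an assumption based on the Mouiche layout):
#   has_uimage_magic F_TS-412-MOUICHE_BOOKWORM 0x100000 && echo "kernel header found"
#   has_uimage_magic F_TS-412-MOUICHE_BOOKWORM 0x400000 && echo "initrd header found"
```

This only confirms the image on the server looks sane; it says nothing about what actually reached the flash.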

amouiche commented 7 months ago

Hi.

On Wed, 2023-11-15 at 13:50 -0800, graemev wrote:

 it installs via TFTP

Can you detail the uboot command you use for this operation?

Arnaud

graemev commented 7 months ago

I guess I was unclear WRT "it installs via TFTP"

I was referring to the process , whereby:

  1. I remove JP1
  2. I press and hold the reset pin
  3. I power on the TS412

The TS412 then performs a variant of PXE boot (but storing the image rather than just booting it)

(you describe it in: Recovery and I do so in: PiXe Recovery )

I have a number of stored images, the one I'm initially using is called

F_TS-412-MOUICHE_bookworm_6.1.0-13-marvell

It's symlinked in /srv/tftp as

F_TS-412-MOUICHE_BOOKWORM -> F_TS-412-MOUICHE_bookworm_6.1.0-13-marvell

I've used this setup (but not this exact image) many times to restore my old board TS412.

The image was created with:


graeme@real:~/QNAP412-WIP/The-grab-scripts$ cat grab-all-mdt-with-mouiche.sh 
modprobe mtdblock
cat /dev/mtd0 > mtd0
cat /dev/mtd1 > mtd1
cat /dev/mtd2 > mtd2

cat /dev/mtd4 > mtd4
cat /dev/mtd5 > mtd5

. /etc/os-release 
echo $VERSION_CODENAME

KERNEL=$(uname -r)

NAME="F_TS-412-MOUICHE_${VERSION_CODENAME}_${KERNEL}"

#cat mtd0 mtd4 mtd5 mtd1 mtd2 mtd3 > ${NAME}  # This is layout on Original & SABOTEUR

cat mtd0 mtd4 mtd5 mtd1 mtd2       > ${NAME}  # This is layout with MOUICHE
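
For reference, the concatenation order above determines where each partition lands inside the image file. A small sketch that prints those offsets, assuming the partition sizes from the mtdparts string earlier in this thread and assuming mtd0..mtd5 are numbered in the order that string lists them (so mtd1 = Kernel, mtd2 = RootFS1, mtd4 = U-Boot_Config, mtd5 = NAS_Config); `image_offsets` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: walk the partitions in the order they are concatenated and
# print where each one starts in the resulting image file.
# Sizes come from the mtdparts string; the mtdN-to-name mapping is an
# assumption based on the order of that string.
image_offsets() {
    offset=0
    for part in mtd0:0x80000 mtd4:0x40000 mtd5:0x40000 mtd1:0x300000 mtd2:0xc00000
    do
        name=${part%%:*}      # text before the colon
        size=${part##*:}      # text after the colon
        printf '%s 0x%x\n' "$name" "$offset"
        offset=$(( offset + $size ))
    done
    printf 'total 0x%x\n' "$offset"
}

image_offsets
```

With these sizes the kernel (mtd1) starts at 0x100000 in the file, the initrd (mtd2) at 0x400000, and the total is 0x1000000 (16MB).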

I traced PiXE process with wireshark and I see:

Dynamic Host Configuration Protocol (Offer)
...
    Boot file name: F_TS-412-MOUICHE_BOOKWORM
    Magic cookie: DHCP
    Option: (53) DHCP Message Type (Offer)
...
    Padding: 000000000000

Then coming back a tftp request:

Trivial File Transfer Protocol
    Opcode: Read Request (1)
    Source File: F_TS-412-MOUICHE_BOOKWORM
    Type: octet
    Option: timeout = 5

It takes a few minutes copying, then double beeps, but does not come up. Getting back to the uboot prompt, it does not look as if a good image is present (you may have a classic "tell" to look for?)

I'm reasonably unfamiliar with uboot "shell" commands, and I'm finding it quite hard to get any documentation on them. I've found references, but they often simply restate the parameters.

e.g. bootp [loadAddress] [bootFilename]. It does not say, but I assume it stores the file at loadAddress then jumps to loadAddress ... so it would have to be a kernel image? [ABTW, I guess one could make a "diskless" boot QNAP like this ... useful given how fragile the flash memory appears to be]

Marvell>> version

U-Boot 1.1.4 (Oct 27 2010 - 16:50:30) Marvell version: 3.4.4

But my current guess is that the QNAP has "protected" some of the flash, so the writes are failing. During a "normal" Debian install this protection is turned off.

amouiche commented 7 months ago

On Thu, 2023-11-16 at 04:21 -0800, graemev wrote:

 But my current guess is that the QNAP has "protected" some of the flash, so the writes are failing. During a "normal" Debian install this protection is turned off.

You should be able to do the whole "tftp download and flash erase+write" manually with the "sflash" and "tftpboot" uboot commands. If you use sflash, be careful not to erase/write the first 512KB (the first 2 flash sectors), otherwise you will brick the bootloader. => DON'T USE "sflash erase all"

"sflash info" tells you whether your flash is write protected. To write flash content correctly, you must first erase the sectors you are going to write.

"sflash protect off" may disable the protection if there is one.

For example, to flash a kernel image at offset 0x100000 of the flash, you can do something like:

tftpboot 0x800000 some_file
sflash erase 4-15          => erase sectors 4 (offset = 0x100000 bytes) to 16 (excluded), so 3MB in total. This operation is looooong
sflash write 0x800000 0x100000 0x300000    => write from RAM 0x800000 to flash offset 0x100000 with a size of 3MB

Arnaud
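
The sector numbers in the example follow from dividing the flash offset by the sector size. A minimal sketch of that arithmetic, assuming 256K sectors (the value "sflash info" reports elsewhere in this thread) and refusing any range that touches the first 2 sectors; `sector_range` is a hypothetical helper run on a PC, not a uboot command:

```shell
#!/bin/sh
# Sketch: convert a flash offset and length into the first and last
# sector numbers to erase. SECTOR_SIZE assumes 256K sectors.
SECTOR_SIZE=$(( 0x40000 ))

sector_range() {
    offset=$(( $1 ))
    length=$(( $2 ))
    first=$(( offset / SECTOR_SIZE ))
    last=$(( (offset + length - 1) / SECTOR_SIZE ))
    # Sectors 0 and 1 hold the bootloader (first 512KB): never erase them.
    if [ "$first" -lt 2 ]; then
        echo "refusing: range touches the bootloader sectors" >&2
        return 1
    fi
    echo "${first}-${last}"
}

sector_range 0x100000 0x300000   # 3MB kernel partition at 0x100000
```

For the 3MB kernel at 0x100000 this prints 4-15, matching the range used in the example above.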


graemev commented 7 months ago

1st:

Marvell>> sflash info

Flash Base Address  : 0xf8000000
Flash Model         : ST M25P128
Manufacturer ID     : 0x20
Device Id           : 0x2018                                                    
Sector Size         : 256K                                                      
Number of sectors   : 64                                                        
Page Size           : 256                                                       
Write Protection    : Off           

Scotches that theory :-(

2nd: I don't have a "map" of RAM (or of memory in general) for this machine. I was going to ask where in RAM would be a good place to tftp the file(s), when it occurred to me that their final target locations would be good (and a better test). I'm obviously nervous about manually flashing addresses (this being my 2nd motherboard!).

If I tftp the kernel+initrd into the correct locations in RAM, could I just boot? The $bootargs would need to be correct, but that's the same one I need in any case (and have already set), so I could "just" tftp the files in and then "bootm".

If I get a working system, can I then just use flash-kernel to write the new kernel and initrd to flash?

Does this sound right (and a bit safer)?

ABTW: does it seem a good idea to (flash) protect 0x000000-0x0BFFFF ?

amouiche commented 7 months ago

On Thu, 2023-11-16 at 08:37 -0800, graemev wrote:

 2nd: I don't have a "map" of RAM (memory in general) in this machine. I was going to ask where a good place (RAM) is to tftp the file(s) when it occurred to me their final target locations would be good (+ a better test ...) I'm obviously nervous of manually flashing addresses (this being my 2nd motherboard!)

bootcmd is using RAM starting at 0x800000 to copy images from the flash, then to bootm into them. You can use this address as a scratch area for any usage. I guess RAM addresses start at 0, but the uboot bootloader may be loaded into, or use, the start of the RAM for its own purposes.

 If I tftp the kernel+initrd into the correct locations in ram ... I could just boot ? the $bootargs would need to be correct ...but that's the same one I need in any case (and have already set) ...so I could "just" tftp the files in then "bootm" If I get a working system, I can just use flash-kernel to write the new kernel and initrd to flash?

Yes, you can. Just fake the 'bootcmd', replacing 'cp' by tftpboot loads, and it's done.

If bootcmd is:

uart1 0x68\;cp.l 0xf8100000 0x800000 0xc0000\;cp.l 0xf8400000 0xb00000 0x300000\;bootm 0x800000\;echo Kernel_legacy layout fallback\;bootm 0x900000

you can do something like:

uart1 0x68
tftpboot 0x800000 kernel_image_file
tftpboot 0xb00000 initrd_image_file
bootm 0x800000

graemev commented 7 months ago

Thanks for that. I initially tried just tftp'ing the whole image to 0x700000 (so the kernel would end up at 0x800000), but the tftp did not complete (I concluded 0x700000-0x7FFFFF was being used by uboot). Anyhow, I did it in 2 parts as you suggested (after fixing serverip, ipaddr and netmask):


Marvell>> uart1 0x68
Unknown command 'uart1' - try 'help'
Marvell>> tftpboot 0x800000 mtd1
Using egiga0 device
TFTP from server 10.117.0.152; our IP address is 10.117.1.232
Filename 'mtd1'.
Load address: 0x800000
Loading: #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         ##############################
done
Bytes transferred = 3145728 (300000 hex)
Marvell>> tftpboot 0xb00000 mtd2
Using egiga0 device
TFTP from server 10.117.0.152; our IP address is 10.117.1.232
Filename 'mtd2'.
Load address: 0xb00000
Loading: #################################################################
         ####.....many such

         #####################################################
done
Bytes transferred = 12582912 (c00000 hex)
Marvell>> 

I was able to boot this ... need to sort out a few issues with IP addresses etc but this looks like a workable situation ... many thanks.
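
As a cross-check on the two transfers above, the kernel (3145728 bytes at 0x800000) ends exactly where the initrd (12582912 bytes at 0xb00000) begins, so the two loads do not overwrite each other. A minimal sketch of that check; `regions_overlap` is a hypothetical helper run on a PC, not a uboot command:

```shell
#!/bin/sh
# Sketch: check whether two RAM load regions, each given as
# (start, size), overlap. Values may be decimal or 0x-prefixed hex.
regions_overlap() {
    a_start=$(( $1 )); a_end=$(( $1 + $2 ))
    b_start=$(( $3 )); b_end=$(( $3 + $4 ))
    # Half-open intervals [start, end) overlap when each one starts
    # before the other one ends.
    [ "$a_start" -lt "$b_end" ] && [ "$b_start" -lt "$a_end" ]
}

# Kernel: 0x800000, 3145728 bytes; initrd: 0xb00000, 12582912 bytes.
if regions_overlap 0x800000 3145728 0xb00000 12582912; then
    echo "overlap"
else
    echo "no overlap"
fi
```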

The UART seems less stable than on my old motherboard; I often need to power off the QNAP to get it working (I changed cables, USB gadget etc., so it seems to be the motherboard).

I've yet to flash the kernel. Given my bad experience with my old board I'm treading carefully. I wrote a couple of quick scripts to verify the flash. The 2nd time I ran it, the result was worrying:

root@qnap412u:/home/graeme/NEW-MOTHERBOARD# ./validate-flash                                                                         
mtd0.orig: OK                                                                                                                        
/dev/mtd0 matches the saved copy                                                                                                     
mtd1.orig: OK                                                                                                                        
/dev/mtd1 matches the saved copy                                                                                                     
mtd2.orig: OK                                                                                                                        
/dev/mtd2 mtd2.orig differ: byte 4097, line 15                                                                                       
******* /dev/mtd2 differs from the saved copy                                                                                        
mtd4.orig: OK                                                                                                                        
/dev/mtd4 matches the saved copy                                                                                                     
mtd5.orig: OK                                                                                                                        
/dev/mtd5 matches the saved copy                                                                                                     
The tests failed, DO NOT REBOOT without resolving the FLASH status                                                                   

I ran it 2 more times and it reported OK, so it was just one occasion where the FLASH misread. This is, however, worrying.

root@qnap412u:/home/graeme/NEW-MOTHERBOARD# cat ./validate-flash                                                                     
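#! /bin/bash -ue
# Compare each live MTD partition against its saved copy (mtd3 is not checked).

OK="y"

for i in 0 1 2 4 5
do
    # First confirm the saved copy itself is intact.
    if cksum -c "mtd${i}.cksum" ; then
        # Then compare the live partition byte-for-byte with the saved copy.
        if cmp "/dev/mtd${i}" "mtd${i}.orig" ; then
            echo "/dev/mtd${i} matches the saved copy"
        else
            echo "******* /dev/mtd${i} differs from the saved copy"
            OK="n"
        fi
    else
        echo "Bad checksum on saved mtd${i}"
        OK="n"    # a saved copy that cannot be verified also counts as a failure
    fi
done

if [[ "${OK}" != "y" ]] ; then
    echo "The tests failed, DO NOT REBOOT without resolving the FLASH status"
    exit 1
fi

exit 0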
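
Since the mismatch above showed up on only one run, a repeated-read check can help separate an intermittent misread from a real change in the flash contents. A minimal sketch, not one of the scripts mentioned above; `stable_read` is a hypothetical helper that reads the same file or device several times and compares the checksums:

```shell
#!/bin/sh
# Sketch: read a file or device several times and report whether every
# read produced the same checksum. stable_read is a hypothetical helper.
stable_read() {
    target=$1
    reads=${2:-3}             # number of reads, default 3
    first=$(cksum < "$target")
    n=1
    while [ "$n" -lt "$reads" ]; do
        if [ "$(cksum < "$target")" != "$first" ]; then
            echo "$target: read $(( n + 1 )) differs from read 1"
            return 1
        fi
        n=$(( n + 1 ))
    done
    echo "$target: $reads reads identical"
}

# Example: stable_read /dev/mtd2 5
```

Identical checksums across reads do not prove the contents are correct, only that the reads were consistent; comparing against the saved copy is still needed.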

I guess this concludes this issue. Many thanks for your help. Do you find the hardware as fragile as mine appears or am I just unlucky?

graemev commented 7 months ago

AHHH!! Finally cracked it

The line in your doc:

When reset button is pressed continuously during the boot, u-boot simply:

download an image file using TFTP from IP 192.168.0.1 (it is using address 192.168.0.65 itself)
Erase and program flash from address 0x200000 to the end using the image downloaded (also starting at offset 0x200000 of the image file)

So it takes offset 0x200000 into F_TS-412-MOUICHE_bullseye_5.10.0-23-marvell (which is midway into the kernel) and writes it to 0xF8200000, but in the MOUICHE layout the kernel starts at 0xF8100000.

Checking:

So the data @ MTD1+00100000                                    is b9df 27f8 df50 13e6
The data @F_TS-412-MOUICHE_bullseye_5.10.0-23-marvell+00200000 is b9df 27f8 df50 13e6
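
The same point as arithmetic: recovery only rewrites flash from offset 0x200000 onward, so a partition that starts below that offset keeps its old beginning. A minimal sketch using the partition offsets from the mtdparts string in this thread; `covered_by_recovery` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: the recovery path described above rewrites flash from offset
# 0x200000 to the end, so anything starting below that is left untouched.
RECOVERY_START=$(( 0x200000 ))

covered_by_recovery() {
    [ $(( $1 )) -ge "$RECOVERY_START" ]
}

# Partition start offsets taken from the mtdparts string.
for entry in Kernel:0x100000 RootFS1:0x400000 Kernel_legacy:0x200000
do
    name=${entry%%:*}
    start=${entry##*:}
    if covered_by_recovery "$start"; then
        echo "$name at $start: start is rewritten by recovery"
    else
        echo "$name at $start: start is NOT rewritten by recovery"
    fi
done
```

Only the Kernel partition at 0x100000 falls outside the rewritten range, which is consistent with the uImage header never reaching the flash.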

So, in effect, once you switch to the MOUICHE layout you can no longer use the PiXE recovery. I'm wondering if I want to switch (back) to the SABOTEUR layout, since I've now found it's possible to reduce the initrd to about 8MB (as described in QNAP and Stackexchange), or even to another combination with an 11MB rootfs1 + 3MB kernel starting at 200000.

ABTW, I note uboot has: loadaddr=0x02000000