topjohnwu / Magisk

The Magic Mask for Android
GNU General Public License v3.0

magiskboot can't dump /dev/bootimg (NAND device) #1526

Closed Msprg closed 4 years ago

Msprg commented 5 years ago

Hi, I want to install Magisk on my older tablet, but installing the zip fails with a message along the lines of it not knowing how to patch my boot.img, and that I could provide you with my boot.img. So here I am. Device: Lenovo A7-40 (Lenovo a3500-fl), running stock Android KitKat 4.4.2.

boot.img: https://drive.google.com/open?id=1xtBvCOUIJIMnGav6I09jT__92IlObSu9

Thank you!

topjohnwu commented 5 years ago

@osm0sis may you check this out for me? If this is nothing trivial to fix I'll simply close it

osm0sis commented 5 years ago

@topjohnwu, it's MTK and magiskboot itself unpacks it just fine:

Parsing boot image: [boot.img]
HEADER_VER      [0]
KERNEL_SZ       [3418456]
RAMDISK_SZ      [828601]
SECOND_SZ       [0]
EXTRA_SZ        [0]
RECOV_DTBO_SZ   [0]
DTB             [0]
PAGESIZE        [2048]
NAME            [23]
CMDLINE         []
CHECKSUM        [85f1358c0be651ec5c99d0ea98a5d8377db48641]
MTK_KERNEL_HDR
KERNEL          [3417944]
NAME            [KERNEL]
MTK_RAMDISK_HDR
RAMDISK         [828089]
NAME            [ROOTFS]
KERNEL_FMT      [raw]
RAMDISK_FMT     [gzip]
:/data/local/tmp #

Also appears to patch just fine in latest Magisk Manager Canary:

- Existing zip found
- Copying image to cache
1022+1 records in
1022+1 records out
1047368 bytes (1022.8KB) copied, 0.003241 seconds, 308.2MB/s
- Unpacking boot image
Parsing boot image: [/data/user_de/0/com.zImlCG.k1ujOAAnM/install/boot.img]
HEADER_VER      [0]
KERNEL_SZ       [3418456]
RAMDISK_SZ      [828601]
SECOND_SZ       [0]
EXTRA_SZ        [0]
RECOV_DTBO_SZ   [0]
DTB             [0]
PAGESIZE        [2048]
NAME            [23]
CMDLINE         []
CHECKSUM        [85f1358c0be651ec5c99d0ea98a5d8377db48641]
MTK_KERNEL_HDR
KERNEL          [3417944]
NAME            [KERNEL]
MTK_RAMDISK_HDR
RAMDISK         [828089]
NAME            [ROOTFS]
KERNEL_FMT      [raw]
RAMDISK_FMT     [gzip]
- Checking ramdisk status
Loading cpio: [ramdisk.cpio]
- Stock boot image detected
- Backing up stock boot image
Loading cpio: [ramdisk.cpio]
- Patching ramdisk
Add entry [init] (0750)
Patch with flag KEEPVERITY=[true] KEEPFORCEENCRYPT=[true]
Loading cpio: [ramdisk.cpio.orig]
Backup mismatch entry: [init] -> [.backup/init]
Create directory [.backup] (0000)
Add entry [.backup/.magisk] (0000)
Dump cpio: [ramdisk.cpio]
- Repacking boot image
Parsing boot image: [/data/user_de/0/com.zImlCG.k1ujOAAnM/install/boot.img]
HEADER_VER      [0]
KERNEL_SZ       [3418456]
RAMDISK_SZ      [828601]
SECOND_SZ       [0]
EXTRA_SZ        [0]
RECOV_DTBO_SZ   [0]
DTB             [0]
PAGESIZE        [2048]
NAME            [23]
CMDLINE         []
CHECKSUM        [85f1358c0be651ec5c99d0ea98a5d8377db48641]
MTK_KERNEL_HDR
KERNEL          [3417944]
NAME            [KERNEL]
MTK_RAMDISK_HDR
RAMDISK         [828089]
NAME            [ROOTFS]
KERNEL_FMT      [raw]
RAMDISK_FMT     [gzip]
Repack to boot image: [new-boot.img]
HEADER_VER      [0]
KERNEL_SZ       [3418456]
RAMDISK_SZ      [1059905]
SECOND_SZ       [0]
EXTRA_SZ        [0]
RECOV_DTBO_SZ   [0]
DTB             [0]
PAGESIZE        [2048]
NAME            [23]
CMDLINE         []
CHECKSUM        [c5c0f93db2040cac992ca71e086aa3c33dc76fce]

****************************
 Output file is placed in 
 /storage/emulated/0/Download/magisk_patched.img 
****************************
- All done!

So, hard to know with no logs, but.. user error?

Edit: Maybe it couldn't find the boot partition or something. @Msprg, you need to upload /tmp/recovery.log from directly after flashing the latest Magisk Canary zip, before rebooting.

Msprg commented 5 years ago

Hi, I am installing Magisk 19.2 via TWRP recovery. I have not tried patching the boot image indirectly yet, just the direct install via recovery. My logs: recovery Magisk 19.2 release.log, recovery Magisk v19.3-76c88913 DEBUG.log (CANARY), recovery Magisk v19.3-76c88913 RELEASE.log (CANARY)

Edit: recovery output (for all three is the same):

Skipping Digest check: no Digest file found
*************************
* Magisk v19.2 Installer
*************************
- Mounting system
- Target image: /dev/bootimg
- Device platform: arm
- Constructing enviroment
- Unpacking boot image
!  Unsupported/Unknown image format
- Unmounting partitions
Updater process ended with ERROR: 1

Msprg commented 5 years ago

I successfully patched my boot.img with Magisk Manager 7.2.0 (213) // Magisk 19.2 (19200) indirectly:

- Device platform: armeabi-v7a
- Downloading zip
... 100%
- Copying image to cache
- Unpacking boot image
- Checking ramdisk status
- Stock boot image detected
- Backing up stock boot image
- Patching ramdisk
- Repacking boot image

***************************
 Output file is placed in
.../magisk_patched.img
***************************
- All done!

Flashed modified boot image through TWRP recovery, no bootloop, and... The Magisk Manager now crashes... every time I try to open it...

Msprg commented 5 years ago

Meanwhile I made some progress. I remember using SuperSU, but that was some time ago, and I recall doing a full unroot from SuperSU. So I restored the stock boot.img and applied @osm0sis's script to remove SuperSU and unroot. (Thank you!) Then I flashed the boot.img modified by Magisk and rebooted. After bootup, the Magisk Manager app had vanished ( ...? ), so I installed it again from the apk file (Android installed it as an update of an app, not as a new app ( ...? )). After that, Magisk Manager demanded some "additional setup" and rebooted.

It looks to be working fine so far... We will see... Should I still send you the Magisk.log, @osm0sis?

osm0sis commented 5 years ago

Okay, so aside from the bootsigner issue in recovery @topjohnwu will need to check into, sounding like some user error overall and now you've got it working. :+1:

No magisk.log required now.

Msprg commented 5 years ago

Well, user error... I followed the usual procedure and properly uninstalled SuperSU (according to the maybe-unofficial wiki, a "full unroot" from within the SuperSU app is enough...), and then tried to install the latest stable Magisk in recovery, which failed on its own... Personally/subjectively, I do not see user error there 😄 but never mind, at least I got a somewhat working Magisk, so as I said, we will see if it keeps working or breaks somehow. Hopefully not 😄

Nevertheless, I believe in you guys, you did amazing work with not only Magisk, but other projects too. Thanks so far, and keep up the good work @topjohnwu, @osm0sis!

osm0sis commented 5 years ago

This should probably be reopened. It's actually failing with that /dev/bootimg location, so maybe something needs fixing in the partition detection script.

Msprg commented 5 years ago

Okay, so are you going to work on fixing the issue with locating the bootimg, or do you want me to do something for you?

Also, shall I reopen right now then?

osm0sis commented 5 years ago

Yeah. My current guess is it's an issue with dumping your /dev/bootimg partition.

topjohnwu commented 5 years ago

@Msprg in case you're not aware, you can use the patch boot image method to install Magisk on your device.

Msprg commented 5 years ago

Hi @topjohnwu, I think you didn't read the whole thread, but nevermind, just check this: https://github.com/topjohnwu/Magisk/issues/1526#issuecomment-497340549 And maybe some other posts if you are interested and have time.

Anyways, thank you for responding!

HemanthJabalpuri commented 5 years ago

1103

Msprg commented 5 years ago

So it looks like we have some problems overall... It looks like no app can request superuser access. Normally, when an app tries to get root privileges, the Magisk Manager app runs and displays a confirmation dialog. What happens for me is that after an app requests root privileges, nothing happens, and the app then decides the request was denied even though it wasn't. The app is not granted su privileges, just as if the request had been denied.

Also, when I run the "su" command in the terminal emulator, it hangs for about 3 minutes and then doesn't get superuser anyway. I get no confirmation dialog from Magisk, and setting the automatic response to "allow" changes nothing.

osm0sis commented 5 years ago

Well, on the flashing front first, you guys need to figure out why it fails. Since you have Magisk installed manually right now this should be pretty easy to test.

In recovery go to terminal in advanced and type:

/data/adb/magisk/magiskboot unpack /dev/bootimg

And let us know what the output is.

Msprg commented 5 years ago

Okay, let's see:

/data/adb/magisk # ./magiskboot unpack /dev/bootimg
Parsing boot image: [/dev/bootimg]
/data/adb/magisk # 

There is neither an explicit success nor an explicit error, so regardless of what magiskboot did, I think it succeeded. Also, as you can see, I did not call magiskboot by an absolute path, but I do not think this is an issue. If it is, please tell me.

TY.

osm0sis commented 5 years ago

Okay, well you probably shouldn't have unpacked it into /data/adb/magisk, hence the path, but sure. What's in /data/adb/magisk now after that command?

Msprg commented 5 years ago

Ah, so this is what the command does. :rofl: Well, never mind, I have rebooted the tablet in the meantime, so maybe Magisk did some cleanup, but I tried the following:

/data/adb/magisk # ls
addon.d.sh
boot_patch.sh
busybox
chromeos        //actually a folder
magisk
magiskboot
magiskinit
util_functions.sh
/data/adb/magisk # mkdir unpacked_boot
/data/adb/magisk # cd unpacked_boot/
/data/adb/magisk/unpacked_boot # ../magiskboot unpack /dev/bootimg 
Parsing boot image: [/dev/bootimg]
/data/adb/magisk/unpacked_boot # ls

So IF I am doing this correctly, it looks like magiskboot unpacked ... nothing? I do not know what the recovery script in the Magisk zip actually does, but if it calls this command as I just did and expects it to output some file(s), it looks like the Magisk recovery script fails because the expected bootimg files are not present... Is that possible? Could it be a TWRP recovery error?

osm0sis commented 5 years ago

Yeah looks like it tries to parse/unpack it but then silently fails and unpacks nothing, (cc: @topjohnwu).

Next, can you do cat /dev/bootimg > /sdcard/boot-cat.img and dd if=/dev/bootimg of=/sdcard/boot-dd.img from recovery and then upload me boot-cat.img and boot-dd.img from your sdcard?

Msprg commented 5 years ago

@osm0sis On it, I just do not know why the cat-ting :smile_cat: is taking forever to finish...

Oh, so this is why...

/dev # cat bootimg > /sdcard/bootimg_via_cat_MagiskPatched.img
cat: write error: No space left on device
/dev # 

Though I do not understand: I have a bit more free space than 6MB (the size of the boot partition shown on the TWRP backup screen). Never mind, I'll try writing it to external_sd...

/dev # cat bootimg > /external_sd/bootimg_via_cat_MagiskPatched.img
cat: write error: No space left on device

Okay, so no luck there, I will try with dd now...

/dev # dd if=/dev/bootimg of=/external_sd/bootimg_via_dd_MagiskPatched.img
^C
/dev # 

So I did not let dd finish, because it had already taken more than 5 minutes, and I can see it would just end up filling my SD card as cat did... So there is clearly a problem here. At least I am giving you the TWRP backup of the boot partition. Hope it helps somehow. https://drive.google.com/open?id=1nNOm50pEKesMZD4w4bv-VsWcnqy75pVs

osm0sis commented 5 years ago

It doesn't help. The question is how TWRP backs it up successfully when neither dd nor cat can.

Try nanddump -f /sdcard/boot-nand.img /dev/bootimg and upload me the file it creates please.

Msprg commented 5 years ago

Yeah, nanddump worked fine. bootimg-via-nanddump-MagiskPatched.zip

osm0sis commented 5 years ago

@topjohnwu, you may have to adapt to stream from nanddump and nandwrite from mtd-utils as fallbacks for cat/dd-style streaming to fully support nand devices.

@Msprg can you change the title of this issue to "magiskboot can't dump /dev/bootimg" ?

Msprg commented 5 years ago

@Msprg can you change the title of this issue to "magiskboot can't dump /dev/bootimg" ?

You're welcome!

XRevan86 commented 5 years ago

This looks just like https://github.com/topjohnwu/Magisk/issues/226.

osm0sis commented 5 years ago

Yup, it absolutely is. Looks like fixing this properly would close quite a number of issues.

topjohnwu commented 5 years ago

@osm0sis is there any way to detect whether the target file is nand or not? If there isn't a clean and easy way, unfortunately I won't be able to add a fix into Magisk since it won't be general.

osm0sis commented 5 years ago

@topjohnwu Not sure. I've never owned one to play around with it. Could you do it on a fallback basis? Either within magiskboot when the cat/dd-style stream dump fails use a new added nanddump-style stream dump, or even by script if magiskboot unpack exits with a return code to indicate the dump failed try with nanddump (from busybox).
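
The script-level fallback described here could look roughly like the sketch below. It is only an illustration under stated assumptions: unpack_with_fallback and boot-nand.img are hypothetical names, MAGISKBOOT and NANDDUMP are assumed to hold the tool paths, and the check for ramdisk.cpio is based on the file name magiskboot reports in the patch log earlier in this thread. Because the thread does not establish which exit code magiskboot returns when the dump silently fails, the sketch tests for the unpacked file instead of relying on the exit code.

```shell
#!/bin/sh
# Sketch only. MAGISKBOOT and NANDDUMP are assumed to hold the paths of the
# two tools; unpack_with_fallback and boot-nand.img are hypothetical names.
: "${MAGISKBOOT:=/data/adb/magisk/magiskboot}"
: "${NANDDUMP:=nanddump}"

unpack_with_fallback() {
  target=$1   # e.g. /dev/bootimg
  workdir=$2  # scratch directory that receives the unpacked files
  (
    cd "$workdir" || exit 1
    "$MAGISKBOOT" unpack "$target"
    # In this thread magiskboot printed no error but wrote nothing, so test
    # for an unpacked ramdisk instead of trusting the exit code alone.
    [ -f ramdisk.cpio ] && exit 0
    # Fallback: dump with nanddump (same invocation as requested earlier in
    # the thread), then unpack the resulting regular file.
    "$NANDDUMP" -f boot-nand.img "$target" || exit 1
    "$MAGISKBOOT" unpack boot-nand.img
    [ -f ramdisk.cpio ]
  )
}
```

A call such as unpack_with_fallback /dev/bootimg /tmp/unpacked_boot would then only reach nanddump when the direct unpack leaves no ramdisk behind. The subshell keeps the caller's working directory unchanged.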

Msprg commented 5 years ago

@osm0sis Just a reminder: running cat or dd, waiting until it finishes, and then checking whether it has failed is not exactly a good idea, since they do not finish until they have either succeeded or filled up all available space. While it would work, when they fail the Magisk installer would run for a very long time... Checking the exit code of magiskboot unpack would be a much better way, I think. Regardless of which option you choose, there is still a possibility that some of the executed commands will fail the same way cat and dd did in my case, running until a "no free space left" event forced them to stop. The best way, in my opinion, would be to write a custom program (or script?) for this purpose with these eventualities in mind, so that it fails instantly if it detects that it is picking up some crap or is stuck in a loop. But that is not very realistic... is it?

Anyway, please note that I am not really a programmer, so I may be talking bullshit here...

osm0sis commented 5 years ago

I'm not talking about actual cat or dd. magiskboot does it itself and doesn't hang forever when it fails.

Msprg commented 5 years ago

Oh, I thought magiskboot was just a script that uses some other applications from busybox. In that case, use magiskboot and, if that fails, try nanddump. (?)

osm0sis commented 5 years ago

@Msprg Nope, magiskboot is a binary, and yes, that's one of the options I already pitched to topjohnwu above.

osm0sis commented 5 years ago

@Msprg can you give me the output of ls -l /dev/bootimg in recovery?

Msprg commented 5 years ago

Sure I can:

~ # ls -l /dev/bootimg
__bionic_open_tzdata: couldn't find any tzdata when looking for localtime!
__bionic_open_tzdata: couldn't find any tzdata when looking for posixrules!
crw------- root     root     238,   9 2019-06-07 21:41 bootimg
~ # 

But something strange is happening there: when I issue the same command from adb shell on Windows, I get one more error:

~ # ls -l /dev/bootimg
__bionic_open_tzdata: couldn't find any tzdata when looking for localtime!
__bionic_open_tzdata: couldn't find any tzdata when looking for GMT!
__bionic_open_tzdata: couldn't find any tzdata when looking for posixrules!
crw------- root     root     238,   9 2019-06-07 21:41 bootimg
~ # 

It may be nothing, but just in case.

osm0sis commented 5 years ago

Awesome, so it's another character device issue, just like #1562, and @topjohnwu that's how you'll know when to access it with nanddump/write-style code. :+1:
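
The leading c in the ls -l output above is what marks /dev/bootimg as a character device. A script could make the same distinction with the standard test operators, as in this minimal sketch. The helper name target_kind is hypothetical, and how Magisk would act on the result is not specified in this thread.

```shell
#!/bin/sh
# Sketch: classify a boot image target by file type.
# target_kind is a hypothetical helper name.
target_kind() {
  if [ -c "$1" ]; then
    echo char      # character device, like /dev/bootimg in this issue
  elif [ -b "$1" ]; then
    echo block     # ordinary block device
  elif [ -f "$1" ]; then
    echo file      # plain image file such as boot.img
  else
    echo unknown
  fi
}
```

An installer could call target_kind /dev/bootimg and switch to nanddump/nandwrite-style access only when the answer is char.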

nonnymoose commented 5 years ago

Wow, it's great to see that you're working on full MTD device support! (I figured I would just always manually patch and flash the boot image.) If you need any testing/debugging, please let me know. I'd be glad to help.

osm0sis commented 5 years ago

nanddump, flash_erase and nandwrite will likely need to be adapted from mtd-utils: http://www.linux-mtd.infradead.org/

nonnymoose commented 5 years ago

I used nandwrite from busybox to flash a manually patched boot image and it bricked my device. I'm going to investigate later today.

Edit: never mind, looks like user error :man_shrugging:

osm0sis commented 5 years ago

Honestly I had a similar issue but with busybox nanddump not being trustworthy in the past. That's why proper mtd-utils source is the way to go.

Edit: Oh wait, it was busybox nandwrite causing trouble.. but only if I didn't flash_erase? Waayyy back: https://forum.xda-developers.com/showpost.php?p=47363488&postcount=45

GMMan commented 5 years ago

Are we talking about the char device backed by the dumchar driver that exposes the boot partition? My workaround has been to dd the partition out to a file, supply that as the boot image to the patcher, then write the patched boot image back to the dumchar device.

https://github.com/GMMan/Magisk/commit/4509ef74fe05dc21498e42c40812a276ac680d19

BTW, dumchar is a bit special in that it doesn't care about partition boundaries. It'll just keep reading past the end of the partition. I assume it stops when it reaches the end of storage, but in any case, that's why I pulled out the partition size from /proc/dumchar_info.
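
The size-bounded read described here can be sketched as follows. This is not the code from the linked commit. It assumes the /proc/dumchar_info column layout (Part_Name, Size, StartAddr, Type, MapTo), and the helper names dumchar_part_size and dump_bounded are hypothetical. It also assumes the device returns full 512-byte blocks on each read, which has not been verified for dumchar.

```shell
#!/bin/sh
# Sketch, assuming the /proc/dumchar_info layout:
#   Part_Name  Size  StartAddr  Type  MapTo
# dumchar_part_size and dump_bounded are hypothetical helper names.

# Print the size in bytes of partition $1, read from info file $2
# (defaults to /proc/dumchar_info).
dumchar_part_size() {
  hex=$(awk -v p="$1" '$1 == p { print $2; exit }' "${2:-/proc/dumchar_info}")
  [ -n "$hex" ] || return 1
  printf '%d\n' "$hex"
}

# Copy exactly $3 bytes from $1 to $2 so the read stops at the partition end
# instead of running on past it.
dump_bounded() {
  bs=512
  [ $(( $3 % bs )) -eq 0 ] || return 1
  dd if="$1" of="$2" bs="$bs" count=$(( $3 / bs )) 2>/dev/null
}
```

Typical use would be size=$(dumchar_part_size bootimg) followed by dump_bounded /dev/bootimg /sdcard/boot.img "$size", which avoids the runaway cat and dd dumps seen earlier in the thread.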

GMMan commented 5 years ago

@Msprg Is your device running on a MediaTek chip? If it is, then the reason you can't request superuser is because of MediaTek's special init. The failure is caused by missing BOOTCLASSPATH env var, where MediaTek init loads it after the Magisk component responsible for starting the prompt has launched. My current workaround is to bring it back to the top of init.rc where it usually is. Currently looking into a way to automate this.
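
Before moving anything, it helps to see where BOOTCLASSPATH appears in the init.rc extracted from the ramdisk. The sketch below only lists the matching lines with their line numbers. The helper name find_bootclasspath is hypothetical, and the right target position still depends on the device's own init.rc.

```shell
#!/bin/sh
# Sketch: list the lines of an init.rc that mention BOOTCLASSPATH.
# find_bootclasspath is a hypothetical helper; pass it the path of an
# init.rc extracted from the ramdisk.
find_bootclasspath() {
  grep -n 'BOOTCLASSPATH' "$1"
}
```

Running find_bootclasspath init.rc prints each match as line-number:text, which makes it easy to compare a MediaTek init.rc with one from another device.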

nonnymoose commented 5 years ago

@GMMan Woah, I had no idea you could read directly from a device using the dumchar driver. I'm not sure I would recommend writing to it directly as in https://github.com/GMMan/Magisk/blob/4509ef74fe05dc21498e42c40812a276ac680d19/scripts/util_functions.sh#L288 because I don't know if that would result in the dumchar driver erasing an eraseblock multiple times though. (We really don't want to erase any blocks more than necessary, because if even one goes bad in the region that's used by the bootloader, your device is hard-bricked.)

@osm0sis I also verified that you could still use nandwrite/nanddump on /dev/bootimg just in case, and of course, you can. I recommend preferring this for at least writing since I couldn't find any documentation on whether the dumchar driver will erase an eraseblock immediately after a write. (If it did, and Magisk used a smaller block size when writing than the eraseblock size, then it would erase each eraseblock more than once.) It's highly likely that it does, though, because there's no way the driver could know if more data was going to be written to the device, so it would be most logical not to wait for data that might not even exist and to just write any data immediately.
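
To put rough numbers on that concern, here is the arithmetic as a small sketch. The 128 KiB eraseblock and 4 KiB write size are assumed example values only, the 6 MiB partition size is the figure Msprg reported from TWRP, and the erase-on-every-write behaviour is the undocumented worst case being speculated about, not a confirmed property of dumchar.

```shell
#!/bin/sh
# Worked arithmetic only. The 128 KiB eraseblock and 4 KiB write size are
# assumed example values, not measurements from any device in this thread.

# ceil_div A B: print A divided by B, rounded up (hypothetical helper).
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

eraseblock=$(( 128 * 1024 ))
partition=$(( 6 * 1024 * 1024 ))   # 6 MiB boot partition, as reported by TWRP
write_size=4096

# Eraseblocks the partition spans: each needs exactly one erase when the
# whole region is erased once and then written.
ceil_div "$partition" "$eraseblock"     # prints 48

# Hypothetical worst case: erases per eraseblock if every 4 KiB write
# triggered an erase of the containing eraseblock.
ceil_div "$eraseblock" "$write_size"    # prints 32
```

Under these assumptions an erase-once-then-write tool costs 48 erases in total, while the worst case would cost 32 erases for each of those 48 blocks.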

GMMan commented 5 years ago

@nonnymoose Good point. I didn't pay too much attention to that because my device uses eMMC, so it should be taking care of wear leveling. The source code for the dumchar device is here. Looks like they might be operating directly on the MTD as a proxy, so using a proper tool for writing is well advised.

BTW, seems like for eMMC dumchar implements some sort of MTD interface, but using a writer designed for MTD seems extraneous if it's going to erase blocks before writing to it. You can find the device type and base device path by reading /proc/dumchar_info. Here's an example:

Part_Name       Size    StartAddr       Type    MapTo
preloader    0x0000000000880000   0x0000000000000000   2   /dev/misc-sd
mbr          0x0000000000080000   0x0000000000000000   2   /dev/block/mmcblk0
ebr1         0x0000000000080000   0x0000000000080000   2   /dev/block/mmcblk0p1
pro_info     0x0000000000300000   0x0000000000100000   2   /dev/block/mmcblk0
nvram        0x0000000000500000   0x0000000000400000   2   /dev/block/mmcblk0
protect_f    0x0000000000a00000   0x0000000000900000   2   /dev/block/mmcblk0p2
protect_s    0x0000000000a00000   0x0000000001300000   2   /dev/block/mmcblk0p3
seccfg       0x0000000000020000   0x0000000001d00000   2   /dev/block/mmcblk0
uboot        0x0000000000060000   0x0000000001d20000   2   /dev/block/mmcblk0
bootimg      0x0000000000600000   0x0000000001d80000   2   /dev/block/mmcblk0
recovery     0x0000000000600000   0x0000000002380000   2   /dev/block/mmcblk0
sec_ro       0x0000000000040000   0x0000000002980000   2   /dev/block/mmcblk0
misc         0x0000000000080000   0x00000000029c0000   2   /dev/block/mmcblk0
logo         0x0000000000300000   0x0000000002a40000   2   /dev/block/mmcblk0
expdb        0x0000000000a00000   0x0000000002d40000   2   /dev/block/mmcblk0
android      0x000000002bc00000   0x0000000003740000   2   /dev/block/mmcblk0p4
cache        0x0000000017800000   0x000000002f340000   2   /dev/block/mmcblk0p5
usrdata      0x000000004fa00000   0x0000000046b40000   2   /dev/block/mmcblk0p6
fat          0x000000004f340000   0x0000000096540000   2   /dev/block/mmcblk0p7
bmtpool      0x0000000001500000   0x00000000ff7700a8   2   /dev/block/mmcblk0
Part_Name:Partition name you should open;
Size:size of partition
StartAddr:Start Address of partition;
Type:Type of partition(MTD=1,EMMC=2)
MapTo:actual device you operate
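
Reading the storage type and backing device out of a listing like the one above could be scripted as in this sketch. It assumes the column order shown in the example (Type in column 4, MapTo in column 5), and the helper names dumchar_storage and dumchar_mapto are hypothetical.

```shell
#!/bin/sh
# Sketch, assuming the column order of the listing above:
#   $1 Part_Name  $2 Size  $3 StartAddr  $4 Type  $5 MapTo
# dumchar_storage and dumchar_mapto are hypothetical helper names.
# Both take a partition name and an optional info file path.

# Print MTD or EMMC for partition $1 (Type: MTD=1, EMMC=2).
dumchar_storage() {
  awk -v p="$1" '$1 == p {
    if ($4 == 1) print "MTD"
    else if ($4 == 2) print "EMMC"
    else print "unknown"
    exit
  }' "${2:-/proc/dumchar_info}"
}

# Print the MapTo device for partition $1.
dumchar_mapto() {
  awk -v p="$1" '$1 == p { print $5; exit }' "${2:-/proc/dumchar_info}"
}
```

With the example listing, dumchar_storage bootimg reports EMMC and dumchar_mapto android reports /dev/block/mmcblk0p4. A script could use the first answer to decide whether MTD-style erase and write tools are needed at all.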

Msprg commented 5 years ago

Okay guys, I just want to ask about the progress on this issue, and whether you need me to test anything...

osm0sis commented 5 years ago

Looks like I had overlooked that there is a flash_erase utility in busybox, it just isn't enabled by default with the other nand* utils.

I'm going to look into supporting this with script once we get all 3 utils built into Magisk's busybox. :+1:

Jenmet-cy commented 4 years ago

@Msprg Is your device running on a MediaTek chip? If it is, then the reason you can't request superuser is because of MediaTek's special init. The failure is caused by missing BOOTCLASSPATH env var, where MediaTek init loads it after the Magisk component responsible for starting the prompt has launched. My current workaround is to bring it back to the top of init.rc where it usually is. Currently looking into a way to automate this.

Hi. I'm sorry, @GMMan, but could you elaborate on this? I also own a MediaTek device, and I'm facing the same issue as Msprg after manually flashing the modified boot.img. I've compared my device's init.rc and a Qualcomm device, and I noticed that MediaTek's init has three entries (for user, userdebug and eng builds, respectively) instead of one as the Qualcomm's file, but I don't know how I should proceed. I'll attach it if you happen to find it useful. init.zip

GMMan commented 4 years ago

@Jenmet-cy Copy line 30 of your init script to the block of exports around line 50.

Jenmet-cy commented 4 years ago

@Jenmet-cy Copy line 30 of your init script to the block of exports around line 50.

Thanks for your help. Unfortunately, copying line 30 around line 50 causes the device to not boot. I also tried deleting the rest of the lines and formatting it like the init.rc of my other device, but that didn't work either.

GMMan commented 4 years ago

Around means somewhere in that block of exports, not at that line.

Jenmet-cy commented 4 years ago

Sorry for the late reply. Yes, I get what you mean. Here's how I modified the init.rc file. The first is done the way you told me (copying the BOOTCLASSPATH line to around the "Setup global environment" variables). The second one deletes the BOOTCLASSPATH lines at the start of the file and leaves only the one under "Setup global environment". Unfortunately, neither of them works. init-examples.zip