Lanchon / REPIT

A Device-Only Data-Sparing Repartitioning Tool For Android

Illegal instruction [ERROR 132] #23

Closed WizARTs closed 8 years ago

WizARTs commented 8 years ago

GT-I9100 (Dorimanxx kernel) + TWRP 3.0 by Arnab lanchon-repit-20160328-system=same-data=3.0-sdcard=max-preload=min+wipe-i9100.zip While trying to install, have got an error: Illegal instruction [ERROR 132]. What does it mean?

recovery_log.txt lanchon-repit_log.txt

Lanchon commented 8 years ago

hi, thanks!

what kernel? arnab's TWRP? i need a logfile.

Jeroen0494 commented 8 years ago

Hi, I have the same problem with my GT-I9100, TWRP 2.8.7.0 with this file, REPIT with this file. Note to WizARTs: TWRP3 is NOT compatible with Cyanogenmod 12.1.

I am trying to upgrade from Cyanogenmod 12.1 to 13. I followed the guide here.

This is the error:

ILLEGAL INSTRUCTION

[ERROR 132]

E: Error executing updater binary in zip /sdcard/lanchon[...].zip
Error flashing zip '/sdcard/...'
Updating partition details...
...Done

Steps to reproduce:

  1. Reboot fully working Cyanogenmod 12.1 with Gapps in recovery mode.
  2. Flash the TWRP zip.
  3. Reboot recovery (or system; both work fine, thank you TWRP for Nandroid!).
  4. Flash REPIT zip.
  5. Receive this error.

TWRP recovery log: recovery.log.zip

REPIT log (not so useful): lanchon-repit.log.zip

WizARTs commented 8 years ago

Updated my first comment

Lanchon commented 8 years ago

thanks!

twrp3 is compatible with dorimanx.

twrp3 is NOT compatible with CM 12.1-based kernels!!!!

it looks like the bundled binaries are not compatible with dorimanx, don't know why.

temporarily flash my cm 12.1 (no twrp3) or 13 kernel, repit, then reflash your old kernel.

please report back afterwards. thanks!

WizARTs commented 8 years ago

Hi! Have tried to install with your kernel CM13.

FATAL: partition #12: not found
[ERROR 1]

Partition #12 is the "preload" partition, which I already rePITed to minimal size a long time ago, using another PIT file. Now I just want to increase my "data" partition. I've also tried leaving the preload string out of the config filename, but it didn't help. What can I do now?

recovery_log.txt

Lanchon commented 8 years ago

your phone is missing partition 12 (yes, preload). REPIT does not support your phone: it detects the issue and correctly refuses to run.

to use REPIT, first restore the partition somehow.

alternatively, you could try removing the check from REPIT. go here https://github.com/Lanchon/REPIT/blob/master/repit.sh#L379 and edit this line:

for n in $(seq 1 $partitionCount); do

to:

for n in $(seq 1 11); do

then use -preload=min+wipe

the script is called 'script' inside the flashable zip file.

this is COMPLETELY UNSUPPORTED.
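
The one-line edit described above could be scripted with sed. This is only a sketch: patch_partition_loop is a hypothetical helper, not part of REPIT, and it assumes the loop line in the extracted 'script' file still reads exactly as quoted above. It writes the patched text to stdout and leaves the input file untouched, so the result can be inspected before repacking the zip.

```shell
# Hypothetical helper: print FILE with the partition loop bound changed
# from $partitionCount to 11. The input file itself is not modified.
patch_partition_loop() {
    # \$ matches a literal dollar sign; parentheses are literal in basic regex.
    sed 's/for n in \$(seq 1 \$partitionCount); do/for n in $(seq 1 11); do/' "$1"
}

# Usage (assumes 'script' was extracted from the flashable zip):
#   patch_partition_loop script > script.patched
```

If the line does not match, sed prints the file unchanged, so comparing the output against the original is a quick way to confirm the edit took effect.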

Lanchon commented 8 years ago

please close this issue if no further progress will be done.

0xAF commented 8 years ago

Hi, trying to REPIT my i9100 and got the Illegal Instruction. Running CM 12.1 with:

kernel: kernel-Lanchon-TRIM-IsoRec-20160206-cm-12.1-i9100.zip
TWRP: recovery-Lanchon-IsoRec-TWRP-2.8.7.0-20160113-i9100-(by-arnab).zip
REPIT file: lanchon-repit-20160328-system=1.0-data=4.0-sdcard=max-preload=min+wipe-i9100.zip

On screen log says: E: Error executing updater binary in zip '.....(repit filename).....'

the repit log from /sdcard:

/sdcard # cat lanchon-repit.log 
Illegal instruction

[ERROR 132]

recovery.log: http://sprunge.us/hSVQ

NOTE: the REPIT from 20160317 is working.

Lanchon commented 8 years ago

well, the older repit needs exactly that recovery you mention.

the new repit bundles a recovery environment and a few extra tools. it uses it to run the script.

the environment is borrowed from TWRP 2. TWRP -unlike all other efforts that i know of- is a dynamically linked recovery. it comes with a binary -that i'm no expert but i assume must be statically linked- called linker/linker64 whose job is to load the *.so shared object files referenced by the dynamically linked binaries and link the whole stuff whenever you load a program.

the linker is usually never invoked directly, but you certainly can. just load TWRP, adb shell, and type 'linker'.

for an unknown reason, the linker that comes with TWRP crashes when run in some other recovery environments, and that's the error you see. it's almost as if it had ties to certain kernel features that change, but for all i know, there is a very strong push to make linux userland binary compatible with all linux kernel versions, so this doesn't make much sense (unless the linker is considered an exception). so to say the least this is unexpected, and i might need to scrap all the work i did recently on flashize once i find out what the hell is going on, and that sucks.
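
For reference, the number in [ERROR 132] looks like a shell exit status. By shell convention, a status above 128 means the process was killed by signal (status - 128), and signal 4 on Linux is SIGILL, which matches the "Illegal instruction" message. That REPIT prints the exit status of the failed command is an assumption here; the thread does not confirm it. A minimal sketch, with decode_status as a hypothetical helper:

```shell
# Hypothetical helper: turn an exit status into a readable description.
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l maps an exit status above 128 to the signal name.
        echo "killed by signal $((status - 128)) (SIG$(kill -l "$status"))"
    else
        echo "exited normally with status $status"
    fi
}

decode_status 132   # prints: killed by signal 4 (SIGILL)
```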

the workaround is of course using a different kernel. i'm using my latest CM 12.1 trim kernel, so i know that works.

the real solution might be:
- scrap flashize-env completely.
- require any TWRP to run (after all my struggle with flashize-env to universalize REPIT, CM recovery turned out useless due to other factors anyway).
- make repit compatible with TWRP 3 without losing compatibility with TWRP 2 (requires sensing the environment and switching strategies for certain functions).
- don't bundle an environment, just bundle certain statically-linked disk tools: basically, everything that is not busybox.

i'm just being terribly lazy about this for now, flashize-env took a lot of work.

0xAF commented 8 years ago

From your description, the linker/linker64 seems to be the ld.so helper (which is in normal linux), but this one is for the embedded android environment. So it's mandatory to be statically linked for obvious reason. If your dynamically linked binaries have missing .so files, or .so files compiled for another architecture or another instruction set, hence giving 'Illegal instruction', the obvious solution would be to use statically linked binaries of your tools.

But I'm not sure the problem is in the dynamic linked binaries or their .so libraries.

I haven't run your binaries to see where they crash with "Illegal instruction", or at least to see which binary is the crashing one.

I already upgraded my SGS2 to CM13 and I'm not sure if I can reproduce the problem anymore. I should try later or tomorrow.

If I call the script with this filename:

lanchon-repit-20160328-system=same-data=same-sdcard=same-preload=same-i9100.zip

would it be sufficient for NOOP? I do not want to resize any partitions anymore.

Lanchon commented 8 years ago

would it be sufficient for NOOP?

yes, that would be NOP; same as this: lanchon-repit-20160328-i9100.zip which only checks, trims, and grows filesystems to cover their entire partitions if necessary.

From your description, the linker/linker64 seems to be the ld.so helper (which is in normal linux), but this one is for the embedded android environment. So it's mandatory to be statically linked for obvious reason.

yes, it's equivalent to ld-linux.so.* but seems to be a smaller reimplementation. well actually no, it could be dynamically linked, albeit partially. that binary could be just statically-linked enough to do some basic dynamic linking, then at runtime link some *.so libs to itself (yes, there's a standard api for that in linux) to add more advanced functionality in a sort of self-bootstrapping process.

no idea how ld.so works, i'm just pointing out the possibility.

If your dynamically linked binaries have missing .so files, or .so files compiled for another architecture or another instruction set, hence giving 'Illegal instruction', the obvious solution would be to use statically linked binaries of your tools.

they don't. it's not the .so files, it's the linker: just execute it stand-alone and it borks out.

also, the same linker executable on the same hardware works fine under some distro of TWRP, but fails on some other. i'm talking about the same hardware and the same TWRP version here, just different distros. it's as if some build flags cause the issue, and it certainly looks like kernel flags to me.

really, for now i have no idea of what is happening. i should dump different versions of linkers and then could probably find out. but i just didn't have the time.

0xAF commented 8 years ago

After digging a bit into the zip file:

The /flashize/env/sbin/linker file is not statically linked, it is dynamically linked. So it's the ld.so equivalent, but somehow it runs. I wonder what kind of magic is that, since the dynamic helper for /sbin/linker is the /sbin/linker itself.

Given the above lines (about the linker magic, which is probably a kernel magic somehow) and after your last comment, it sounds like the linker is somehow compiled from or with the kernel and it depends on the kernel compilation somehow...

Can you use the already provided /sbin/linker instead. So basically in the env-setup/init script you can change this check:

for linker in linker linker64; do
    if [ -f "env/sbin/$linker" ] && [ ! -f "/sbin/$linker" ]; then
        ln -sf "$base/env/sbin/$linker" /sbin/
    fi
done

to actually check if the system already has a linker somewhere... so use the "working" one instead of using yours from the zip, which could be crashing.

EDIT: I just read your edited comment, which explains the dynamic linker binary. The ld.so (ld-linux.so) in normal Linux is usually statically linked, so far I do not know of dynamically linked one. Though as it seems, it should be possible as per your comment.

EDIT2: Just a reference about the linker, here is more information: https://chromium.googlesource.com/android_tools/+/master/ndk/sources/android/crazy_linker/DESIGN.TXT

Lanchon commented 8 years ago

The /flashize/env/sbin/linker file is not statically linked, it is dynamically linked. So it's the ld.so equivalent, but somehow it runs.

wow, lol!! i imagined this was possible, but never expected it. running ldd /lib/ld-linux.so.2 on my linux box says 'statically linked' which of course is no surprise, as the dyn linking -if any- must happen later than usual for this binary. (or else, infinite loop into hell... :) ) how did you find out that linker is not static?

to actually check if the system already has a linker somewhere...

well, that line actually checks! if /sbin/linker already exists, it will use that.

so when run under twrp, it's not linker that throws the illegal instruction (native linker is used), it's busybox or something else. basically any binary will throw under some circumstances, i'm trying to know why.

0xAF commented 8 years ago

how did you find out that linker is not static?

Since I cannot use the ldd for ARM binary obviously, I've used 'file' tool:

╾[*]─[af@core]─[/tmp]─[$]╼ file linker 
linker: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /sbin/linker, not stripped

BUT, since you mentioned it, I checked what it gives for my ld.so

╾[*]─[af@core]─[/tmp]─[$]╼ file /lib/ld-2.21.so
/lib/ld-2.21.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped
╾[*]─[af@core]─[/tmp]─[$]╼ ldd /lib/ld-2.21.so
        statically linked

So as it seems it's not a reliable way to check if it's statically or dynamically linked. Though it has information for the helper to load the .so files. But this information could be available in the static binaries too, dunno...

EDIT: output from the readelf tool:

╾[x]─[af@core]─[/tmp]─[$]╼ readelf -h linker 
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0xa18
  Start of program headers:          52 (bytes into file)
  Start of section headers:          54040 (bytes into file)
  Flags:                             0x5000000, Version5 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         9
  Size of section headers:           40 (bytes)
  Number of section headers:         27
  Section header string table index: 24

╾[*]─[af@core]─[/tmp]─[$]╼ readelf -h /lib/ld-linux-x86-64.so.2 
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0xc80
  Start of program headers:          64 (bytes into file)
  Start of section headers:          147536 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         20
  Section header string table index: 19

It seems readelf believes that both linkers are dynamic, so it's not reliable to count on this flag only. The reliable way should be ldd.
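
A note on the headers above: Type: DYN only says the file is a position-independent object; it does not say whether the file asks for an interpreter or depends on shared libraries. Those show up elsewhere, as an INTERP program header and as NEEDED entries in the dynamic section. A minimal sketch, assuming the GNU readelf output layout, that classifies a file from `readelf -l -d` output; classify_elf is a hypothetical helper, not a standard tool:

```shell
# Hypothetical helper: reads `readelf -l -d FILE` output on stdin and reports
# whether the file requests an interpreter and/or needs shared libraries.
classify_elf() {
    awk '
        /^ *INTERP /  { interp = 1 }   # program header naming an interpreter
        /\(NEEDED\)/  { needed = 1 }   # dynamic entry naming a required library
        END {
            if (needed)      print "dynamic: needs shared libraries"
            else if (interp) print "requests an interpreter but needs no libraries"
            else             print "self-contained: no interpreter, no needed libraries"
        }'
}

# Usage (requires readelf from binutils):
#   readelf -l -d linker | classify_elf
#   readelf -l -d /lib/ld-linux-x86-64.so.2 | classify_elf
```

This does not replace ldd, but unlike ldd it works on binaries built for a different architecture, which was the obstacle with the ARM linker here.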

Lanchon commented 8 years ago

yep, those tools probably cannot handle the special case of linker correctly.

interesting for sure, but a little OT now: it must be an arch thing. grabbing the environment from the TWRP that fails to run this would probably fix for all TWRPs, but i can't consider this fixed without an honest fix. i probably need to rewrite all of flashize.

0xAF commented 8 years ago

Yeah, sorry for that. On second thought, I just realized what you meant about the linker using dynamic libraries. Actually it should be statically linked by the compiler's linker. Later, at runtime, it can call dlopen to load runtime libs, but still it should be a static binary.

Let me know if I can be of any further help.

Lanchon commented 8 years ago

new release should fix this issue!!! https://www.androidfilehost.com/?w=files&flid=53660

WizARTs commented 8 years ago

Thank you! Have already repartitioned my device, using previous release: 1) restoring stock PIT; 2) applying your tool without any errors!

Jeroen0494 commented 8 years ago

I finally got around to 'repitting' my device, the new release works flawlessly. Thanks!