skylot / jadx

Dex to Java decompiler
Apache License 2.0
40.86k stars 4.8k forks source link

Break From Labeled Block, Inconsitent Code error on decompile #22

Open riboribo opened 9 years ago

riboribo commented 9 years ago

A break from within a labeled block decompiles incorrectly and gives Missing Block and Inconsistent Code errors. I think the problem is near jadx/core/dex/visitors/regions/IfMakerHelper.java restructureIf() where it decides 'badThen' and 'badElse' for the if statement with the break in it. I don't understand what allPathsFromIf() is doing or what setOutBlock() does/means, but I think it is deciding that the 'then' and 'else' blocks are nops and ignore the fact that the IF instruction has the effect of breaking out of an enclosing block. (Not beautiful test case code, but I think it is the sort of thing I'm encountering in some real subject code in Android system code.

Test case:

package ribo;

import android.app.Activity;
public class LabelBrk extends Activity {

public static int a=1, b=2, c=8, d=9;
public void dBreakFromBlock() {
    //prt("before");
    label: {
        //prt("top of block");
        if (d == 10) {
            prt("outer if top");
            if (a >= b) break label;
        }
        prt("after cond break");
     }
     prt("afterBlock");
  }
  public static void prt(String str) { System.out.println(str); }
} // end of LabelBrk class

Console output during decompile:

09:16:03.494 [main] INFO  jadx.api.JadxDecompiler - loading ...  
09:16:03.644 [main] DEBUG jadx.api.JadxDecompiler - processing threads count: 8
09:16:03.644 [main] INFO  jadx.api.JadxDecompiler - processing ...
09:16:03.784 [pool-1-thread-5] DEBUG j.c.d.visitors.regions.IfMakerHelper - Stop processing blocks after 'if': B:3:0x000f, method: ribo.LabelBrk.dBreakFromBlock():void
09:16:03.784 [pool-1-thread-5] DEBUG j.c.d.visitors.regions.IfMakerHelper - Stop processing blocks after 'if': B:3:0x000f, method: ribo.LabelBrk.dBreakFromBlock():void
09:16:03.794 [pool-1-thread-5] DEBUG j.c.d.visitors.regions.CheckRegions -  Missing block: B:3:0x000f in ribo.LabelBrk.dBreakFromBlock():void
09:16:03.794 [pool-1-thread-5] ERROR jadx.core.utils.ErrorsCounter - Inconsistent code in method: ribo.LabelBrk.dBreakFromBlock():void 
09:16:03.804 [main] ERROR jadx.core.utils.ErrorsCounter - 1 errors occurred in following nodes:
09:16:03.804 [main] ERROR jadx.core.utils.ErrorsCounter -   Method: ribo.LabelBrk.dBreakFromBlock():void
09:16:03.804 [main] ERROR jadx.cli.JadxCLI - finished with errors

Instructions decoded, grouped by block:

b0     0 SGET r0 int  static d
b0     2 CONST r1 unknownType < a0 lit 10 

b1     4 IF  < a0 r0 unknownType, a1 r1 unknownType  NE 17 then: null else: null

b2     6 CONST_STR r0 String   str: "outer if top"
b2     8 INVOKE  < a0 r0 String  call prt
b2     B SGET r0 int  static a
b2     D SGET r1 int  static b

b3     F IF  < a0 r0 unknownType, a1 r1 unknownType  LT 17 then: null else: null

b4    11 CONST_STR r0 String   str: "afterBlock"
b4    13 INVOKE  < a0 r0 String  call prt

b5    16 RETURN   

b6    17 CONST_STR r0 String   str: "after cond break"
b6    19 INVOKE  < a0 r0 String  call prt
b6    1C GOTO   goto 11

The 'IF' in block 3, at address F above is the one that breaks from the labeled block. Block 3 gets elminated (I think in the RegionMaker pass) Block tree after RegionMaker visitor:

{ reg 5012645
BLK B:0:0x0000
{ reg 12
    BLK B:1:0x0004
    18   b1     4 IF  < a0 wrap 0x0000: SGET  (r0_0 int) =   ribo.LabelBrk.d intarg??arg??, a1 lit 10  EQ 6 then: B:2:0x0006 else: B:6:0x0017
    { reg 12
    BLK B:2:0x0006
    19   b2     8 INVOKE  < a0 wrap 0x0006: CONST_STR  (r0_1 java.lang.String) =  "outer if top"arg??arg??  call prt
    }
}
BLK B:6:0x0017
22   b6    19 INVOKE  < a0 wrap 0x0017: CONST_STR  (r0_4 java.lang.String) =  "after cond break"arg??arg??  call prt
BLK B:4:0x0011
24   b4    13 INVOKE  < a0 wrap 0x0011: CONST_STR  (r0_3 java.lang.String) =  "afterBlock"arg??arg??  call prt
BLK B:5:0x0016
25   b5    16 RETURN   
}
skylot commented 9 years ago

Thanks for your detailed description. Actually now jadx don't support labeled block (I just recently done 'break' support for loops). There are no quick fix, so implementation will take some time.

Also you don't really need to describe bytecode structure in issue (it looks like you spend a lot of time) only java code is enough. You can generate 'dot' graphs using --cfg or --raw-cfg jadx command line arguments: 2014-10-21_22 11 11_scrot

riboribo commented 9 years ago

Thanks for looking into it. I did spend a lot of time looking into the code to see what it was doing, I wrote some debug display routines to show me the various states of processing and what information the system had a various stages. The code looks nicely designed -- some other (java) decompilers are excessively complicated or fussy. However!!, comments describing what the code is trying to do would be very helpful. I did see the --cfg options, nice touch, but I didn't have a good .dot file displayer and I wanted to see what information was in the data structures at different points in the process.

I want to be able to decompile some code -- and recompile it again to modify some apps. So I need to decompile whatever is in app, so I need decompiling for whatever the code is. (I had looking into various tools that convert apps to smali code, but I wanted to deal with things as Java, esp for debugging in Eclipse) I don't know that these use breaks from labelled blocks, but that was some code I had that produced the same error in jadx.

I'll poke around some more to see if I can find a fix that will produce some reasonable code for that situation.

On Tue, Oct 21, 2014 at 2:36 PM, skylot notifications@github.com wrote:

Thanks for your detailed description. Actually now jadx don't support labeled block (I just recently done 'break' support for loops). There are no quick fix, so implementation will take some time.

Also you don't really need to describe bytecode structure in issue (it looks like you spend a lot of time) only java code is enough. You can generate 'dot' graphs using --cfg or --raw-cfg jadx command line arguments: [image: 2014-10-21_22 11 11_scrot] https://cloud.githubusercontent.com/assets/118523/4724173/624c7814-594e-11e4-9275-fd6b38b26fd2.png

Reply to this email directly or view it on GitHub https://github.com/skylot/jadx/issues/22#issuecomment-59975828.

Bob Flavin

riboribo commented 9 years ago

I think I found a solution to the problem breaks from other than the innermost loop/block (ie labeled breaks), Incorrect decompiled code is generated when one of these unexpected flow of control transfers occurs. The symptom of this is the flow graph is that some 'nodes' appear more than once in the graph. Because the break in the flow was not being recognized, blocks in the graph outside the outermost relevant region( '{' block) were appearing in the graph more than once. As is this flow graph (sorry, its not the nice 'dot' graph). Note that blocks B:5 and B:6 appear in regX after block B:4 -- and also in regT after B:7.

{ regT (b0(b2(b4b5b6))b7b5b6) Region,  5 blocks
  BLK B:0:0x0000 p: s1/1:b1,, attrs: A:{SPENT}
  { regU (b2(b4b5b6)) IfRegion,  2 blocks
    BLK B:1:0x000e p:b0, s2/2:b2,b7,, attrs: A:{SKIP, SPENT}
    { regV (b2(b4b5b6)) Region,  2 blocks
      BLK B:2:0x0010 p:b1, s1/1:b3,, attrs: A:{SPENT}
      { regW (b4b5b6) IfRegion,  2 blocks
        BLK B:3:0x0019 p:b2, s2/2:b4,b7,, attrs: A:{SKIP, SPENT}
        { regX (b4b5b6) Region,  3 blocks
          BLK B:4:0x001b p:b3, s1/1:b5,, attrs: A:{SPENT}
          BLK B:5:0x0020 p:b4,b7, s1/1:b6,, attrs: A:{SPENT}
          BLK B:6:0x0025 p:b5, s0/0:, attrs: A:{RETURN, SPENT}
        } regX
      } regW
    } regV
  } regU
  BLK B:7:0x0026 p:b1,b3, s1/1:b5,, attrs: A:{SPENT}
  --blk B:5:0x0020 already used, goodBreakFromBlock
  BLK B:5:0x0020 p:b4,b7, s1/1:b6,, attrs: A:{SPENT}
  --blk B:6:0x0025 already used, goodBreakFromBlock
  BLK B:6:0x0025 p:b5, s0/0:, attrs: A:{RETURN, SPENT}
} regT

I wrote a new 'visitor' pass called jadx.core.dex.visitors.regions.BreakFinder which detects the duplicates and replaces all but the first occurrence of the duplicated sets of blocks with a InsnType.BREAK with a LabelAttr -- and finds or creates an appropriate 'Region' immediately before the last instance of the set of duplicated blocks. In the example below, region was created with the label 'block5: that contained the blocks and regions immediately before the 'prt("after labelled block");' line and contains the region in which the 'break block5;' line was inserted.

public void goodBreakFromBlock() {
  block5: {
    prt("before label");
    prt("G after label, before second if");
    if (d == 10) {
      prt("G before second if");
      if (a >= b) {
        prt("non empty then, before break");
        break block5;
      }
    }
    prt("inside label block, after ifs");
  }
  prt("after labelled block");
} // end of goodBreakFromBlock

Note: The 'prt("before label"); statement was actually before the labeled '{' block in the original source code. There is no way to detect from the compiled code exactly where that labeled block originally started; however the decompiled code is functionally equivalent and correct

I need to do some more testing to verify it in a bunch of other cases. I may not have done this in the best way as I was just finding my way around the jadx code.

I also disabled some of the code in IfMakerHelper.restructureIf() if both 'badThen' and 'badElse' where true (to prevent it from issuing: LOG.debug("Stop processing blocks after 'if': {}, method: {}", info.getIfBlock(), mth); and returning null). I don't understand why restructureIf thought that was bad (perhaps because of one of these labeled breaks causing an unexpected flow. In one of my test cases valid compiled code caused this error.

Also, fixed something in SimplifyVisitor.convertInvoke handling StringBuilder .append() .toString(). The code expected the new StringBuilder() constructor to be called with no args. The compiled code I had uses new StringBuilder("first string frag") ... so the StringBuilder constructor is the second entry in 'chain' and the first entry is a wrapped instruction what loads the CONST_STR arg.

(Oops, didn't mean to be closing and opening this issue.)

skylot commented 9 years ago

Great! Your approach looks correct, waiting for pull request from you. Also please check the code in RegionMaker where I already insert 'break' and 'continue' for loops, maybe it is better to extract and fix this for work with your new visitor.

About restructureIf() method: In some cases I failed to found common algorithm for process code flow patterns and just process only known ones, once I collects enough patterns I can deduce the common algorithm. So on new patterns this method just stop processing and you can change it as you want.

Thank you again for your help.

riboribo commented 9 years ago

(I committed an update to SimpifiyVisitor.java just to test out how to update things in GIT. The version of code I've been testing on is a somewhat out of date (I copied it off a GIT repository clone) I'm now moving some of the updates to the current GIT repository)

riboribo commented 9 years ago

I'm trying to commit and push some changes to jadx for the labeled break issue -- also some other things along the way, like an enhancement to SimplifyVisitor to improve conversion of StringBuilder(xxx).append().append().toString() to string concatentation.

(I'm used to CVS and Eclipse and confused by Git) I have Egit in Eclipse and tried some commits and pushes and they seem to have no effect on the github master. I got a 'git-receive-pack not permitted' error when trying to push. Does this mean I'm not authorized and need to get them to you in some other way, or that I should make a branch that I 'own' that can be later merged?

Eclipse seems to say I have ^2 and v5 I think that means 2 commits/pushes pending and 5 pull/updates pending (with some conflicts)

I seem to be getting behind on the pulls (getting stuff from the master) after I manually merged my changes with the master.

On Tue, Nov 18, 2014 at 2:32 AM, skylot notifications@github.com wrote:

Great! Your approach looks correct, waiting for pull request from you. Also please check the code in RegionMaker where I already insert 'break' and 'continue' for loops, maybe it is better to extract and fix this for work with your new visitor.

About restructureIf() method: In some cases I failed to found common algorithm for process code flow patterns and just process only known ones, once I collects enough patterns I can deduce the common algorithm. So on new patterns this method just stop processing and you can change it as you want.

Thank you again for your help.

Reply to this email directly or view it on GitHub https://github.com/skylot/jadx/issues/22#issuecomment-63432310.

Bob Flavin

skylot commented 9 years ago

You can't directly push to jadx master repo, you need to fork repository and push your commits to your repository and after make pull request. There are many tutorials how to work with github, for example: https://help.github.com/articles/fork-a-repo/ and https://help.github.com/articles/using-pull-requests/

riboribo commented 9 years ago

Ah, I see Git has a somewhat different perspective from CVS. I'll fork. I read some tutorials, but with the wrong mental model of the repos, they are confusing.

I when integrating the labelledBreak/continue solution into RegionMaker, rather than as its own visitor (Which I called BreakFinder) I think that rather than fixing 'duplicate blocks', that they can be prevented in the first place (in RegionMaker.traverse, by remembering the 'parentRegion' of every block or region and removing the block/region from the 'old' region when an request is made to add it to another region.

bagipro commented 5 years ago

@skylot I get the following code right now:

    /* JADX WARNING: Code restructure failed: missing block: B:3:0x000f, code lost:
        if (f14a >= f15b) goto L_0x0017;
     */
    public void dBreakFromBlock() {
        if (f17d == 10) {
            prt("outer if top");
        }
        prt("after cond break");
        prt("afterBlock");
    }

APK: app-release.apk.zip