Open riboribo opened 9 years ago
Thanks for your detailed description. Actually now jadx don't support labeled block (I just recently done 'break' support for loops). There are no quick fix, so implementation will take some time.
Also you don't really need to describe bytecode structure in issue (it looks like you spend a lot of time) only java code is enough. You can generate 'dot' graphs using --cfg
or --raw-cfg
jadx command line arguments:
Thanks for looking into it. I did spend a lot of time looking into the code to see what it was doing, I wrote some debug display routines to show me the various states of processing and what information the system had a various stages. The code looks nicely designed -- some other (java) decompilers are excessively complicated or fussy. However!!, comments describing what the code is trying to do would be very helpful. I did see the --cfg options, nice touch, but I didn't have a good .dot file displayer and I wanted to see what information was in the data structures at different points in the process.
I want to be able to decompile some code -- and recompile it again to modify some apps. So I need to decompile whatever is in app, so I need decompiling for whatever the code is. (I had looking into various tools that convert apps to smali code, but I wanted to deal with things as Java, esp for debugging in Eclipse) I don't know that these use breaks from labelled blocks, but that was some code I had that produced the same error in jadx.
I'll poke around some more to see if I can find a fix that will produce some reasonable code for that situation.
On Tue, Oct 21, 2014 at 2:36 PM, skylot notifications@github.com wrote:
Thanks for your detailed description. Actually now jadx don't support labeled block (I just recently done 'break' support for loops). There are no quick fix, so implementation will take some time.
Also you don't really need to describe bytecode structure in issue (it looks like you spend a lot of time) only java code is enough. You can generate 'dot' graphs using --cfg or --raw-cfg jadx command line arguments: [image: 2014-10-21_22 11 11_scrot] https://cloud.githubusercontent.com/assets/118523/4724173/624c7814-594e-11e4-9275-fd6b38b26fd2.png
Reply to this email directly or view it on GitHub https://github.com/skylot/jadx/issues/22#issuecomment-59975828.
Bob Flavin
I think I found a solution to the problem breaks from other than the innermost loop/block (ie labeled breaks), Incorrect decompiled code is generated when one of these unexpected flow of control transfers occurs. The symptom of this is the flow graph is that some 'nodes' appear more than once in the graph. Because the break in the flow was not being recognized, blocks in the graph outside the outermost relevant region( '{' block) were appearing in the graph more than once. As is this flow graph (sorry, its not the nice 'dot' graph). Note that blocks B:5 and B:6 appear in regX after block B:4 -- and also in regT after B:7.
{ regT (b0(b2(b4b5b6))b7b5b6) Region, 5 blocks
BLK B:0:0x0000 p: s1/1:b1,, attrs: A:{SPENT}
{ regU (b2(b4b5b6)) IfRegion, 2 blocks
BLK B:1:0x000e p:b0, s2/2:b2,b7,, attrs: A:{SKIP, SPENT}
{ regV (b2(b4b5b6)) Region, 2 blocks
BLK B:2:0x0010 p:b1, s1/1:b3,, attrs: A:{SPENT}
{ regW (b4b5b6) IfRegion, 2 blocks
BLK B:3:0x0019 p:b2, s2/2:b4,b7,, attrs: A:{SKIP, SPENT}
{ regX (b4b5b6) Region, 3 blocks
BLK B:4:0x001b p:b3, s1/1:b5,, attrs: A:{SPENT}
BLK B:5:0x0020 p:b4,b7, s1/1:b6,, attrs: A:{SPENT}
BLK B:6:0x0025 p:b5, s0/0:, attrs: A:{RETURN, SPENT}
} regX
} regW
} regV
} regU
BLK B:7:0x0026 p:b1,b3, s1/1:b5,, attrs: A:{SPENT}
--blk B:5:0x0020 already used, goodBreakFromBlock
BLK B:5:0x0020 p:b4,b7, s1/1:b6,, attrs: A:{SPENT}
--blk B:6:0x0025 already used, goodBreakFromBlock
BLK B:6:0x0025 p:b5, s0/0:, attrs: A:{RETURN, SPENT}
} regT
I wrote a new 'visitor' pass called jadx.core.dex.visitors.regions.BreakFinder which detects the duplicates and replaces all but the first occurrence of the duplicated sets of blocks with a InsnType.BREAK with a LabelAttr -- and finds or creates an appropriate 'Region' immediately before the last instance of the set of duplicated blocks. In the example below, region was created with the label 'block5: that contained the blocks and regions immediately before the 'prt("after labelled block");' line and contains the region in which the 'break block5;' line was inserted.
public void goodBreakFromBlock() {
block5: {
prt("before label");
prt("G after label, before second if");
if (d == 10) {
prt("G before second if");
if (a >= b) {
prt("non empty then, before break");
break block5;
}
}
prt("inside label block, after ifs");
}
prt("after labelled block");
} // end of goodBreakFromBlock
Note: The 'prt("before label"); statement was actually before the labeled '{' block in the original source code. There is no way to detect from the compiled code exactly where that labeled block originally started; however the decompiled code is functionally equivalent and correct
I need to do some more testing to verify it in a bunch of other cases. I may not have done this in the best way as I was just finding my way around the jadx code.
I also disabled some of the code in IfMakerHelper.restructureIf() if both 'badThen' and 'badElse' where true (to prevent it from issuing: LOG.debug("Stop processing blocks after 'if': {}, method: {}", info.getIfBlock(), mth); and returning null). I don't understand why restructureIf thought that was bad (perhaps because of one of these labeled breaks causing an unexpected flow. In one of my test cases valid compiled code caused this error.
Also, fixed something in SimplifyVisitor.convertInvoke handling StringBuilder .append() .toString(). The code expected the new StringBuilder() constructor to be called with no args. The compiled code I had uses new StringBuilder("first string frag") ... so the StringBuilder constructor is the second entry in 'chain' and the first entry is a wrapped instruction what loads the CONST_STR arg.
(Oops, didn't mean to be closing and opening this issue.)
Great! Your approach looks correct, waiting for pull request from you. Also please check the code in RegionMaker where I already insert 'break' and 'continue' for loops, maybe it is better to extract and fix this for work with your new visitor.
About restructureIf() method: In some cases I failed to found common algorithm for process code flow patterns and just process only known ones, once I collects enough patterns I can deduce the common algorithm. So on new patterns this method just stop processing and you can change it as you want.
Thank you again for your help.
(I committed an update to SimpifiyVisitor.java just to test out how to update things in GIT. The version of code I've been testing on is a somewhat out of date (I copied it off a GIT repository clone) I'm now moving some of the updates to the current GIT repository)
I'm trying to commit and push some changes to jadx for the labeled break issue -- also some other things along the way, like an enhancement to SimplifyVisitor to improve conversion of StringBuilder(xxx).append().append().toString() to string concatentation.
(I'm used to CVS and Eclipse and confused by Git) I have Egit in Eclipse and tried some commits and pushes and they seem to have no effect on the github master. I got a 'git-receive-pack not permitted' error when trying to push. Does this mean I'm not authorized and need to get them to you in some other way, or that I should make a branch that I 'own' that can be later merged?
Eclipse seems to say I have ^2 and v5 I think that means 2 commits/pushes pending and 5 pull/updates pending (with some conflicts)
I seem to be getting behind on the pulls (getting stuff from the master) after I manually merged my changes with the master.
On Tue, Nov 18, 2014 at 2:32 AM, skylot notifications@github.com wrote:
Great! Your approach looks correct, waiting for pull request from you. Also please check the code in RegionMaker where I already insert 'break' and 'continue' for loops, maybe it is better to extract and fix this for work with your new visitor.
About restructureIf() method: In some cases I failed to found common algorithm for process code flow patterns and just process only known ones, once I collects enough patterns I can deduce the common algorithm. So on new patterns this method just stop processing and you can change it as you want.
Thank you again for your help.
Reply to this email directly or view it on GitHub https://github.com/skylot/jadx/issues/22#issuecomment-63432310.
Bob Flavin
You can't directly push to jadx master repo, you need to fork repository and push your commits to your repository and after make pull request. There are many tutorials how to work with github, for example: https://help.github.com/articles/fork-a-repo/ and https://help.github.com/articles/using-pull-requests/
Ah, I see Git has a somewhat different perspective from CVS. I'll fork. I read some tutorials, but with the wrong mental model of the repos, they are confusing.
I when integrating the labelledBreak/continue solution into RegionMaker, rather than as its own visitor (Which I called BreakFinder) I think that rather than fixing 'duplicate blocks', that they can be prevented in the first place (in RegionMaker.traverse, by remembering the 'parentRegion' of every block or region and removing the block/region from the 'old' region when an request is made to add it to another region.
@skylot I get the following code right now:
/* JADX WARNING: Code restructure failed: missing block: B:3:0x000f, code lost:
if (f14a >= f15b) goto L_0x0017;
*/
public void dBreakFromBlock() {
if (f17d == 10) {
prt("outer if top");
}
prt("after cond break");
prt("afterBlock");
}
APK: app-release.apk.zip
A break from within a labeled block decompiles incorrectly and gives Missing Block and Inconsistent Code errors. I think the problem is near jadx/core/dex/visitors/regions/IfMakerHelper.java restructureIf() where it decides 'badThen' and 'badElse' for the if statement with the break in it. I don't understand what allPathsFromIf() is doing or what setOutBlock() does/means, but I think it is deciding that the 'then' and 'else' blocks are nops and ignore the fact that the IF instruction has the effect of breaking out of an enclosing block. (Not beautiful test case code, but I think it is the sort of thing I'm encountering in some real subject code in Android system code.
Test case:
Console output during decompile:
Instructions decoded, grouped by block:
The 'IF' in block 3, at address F above is the one that breaks from the labeled block. Block 3 gets elminated (I think in the RegionMaker pass) Block tree after RegionMaker visitor: