soot-oss / soot

Soot - A Java optimization framework
GNU Lesser General Public License v2.1

java.lang.StackOverflowError for emitUnits #1638

Open HTQianqian opened 3 years ago

HTQianqian commented 3 years ago

Hi, when I try to generate Jimple files from .class files, a StackOverflowError is thrown:

    Exception in thread "Thread-0" Exception in thread "main" java.lang.StackOverflowError
        at soot.asm.AsmMethodSource.emitUnits(AsmMethodSource.java:1938)
        at soot.asm.AsmMethodSource.emitUnits(AsmMethodSource.java:1938)
        at soot.asm.AsmMethodSource.emitUnits(AsmMethodSource.java:1938)
        at soot.asm.AsmMethodSource.emitUnits(AsmMethodSource.java:1938)
        at soot.asm.AsmMethodSource.emitUnits(AsmMethodSource.java:1938)

The commands:

    Options.v().set_no_bodies_for_excluded(true);
    Options.v().set_allow_phantom_refs(true);
    Options.v().set_output_format(Options.output_format_jimple);

    Options.v().set_process_dir(Collections.singletonList("D:\\classes"));
    Options.v().set_src_prec(Options.src_prec_only_class);

    Options.v().set_keep_line_number(true);

    Scene.v().loadNecessaryClasses();

    PackManager.v().writeOutput();

Input file that caused the issue: test.zip
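The repeated emitUnits frame at the same source line suggests very deep recursion. As an illustration only (this is not Soot's code; `Node`, `emitRecursive`, `emitIterative`, and `deepChain` are hypothetical names), here is a minimal sketch of how a recursive walk over nested containers can be replaced by one that uses an explicit stack, so that the nesting depth is limited by heap space rather than by the thread stack:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

public class FlattenSketch {
    // Hypothetical node type: a leaf has a name and no children,
    // a container has children and no name.
    static final class Node {
        final String name;          // null for a container
        final List<Node> children;  // null for a leaf

        private Node(String name, List<Node> children) {
            this.name = name;
            this.children = children;
        }

        static Node leaf(String name) { return new Node(name, null); }
        static Node container(List<Node> children) { return new Node(null, children); }
    }

    // Recursive walk: uses one Java stack frame per nesting level.
    static void emitRecursive(Node n, List<String> out) {
        if (n.children == null) {
            out.add(n.name);
        } else {
            for (Node child : n.children) {
                emitRecursive(child, out);
            }
        }
    }

    // Iterative walk: pending nodes live in a heap-allocated deque.
    static List<String> emitIterative(Node root) {
        List<String> out = new ArrayList<>();
        Deque<Node> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.children == null) {
                out.add(n.name);
            } else {
                // Push in reverse so children are emitted in their original order.
                for (int i = n.children.size() - 1; i >= 0; i--) {
                    pending.push(n.children.get(i));
                }
            }
        }
        return out;
    }

    // Wraps a single leaf in `depth` nested containers.
    static Node deepChain(int depth) {
        Node n = Node.leaf("leaf");
        for (int i = 0; i < depth; i++) {
            n = Node.container(Collections.singletonList(n));
        }
        return n;
    }

    public static void main(String[] args) {
        // The iterative walk handles a nesting depth that may be too deep
        // for the recursive walk on a default-sized thread stack.
        System.out.println(emitIterative(deepChain(200_000)).size());
    }
}
```

Both walks produce the same order on small inputs; only the iterative one is independent of the thread stack size.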

HTQianqian commented 3 years ago

@StevenArzt I still have this problem.

https://github.com/soot-oss/soot/issues/1549

HTQianqian commented 3 years ago

@MarcMil, is there any documentation that would help me fix this bug? While debugging, I found that the StackOverflowError happens when edge.prevStacks is different every time. Can I delete the code that uses edge.prevStacks? It still works without it.

HTQianqian commented 3 years ago

The Jimple IR cannot be generated with the code below. I debugged it in AsmMethodSource.java: a very large number of edge.prevStacks objects are created, and the conversion never terminates.

    Options.v().set_no_bodies_for_excluded(true);
    Options.v().set_allow_phantom_refs(true);
    Options.v().set_output_format(Options.output_format_jimple);

    Options.v().set_process_dir(Collections.singletonList("test\\classes"));
    Options.v().set_src_prec(Options.src_prec_only_class);

    Options.v().set_keep_line_number(true);
    Options.v().set_keep_offset(false);

    Options.v().setPhaseOption("cg", "trim-clinit:false");
    Options.v().setPhaseOption("cg.spark", "on");
    Options.v().setPhaseOption("cg.spark", "on-fly-cg:true");
    Options.v().setPhaseOption("cg.spark", "string-constants:true");

    Options.v().setPhaseOption("jb", "use-original-names:true");
    Options.v().setPhaseOption("jb.ulp", "off");

    Options.v().setPhaseOption("jb.tr", "ignore-nullpointer-dereferences:true");

    Scene.v().loadNecessaryClasses();

    PackManager.v().writeOutput();

MarcMil commented 3 years ago

Oops. Sorry for the confusion. It's still quite early here and I thought I could fix this fast. I took a look at the original code in https://github.com/soot-oss/soot/blob/d43a7e6332f2ce4e51437472bced13691d970e73/src/main/java/soot/asm/AsmMethodSource.java#L1800 and thought that merely checking for existing prevStacks would be enough (as the adding is already performed a few lines earlier). However, it does not seem to be so simple.

These prevStacks can occupy quite a bit of memory. This is not beautiful, but it works, as long as you have enough memory. Someone may take a look at this in the future. In the meantime, try to start your JVM with the

-Xmx5g

option. Your JVM does not seem to have enough memory by default (the JVM determines the default maximum memory by looking at how much system memory your computer has). How much system memory do you have?
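To see which maximum heap size the JVM actually picked, a small check like the following can help. It is only a sketch: the class name `HeapCheck` and the helper `maxHeapMiB` are made up for illustration, while `Runtime.maxMemory()` is the standard API that reports the maximum amount of memory the JVM will attempt to use.

```java
public class HeapCheck {
    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Running it once as `java HeapCheck` and once as `java -Xmx5g HeapCheck` shows the default limit and whether the option took effect.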

MarcMil commented 3 years ago

So, for some reason, with your input files there are a lot of prevStacks generated. We definitely need to take a look at this at some point.

HTQianqian commented 3 years ago

Thanks very much. Why do we need to check prevStacks? Can we remove the check? It works well without it. Is there any case where this check is needed?

MarcMil commented 3 years ago

Sadly, I personally am not too familiar with this part of Soot. But as it seems, we could run into an infinite loop in some cases when we do not do this check. Basically, it records what the algorithm has already processed, to make sure that it has not seen exactly the same thing before. Otherwise, it could keep processing the same edge over and over again. So you can try to remove this check (or replace the add with a contains like I did initially), but it could lead to an infinite loop, which is not good.

There should be a better way. Your class files seem to be structured in a way we have not encountered before. We should really consider adding these to our test cases and improving that algorithm in some way. As I do not know the specifics, I don't know which changes would be best here. Fixing the stack overflow itself was rather easy, but improving that algorithm at the core is much more complex.

@ericbodden Maybe you have an idea here? On the input from the original post, Soot takes ages and needs a lot of memory, even though the class files are very small.
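To illustrate why such a check matters, here is a deliberately simplified model. It is not the code in AsmMethodSource; `SeenStatesSketch`, `Item`, and `run` are hypothetical names, and the "stack" is just a list of strings that never changes. A worklist walks a small control-flow graph with a loop and remembers, per target node, which incoming stacks it has already handled:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SeenStatesSketch {
    // Hypothetical work item: a graph node plus the abstract stack arriving there.
    static final class Item {
        final int node;
        final List<String> stack;

        Item(int node, List<String> stack) {
            this.node = node;
            this.stack = stack;
        }
    }

    // Processes the graph starting at node 0. Returns the number of items
    // processed, or -1 if more than maxSteps items would be needed.
    static int run(int[][] successors, List<String> entryStack,
                   boolean checkSeen, int maxSteps) {
        // Per target node: the stacks that have already been queued for it.
        Map<Integer, Set<List<String>>> seenStacks = new HashMap<>();
        Deque<Item> worklist = new ArrayDeque<>();
        worklist.add(new Item(0, entryStack));
        int steps = 0;
        while (!worklist.isEmpty()) {
            if (steps++ >= maxSteps) {
                return -1; // gave up: the walk did not converge in time
            }
            Item item = worklist.poll();
            for (int succ : successors[item.node]) {
                if (checkSeen) {
                    Set<List<String>> seen =
                        seenStacks.computeIfAbsent(succ, k -> new HashSet<>());
                    if (!seen.add(item.stack)) {
                        continue; // same stack already handled for this target
                    }
                }
                worklist.add(new Item(succ, item.stack));
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // 0 -> 1 -> 2, with a back edge 2 -> 1 and an exit 2 -> 3.
        int[][] successors = { {1}, {2}, {1, 3}, {} };
        List<String> entry = Arrays.asList("this", "arg0");
        System.out.println(run(successors, entry, true, 1000));  // prints 4
        System.out.println(run(successors, entry, false, 1000)); // prints -1
    }
}
```

With the check, each node is processed once and the walk stops. Without it, the back edge keeps re-queuing the same work until the step limit is hit. The trade-off mentioned above shows up here too: the remembered sets grow with the number of distinct stacks that reach a node, so an input that produces many different stacks costs a lot of memory.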