pascal-lab / Tai-e-assignments

Tai-e assignments for static program analysis
https://tai-e.pascal-lab.net/
GNU Lesser General Public License v3.0
1.01k stars 233 forks source link

About the A7 #5

Open SunflowerAries opened 1 year ago

SunflowerAries commented 1 year ago

Hi, everyone, I just meet a strange problem: to maintain corresponding StoreField(a.f = x), LoadField(x = a.f), StoreArray(a[*] = x), LoadArray(x = a[*]) statements for variable a, if I use the builtin HashMap just like below

private Map<Var, Set<Stmt>> fieldStore;
private Map<Var, Set<Stmt>> fieldLoad;
private Map<Var, Set<Stmt>> arrayLoad;
private Map<Var, Set<Stmt>> arrayStore;
...

protected void initialize() {
    String ptaId = getOptions().getString("pta");
    PointerAnalysisResult pta = World.get().getResult(ptaId);
    // You can do initialization work here

    fieldLoad = new HashMap<>();
    fieldStore = new HashMap<>();
    arrayLoad = new HashMap<>();
    arrayStore = new HashMap<>();
    for (Var var : pta.getVars()) {
        fieldStore.put(var, new HashSet<>());
        fieldLoad.put(var, new HashSet<>());
        arrayStore.put(var, new HashSet<>());
        arrayLoad.put(var, new HashSet<>());
        for (Var other : pta.getVars()) {
            for (Obj obj : pta.getPointsToSet(var)) {
                if (pta.getPointsToSet(other).contains(obj)) {
                    fieldStore.get(var).addAll(other.getStoreFields());
                    fieldLoad.get(var).addAll(other.getLoadFields());
                    arrayLoad.get(var).addAll(other.getLoadArrays());
                    arrayStore.get(var).addAll(other.getStoreArrays());
                    break;
                }
            }
        }
    }
}

there are always some testcases that I cannot pass through in online judge, however if I switch to Guava HashMultiMap(com.google.common.collect.HashMultimap) and initialize the map like below(others is still)

private final Multimap<Var, Stmt> fieldStore = HashMultimap.create();
private final Multimap<Var, Stmt> fieldLoad = HashMultimap.create();
private final Multimap<Var, Stmt> arrayLoad = HashMultimap.create();
private final Multimap<Var, Stmt> arrayStore = HashMultimap.create();
...

protected void initialize() {
    String ptaId = getOptions().getString("pta");
    PointerAnalysisResult pta = World.get().getResult(ptaId);
    // You can do initialization work here

    for (Var var : pta.getVars()) {
        for (Var other : pta.getVars()) {
            for (Obj obj : pta.getPointsToSet(var)) {
                if (pta.getPointsToSet(other).contains(obj)) {
                    fieldStore.putAll(var, other.getStoreFields());
                    fieldLoad.putAll(var, other.getLoadFields());
                    arrayLoad.putAll(var, other.getLoadArrays());
                    arrayStore.putAll(var, other.getStoreArrays());
                    break;
                }
            }
        }
    }
}

everything is ok. And after the initialization, I only use these maps as lookup tables just look like fieldLoad.get(var).stream().map(fieldStore -> (LoadField) fieldStore) .filter(fieldLoad -> fieldLoad.getFieldRef().resolve() == ((StoreField) stmt).getFieldRef().resolve()).collect(Collectors.toSet())

So if there are some bugs in jvm or someone can point out that I miss some important issues?

DianQK commented 1 year ago

fieldLoad.get(var) could be null.

I submitted a lot of Wrong Answers before I saw your issue, thanks a lot. :) I got Accepted results using HashMultimap as well. I was also curious as to why HashMap was having problems, so I did some research.

I am just a Java beginner, could it be that the Hash logic is causing undefined behavior? To do the relevant tests, I changed to a List (which may remove the effect of Hash?). Here is the sample code:

class MyMap<S> {
  List<Var> vars = new ArrayList<>();
  List<List<S>> stmtGroups = new ArrayList<>();

  Collection<S> get(Var var) {
    int index = vars.indexOf(var);
    return stmtGroups.get(index);
  }

  ...
}

Then ... Analyze 65 methods, pass 48 methods 😢。

I re-looked some code at the Tai-e and most of it uses HashSet or a similar class, I think I'm wrong.

However, it occurred to me that the index might be equal to -1, in other words, we might encounter a var that pta.getVars() doesn't have.

I compared the behavior of HashMap and HashMultimap:

HashMultimap<Integer, Integer> hashMultimap = HashMultimap.create();
HashMap<Integer, Set<Integer>> hashMap = new HashMap<>();
System.out.println("hashMultimap.get(0) = " + hashMultimap.get(0));
System.out.println("hashMap.get(0) = " + hashMap.get(0));
// output:
// hashMultimap.get(0) = []
// hashMap.get(0) = null

I modified the code and the submission results in Accepted:

class MyMap<S> {
  List<Var> vars = new ArrayList<>();
  List<List<S>> stmtGroups = new ArrayList<>();

  Collection<S> get(Var var) {
        int index = vars.indexOf(var);
        if (index >= 0) {
            return stmtGroups.get(index);
        } else {
            return new HashSet<>();
        }
  }

  ...
}

But how to find the corresponding test case? 🧐

SunflowerAries commented 1 year ago

fieldLoad.get(var) could be null.

I submitted a lot of Wrong Answers before I saw your issue, thanks a lot. :) I got Accepted results using HashMultimap as well. I was also curious as to why HashMap was having problems, so I did some research.

I am just a Java beginner, could it be that the Hash logic is causing undefined behavior? To do the relevant tests, I changed to a List (which may remove the effect of Hash?). Here is the sample code:

class MyMap<S> {
  List<Var> vars = new ArrayList<>();
  List<List<S>> stmtGroups = new ArrayList<>();

  Collection<S> get(Var var) {
    int index = vars.indexOf(var);
    return stmtGroups.get(index);
  }

  ...
}

Then ... Analyze 65 methods, pass 48 methods 😢。

I re-looked some code at the Tai-e and most of it uses HashSet or a similar class, I think I'm wrong.

However, it occurred to me that the index might be equal to -1, in other words, we might encounter a var that pta.getVars() doesn't have.

I compared the behavior of HashMap and HashMultimap:

HashMultimap<Integer, Integer> hashMultimap = HashMultimap.create();
HashMap<Integer, Set<Integer>> hashMap = new HashMap<>();
System.out.println("hashMultimap.get(0) = " + hashMultimap.get(0));
System.out.println("hashMap.get(0) = " + hashMap.get(0));
// output:
// hashMultimap.get(0) = []
// hashMap.get(0) = null

I modified the code and the submission results in Accepted:

class MyMap<S> {
  List<Var> vars = new ArrayList<>();
  List<List<S>> stmtGroups = new ArrayList<>();

  Collection<S> get(Var var) {
      int index = vars.indexOf(var);
      if (index >= 0) {
          return stmtGroups.get(index);
      } else {
          return new HashSet<>();
      }
  }

  ...
}

But how to find the corresponding test case? 🧐

Thx for your reply, it bothers me for a long time. I add the null check as you suggest and pass the oj test with Hashmap. Then I just go through the test driver, and find the overall analysis processes as cfg->cspta->cg->icfg->inter-constprop, cspta will construct a csCallGraph which consists only reachable methods and callSitetoCallee edges then cg will construct a CallGraph that removes the context information, then icfg will construct the flowGraph with both inter- and intra- edges(inter- comes from cg while intra- comes from cfg), and in inter-constprop we iterate the icfg with worklist. And if I do not miss anything, back to the inter-constprop, as for the initialization of fieldLoad, it comes from pta.getVars() which should contain all the variables of reachable methods and within worklist, we only iterate over the statements of the reachable methods from icfg, so it's not supposed to find any variables that do not occur in the initialization phase, but it does. Hope someone can give any clues.

DianQK commented 1 year ago

I think the null check must be because there is some statement that appears in the constant propagation that does not appear in the pointer analysis.

The code for verification:

for (JMethod method : callGraph) {
    varSet.addAll(method.getIR().getVars());
}
varSet.addAll(pta.getVars());

When the null check is removed, this can also pass the test case.

So I made some attempts and obtained:

操作太频繁,请稍后重试 你已进入【提交冷静期】, 不要面向OJ编程哦! 可以考虑重新读读实验说明书呢! 每个人每道题每60分钟只能提交10次, 不能频繁提交. 请等待 565 秒后再提交. (Do NOT submit too frequently. Each user can only submit 10 times per 60 minutes for each problem) You can submit again after 565 seconds.

I am very sorry, but I have made some progress:

private class StmtProcessor implements StmtVisitor<Void> {

    @Override
    public Void visit(LoadArray stmt) {
      varSet.add(getAllVar(stmt));
        return null;
    }

    @Override
    public Void visit(LoadField stmt) {
      varSet.add(getAllVar(stmt));
        return null;
    }

}

This can also pass.

So I think there are some statements that (CS)Var has not been added to the pointer analysis, but are still reachable. Not the New and Invoke statements.

So...

class InstanceField {

    public static void main(String[] args) {
        A a1 = new B();
        a1.f = 111;
        B b1 = (B)a1;
        int x = b1.f;
    }

}

class A {
    int f;
}

class B extends A {

}

In the previous assignment, our pointer analysis did not handle the Cast statement.

A7/tai-e/src/main/java/pascal/taie/analysis/pta/cs/Solver.java:

@Override
public Void visit(Cast stmt) {
    addPFGEdge(csManager.getCSVar(context, stmt.getRValue().getValue()), csManager.getCSVar(context, stmt.getLValue()));
    return null;
}
SunflowerAries commented 1 year ago

I think the null check must be because there is some statement that appears in the constant propagation that does not appear in the pointer analysis.

The code for verification:

for (JMethod method : callGraph) {
  varSet.addAll(method.getIR().getVars());
}
varSet.addAll(pta.getVars());

When the null check is removed, this can also pass the test case.

So I made some attempts and obtained:

操作太频繁,请稍后重试 你已进入【提交冷静期】, 不要面向OJ编程哦! 可以考虑重新读读实验说明书呢! 每个人每道题每60分钟只能提交10次, 不能频繁提交. 请等待 565 秒后再提交. (Do NOT submit too frequently. Each user can only submit 10 times per 60 minutes for each problem) You can submit again after 565 seconds.

I am very sorry, but I have made some progress:

private class StmtProcessor implements StmtVisitor<Void> {

  @Override
  public Void visit(LoadArray stmt) {
    varSet.add(getAllVar(stmt));
      return null;
  }

  @Override
  public Void visit(LoadField stmt) {
    varSet.add(getAllVar(stmt));
      return null;
  }

}

This can also pass.

So I think there are some statements that (CS)Var has not been added to the pointer analysis, but are still reachable. Not the New and Invoke statements.

So...

class InstanceField {

    public static void main(String[] args) {
        A a1 = new B();
        a1.f = 111;
        B b1 = (B)a1;
        int x = b1.f;
    }

}

class A {
    int f;
}

class B extends A {

}

In the previous assignment, our pointer analysis did not handle the Cast statement.

A7/tai-e/src/main/java/pascal/taie/analysis/pta/cs/Solver.java:

@Override
public Void visit(Cast stmt) {
  addPFGEdge(csManager.getCSVar(context, stmt.getRValue().getValue()), csManager.getCSVar(context, stmt.getLValue()));
  return null;
}

I guess you're right. As we can see, for reachable methods, cspta will use visitor-based StmtProcessor to process statements within methods.

private void addReachable(CSMethod csMethod) {
    // TODO - finish me
    if (callGraph.addReachableMethod(csMethod)) {
        StmtProcessor stmtProcessor = new StmtProcessor(csMethod);
        for (Stmt stmt : csMethod.getMethod().getIR().getStmts()) {
            stmt.accept(stmtProcessor);
        }
    }
}

However, we only cover a few cases, like New, Copy, LoadField and so on, and there are still many kinds of statements that are not yet processed, and if the StmtProcessor of oj is same with ours, some variables will not be included in pta.getVars() as you suggest.

public interface StmtVisitor<T> {

    default T visit(New stmt) {
        return visitDefault(stmt);
    }

    default T visit(AssignLiteral stmt) {
        return visitDefault(stmt);
    }

    default T visit(Copy stmt) {
        return visitDefault(stmt);
    }

    default T visit(LoadArray stmt) {
        return visitDefault(stmt);
    }

    default T visit(StoreArray stmt) {
        return visitDefault(stmt);
    }

    default T visit(LoadField stmt) {
        return visitDefault(stmt);
    }

    default T visit(StoreField stmt) {
        return visitDefault(stmt);
    }

    default T visit(Binary stmt) {
        return visitDefault(stmt);
    }

    default T visit(Unary stmt) {
        return visitDefault(stmt);
    }

    default T visit(InstanceOf stmt) {
        return visitDefault(stmt);
    }

    default T visit(Cast stmt) {
        return visitDefault(stmt);
    }

    default T visit(Goto stmt) {
        return visitDefault(stmt);
    }

    default T visit(If stmt) {
        return visitDefault(stmt);
    }

    default T visit(TableSwitch stmt) {
        return visitDefault(stmt);
    }

    default T visit(LookupSwitch stmt) {
        return visitDefault(stmt);
    }

    default T visit(Invoke stmt) {
        return visitDefault(stmt);
    }

    default T visit(Return stmt) {
        return visitDefault(stmt);
    }

    default T visit(Throw stmt) {
        return visitDefault(stmt);
    }

    default T visit(Catch stmt) {
        return visitDefault(stmt);
    }

    default T visit(Monitor stmt) {
        return visitDefault(stmt);
    }

    default T visit(Nop stmt) {
        return visitDefault(stmt);
    }

    default T visitDefault(Stmt stmt) {
        return null;
    }
}