tech-srl / code2seq

Code for the model presented in the paper: "code2seq: Generating Sequences from Structured Representations of Code"
http://code2seq.org
MIT License

Help with implementing local service with JavaExtractor #105

Open lyriccoder opened 2 years ago

lyriccoder commented 2 years ago

Hi @urialon.

Could you please tell me how you construct the JSON for the service https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods?

I sent the following method (you used it in a previous issue):

public boolean f(Set<String> set, String value) {  for (String entry : set) {    if (entry.equalsIgnoreCase(value)) {  return true ;    }  }  return false;  }

I tried to run JavaExtractor with the following parameters (I found them in extract.py): java -cp /app/JavaExtractor/JPredict/target/JavaExtractor-0.0.2-SNAPSHOT.jar JavaExtractor.App --max_path_length 9 --max_path_width 2 --file /app/Input.java

It returned the following result:
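For anyone reproducing this locally, the invocation above could be wrapped in Python roughly as follows. This is only a minimal sketch: `build_extractor_command` and `run_extractor` are hypothetical helper names (not functions from the code2seq repository), and the jar path and flag values are simply the ones quoted in this comment.

```python
import subprocess
from typing import List


def build_extractor_command(jar_path: str, source_file: str,
                            max_path_length: int = 9,
                            max_path_width: int = 2) -> List[str]:
    """Assemble the JavaExtractor command line quoted in this comment."""
    return [
        "java", "-cp", jar_path, "JavaExtractor.App",
        "--max_path_length", str(max_path_length),
        "--max_path_width", str(max_path_width),
        "--file", source_file,
    ]


def run_extractor(jar_path: str, source_file: str) -> str:
    """Run the extractor and return its stdout.

    Requires a Java runtime and the built jar; raises CalledProcessError
    if the process exits with a non-zero status.
    """
    result = subprocess.run(
        build_extractor_command(jar_path, source_file),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Keeping the command construction separate from the process call makes it easy to print the exact command and compare it with whatever the hosted service runs.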

f boolean,Prim0|Mth|Nm1,METHOD_NAME boolean,Prim0|Mth|Prm|VDID0,set boolean,Prim0|Mth|Prm|Cls|Cls0,string METHOD_NAME,Nm1|Mth|Prm|VDID0,set METHOD_NAME,Nm1|Mth|Prm|Cls|Cls0,string METHOD_NAME,Nm1|Mth|Prm|VDID0,value METHOD_NAME,Nm1|Mth|Prm|Cls1,string set,VDID0|Prm|Cls|Cls0,string set,VDID0|Prm|Mth|Prm|VDID0,value set,VDID0|Prm|Mth|Prm|Cls1,string set,VDID0|Prm|Mth|Bk|Foreach|VDE|Cls0,string set,VDID0|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry set,VDID0|Prm|Mth|Bk|Foreach|Nm1,set set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true set,VDID0|Prm|Mth|Bk|Ret|BoolEx0,false string,Cls0|Cls|Prm|Mth|Prm|VDID0,value string,Cls0|Cls|Prm|Mth|Prm|Cls1,string string,Cls0|Cls|Prm|Mth|Bk|Foreach|VDE|Cls0,string string,Cls0|Cls|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry string,Cls0|Cls|Prm|Mth|Bk|Foreach|Nm1,set string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case string,Cls0|Cls|Prm|Mth|Bk|Ret|BoolEx0,false value,VDID0|Prm|Cls1,string value,VDID0|Prm|Mth|Bk|Foreach|VDE|Cls0,string value,VDID0|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry value,VDID0|Prm|Mth|Bk|Foreach|Nm1,set value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true value,VDID0|Prm|Mth|Bk|Ret|BoolEx0,false string,Cls1|Prm|Mth|Bk|Foreach|VDE|Cls0,string string,Cls1|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry string,Cls1|Prm|Mth|Bk|Foreach|Nm1,set string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true 
string,Cls1|Prm|Mth|Bk|Ret|BoolEx0,false string,Cls0|VDE|VD|VDID0,entry string,Cls0|VDE|Foreach|Nm1,set string,Cls0|VDE|Foreach|Bk|If|Cal0|Nm0,entry string,Cls0|VDE|Foreach|Bk|If|Cal0|Nm2,value string,Cls0|VDE|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case string,Cls0|VDE|Foreach|Bk|If|Bk|Ret|BoolEx0,true string,Cls0|VDE|Foreach|Bk|Ret|BoolEx0,false entry,VDID0|VD|VDE|Foreach|Nm1,set entry,VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm0,entry entry,VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm2,value entry,VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case entry,VDID0|VD|VDE|Foreach|Bk|If|Bk|Ret|BoolEx0,true entry,VDID0|VD|VDE|Foreach|Bk|Ret|BoolEx0,false set,Nm1|Foreach|Bk|If|Cal0|Nm0,entry set,Nm1|Foreach|Bk|If|Cal0|Nm2,value set,Nm1|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case set,Nm1|Foreach|Bk|If|Bk|Ret|BoolEx0,true set,Nm1|Foreach|Bk|Ret|BoolEx0,false entry,Nm0|Cal|Nm2,value entry,Nm0|Cal|If|Bk|Ret|BoolEx0,true entry,Nm0|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0,false value,Nm2|Cal|Nm3,equals|ignore|case value,Nm2|Cal|If|Bk|Ret|BoolEx0,true value,Nm2|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0,false equals|ignore|case,Nm3|Cal|If|Bk|Ret|BoolEx0,true equals|ignore|case,Nm3|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0,false true,BoolEx0|Ret|Bk|If|Bk|Foreach|Bk|Ret|BoolEx0,false\n'

I noticed that some values are missing, e.g.:

                "name1": "boolean",
                "name2": "set",
                "shortPath": "Prim0|Mth|Prm|GenericClass1",
                "path": "(PrimitiveType0)^(MethodDeclaration)_(Parameter)_(GenericClass1)",
                "name1NodeId": 2,
                "name2NodeId": 6,
                "name1TokenNum": 0,
                "name2TokenNum": 0

How can I get exactly the same items? 1) I see that "name1NodeId", "name2NodeId", "name1TokenNum", and "name2TokenNum" point to nodes of the AST and are not needed by the prediction function. I also see that the AST itself is not used. Is that right?

2) So, the last part is to understand how you produce the response. The JavaExtractor result is usually missing some values, and they are usually related to the GenericClass1 value. I also noticed that some values are missing from the response even though they exist in the JavaExtractor result:

{'value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true', 'string,Cls0|Cls|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry', 'string,Cls0|Cls|Prm|Mth|Bk|Foreach|VDE|Cls0,string', 'string,Cls0|Cls|Prm|Mth|Prm|Cls1,string', 'set,VDID0|Prm|Cls|Cls0,string', 'string,Cls0|Cls|Prm|Mth|Prm|VDID0,value', 'string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry', 'string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value', 'METHOD_NAME,Nm1|Mth|Prm|Cls|Cls0,string', 'set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true', 'true,BoolEx0|Ret|Bk|If|Bk|Foreach|Bk|Ret|BoolEx0,false\\n', 'string,Cls0|Cls|Prm|Mth|Bk|Foreach|Nm1,set', 'string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case', 'string,Cls0|Cls|Prm|Mth|Bk|Ret|BoolEx0,false', 'string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true', 'boolean,Prim0|Mth|Prm|Cls|Cls0,string'}
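A comparison like the one above can be scripted. The sketch below is an illustration only: it assumes the service's `paths` entries carry `name1`, `shortPath`, and `name2` (as in the JSON fragment quoted earlier) and that each extractor line is the target name followed by space-separated `name1,shortPath,name2` contexts. The function names are made up for this example.

```python
from typing import Dict, Iterable, List, Set, Tuple


def service_contexts(paths: Iterable[Dict[str, object]]) -> Set[str]:
    """Render service path entries in the extractor's name1,shortPath,name2 form."""
    return {f"{p['name1']},{p['shortPath']},{p['name2']}" for p in paths}


def extractor_contexts(line: str) -> Tuple[str, Set[str]]:
    """Split one extractor output line into (target, set of contexts)."""
    parts = line.split()
    return parts[0], set(parts[1:])


def compare(service_paths: Iterable[Dict[str, object]],
            extractor_line: str) -> Tuple[List[str], List[str]]:
    """Return (contexts only in the service response, contexts only in the local output)."""
    remote = service_contexts(service_paths)
    _, local = extractor_contexts(extractor_line)
    return sorted(remote - local), sorted(local - remote)
```

Because both sides are reduced to plain strings, the two returned lists show directly which contexts each side lacks.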

Could you please tell me how you run JavaExtractor on your server?

urialon commented 2 years ago

Hi @lyriccoder, can you please remind me what you are trying to achieve?

You are right that name1NodeId etc. are not needed for the prediction; they are for visualization purposes on the code2seq.org website. So why do you need them? Or which fields do you need?

lyriccoder commented 2 years ago

Currently, code2seq sends a request to a server. I want to run it locally, without any network access. I need to get the same JSON that your service returns.

Since I can run JavaExtractor manually, I want to run it locally, take the results, and build the JSON that is passed to the prediction function. But I see that the results of JavaExtractor and your service differ. I need to know how you run JavaExtractor on your server; maybe you pass additional parameters, etc. I just want to create the same JSON you return from the AWS server.

For some reason, your server returns more data: some nodes are included in your response but not in the output when I run JavaExtractor. E.g., I noticed that if the shortPath field of the JSON contains the GenericClass substring, the path is not included in the JavaExtractor result.
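The conversion described here might look roughly like the sketch below. It is a guess at the shape, not the service's actual code: it keeps only `target`, `name1`, `shortPath`, and `name2`, on the assumption that these are the fields prediction needs (the node-ID and token-number fields are for visualization only, as noted earlier in the thread). It also assumes each context contains exactly two commas. The function names are hypothetical.

```python
import json
from typing import Dict, List


def extractor_line_to_record(line: str) -> Dict[str, object]:
    """Turn one extractor line into a {"target", "paths"} record."""
    target, *contexts = line.split()
    paths: List[Dict[str, str]] = []
    for context in contexts:
        # Assumption: a context is "name1,shortPath,name2" with no extra commas.
        name1, short_path, name2 = context.split(",")
        paths.append({"name1": name1, "shortPath": short_path, "name2": name2})
    return {"target": target, "paths": paths}


def extractor_output_to_json(output: str) -> str:
    """Convert the full extractor output (one method per line) to a JSON array."""
    records = [extractor_line_to_record(line)
               for line in output.splitlines() if line.strip()]
    return json.dumps(records, indent=2)
```

The long-form `path` and the `ast` tree from the service response are deliberately left out, since the local extractor output shown in this thread does not contain them.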
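One way to probe the GenericClass observation is to isolate the affected contexts and test a simple relabeling hypothesis. The sketch below is purely diagnostic. The idea that an interior `GenericClass` node corresponds to `Cls` in the local output is an assumption drawn from the two samples in this thread, not a description of how the service works, and both function names are invented for the example.

```python
from typing import Iterable, Set, Tuple


def split_by_generic_class(contexts: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Partition contexts by whether their shortPath mentions GenericClass."""
    with_generic: Set[str] = set()
    without_generic: Set[str] = set()
    for context in contexts:
        short_path = context.split(",")[1]
        if "GenericClass" in short_path:
            with_generic.add(context)
        else:
            without_generic.add(context)
    return with_generic, without_generic


def relabel_generic_class(context: str) -> str:
    """Hypothesis only: rewrite un-indexed GenericClass nodes as Cls."""
    name1, short_path, name2 = context.split(",")
    nodes = ["Cls" if node == "GenericClass" else node
             for node in short_path.split("|")]
    return ",".join([name1, "|".join(nodes), name2])
```

Contexts that match the local output after relabeling differ only in node naming. Those that still do not match, such as paths ending in `GenericClass1`, are the ones the local run does not produce at all.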

urialon commented 2 years ago

Hi @lyriccoder, so my question is: why do you need exactly the same JSON? Some fields there are used only for visualization on the website.

lyriccoder commented 2 years ago

I meant the JSON with the fields that are needed for prediction. Some nodes are missing... It seems you are using different parameters in JavaExtractor. That can affect the prediction.

urialon commented 2 years ago

Can you please provide a code example that returns a different response from the service and from the local extractor (where the difference is in the content, and not only the format)?

On Thu, Oct 28, 2021 at 4:38 AM lyriccoder wrote:

> I meant the JSON with the fields that are needed for prediction. Some nodes are missing... It seems you are using different parameters in JavaExtractor. That can affect the prediction.

lyriccoder commented 2 years ago

I did that in the previous message. OK, let me show it again.

1) Here is the request I send:

curl -X POST https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods -d '{"code": "public boolean f(Set<String> set, String value) {  for (String entry : set) {    if (entry.equalsIgnoreCase(value)) {  return true ;    }  }  return false;  }", "decompose":"true"}'
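The same request can be built from Python with only the standard library, which is convenient when the response has to be compared programmatically. This is a sketch: `build_extract_request` and `fetch_paths` are hypothetical names, the payload mirrors the curl command above, and whether the endpoint is still reachable is not something this example can guarantee.

```python
import json
import urllib.request

EXTRACT_URL = ("https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com"
               "/production/extractmethods")


def build_extract_request(code: str) -> urllib.request.Request:
    """Build the POST request with the same JSON body as the curl command."""
    body = json.dumps({"code": code, "decompose": "true"}).encode("utf-8")
    return urllib.request.Request(EXTRACT_URL, data=body, method="POST")


def fetch_paths(code: str):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(build_extract_request(code)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Using `json.dumps` for the body avoids manual quoting problems when the Java snippet itself contains double quotes.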

I get the following response:

Http Response ```json [ { "target": "f", "paths": [ { "name1": "boolean", "name2": "METHOD_NAME", "shortPath": "Prim0|Mth|Nm1", "path": "(PrimitiveType0)^(MethodDeclaration)_(NameExpr1)", "name1NodeId": 2, "name2NodeId": 3, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "boolean", "name2": "set", "shortPath": "Prim0|Mth|Prm|VDID0", "path": "(PrimitiveType0)^(MethodDeclaration)_(Parameter)_(VariableDeclaratorId0)", "name1NodeId": 2, "name2NodeId": 5, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "boolean", "name2": "set", "shortPath": "Prim0|Mth|Prm|GenericClass1", "path": "(PrimitiveType0)^(MethodDeclaration)_(Parameter)_(GenericClass1)", "name1NodeId": 2, "name2NodeId": 6, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "boolean", "name2": "string", "shortPath": "Prim0|Mth|Prm|GenericClass|Cls0", "path": "(PrimitiveType0)^(MethodDeclaration)_(Parameter)_(GenericClass)_(ClassOrInterfaceType0)", "name1NodeId": 2, "name2NodeId": 7, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "METHOD_NAME", "name2": "set", "shortPath": "Nm1|Mth|Prm|VDID0", "path": "(NameExpr1)^(MethodDeclaration)_(Parameter)_(VariableDeclaratorId0)", "name1NodeId": 3, "name2NodeId": 5, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "METHOD_NAME", "name2": "set", "shortPath": "Nm1|Mth|Prm|GenericClass1", "path": "(NameExpr1)^(MethodDeclaration)_(Parameter)_(GenericClass1)", "name1NodeId": 3, "name2NodeId": 6, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "METHOD_NAME", "name2": "string", "shortPath": "Nm1|Mth|Prm|GenericClass|Cls0", "path": "(NameExpr1)^(MethodDeclaration)_(Parameter)_(GenericClass)_(ClassOrInterfaceType0)", "name1NodeId": 3, "name2NodeId": 7, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "METHOD_NAME", "name2": "value", "shortPath": "Nm1|Mth|Prm|VDID0", "path": "(NameExpr1)^(MethodDeclaration)_(Parameter)_(VariableDeclaratorId0)", "name1NodeId": 3, "name2NodeId": 9, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "METHOD_NAME", 
"name2": "string", "shortPath": "Nm1|Mth|Prm|Cls1", "path": "(NameExpr1)^(MethodDeclaration)_(Parameter)_(ClassOrInterfaceType1)", "name1NodeId": 3, "name2NodeId": 10, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "set", "shortPath": "VDID0|Prm|GenericClass1", "path": "(VariableDeclaratorId0)^(Parameter)_(GenericClass1)", "name1NodeId": 5, "name2NodeId": 6, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "set", "name2": "string", "shortPath": "VDID0|Prm|GenericClass|Cls0", "path": "(VariableDeclaratorId0)^(Parameter)_(GenericClass)_(ClassOrInterfaceType0)", "name1NodeId": 5, "name2NodeId": 7, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "set", "name2": "value", "shortPath": "VDID0|Prm|Mth|Prm|VDID0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(Parameter)_(VariableDeclaratorId0)", "name1NodeId": 5, "name2NodeId": 9, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "set", "name2": "string", "shortPath": "VDID0|Prm|Mth|Prm|Cls1", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(Parameter)_(ClassOrInterfaceType1)", "name1NodeId": 5, "name2NodeId": 10, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "string", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|VDE|Cls0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(ClassOrInterfaceType0)", "name1NodeId": 5, "name2NodeId": 14, "name1TokenNum": 0, "name2TokenNum": 2 }, { "name1": "set", "name2": "entry", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|VDE|VD|VDID0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(VariableDeclarator)_(VariableDeclaratorId0)", "name1NodeId": 5, "name2NodeId": 16, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "set", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|Nm1", "path": 
"(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(NameExpr1)", "name1NodeId": 5, "name2NodeId": 17, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "entry", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr0)", "name1NodeId": 5, "name2NodeId": 21, "name1TokenNum": 0, "name2TokenNum": 2 }, { "name1": "set", "name2": "value", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr2)", "name1NodeId": 5, "name2NodeId": 23, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "equals|ignore|case", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr3)", "name1NodeId": 5, "name2NodeId": 24, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "set", "name2": "false", "shortPath": "VDID0|Prm|Mth|Bk|Ret|BoolEx0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 5, "name2NodeId": 29, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "set", "name2": "string", "shortPath": "GenericClass|Cls0", "path": "(GenericClass)_(ClassOrInterfaceType0)", "name1NodeId": 6, "name2NodeId": 7, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "set", "name2": "value", "shortPath": "GenericClass1|Prm|Mth|Prm|VDID0", "path": "(GenericClass1)^(Parameter)^(MethodDeclaration)_(Parameter)_(VariableDeclaratorId0)", "name1NodeId": 6, "name2NodeId": 9, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "set", "name2": "string", "shortPath": "GenericClass1|Prm|Mth|Prm|Cls1", "path": 
"(GenericClass1)^(Parameter)^(MethodDeclaration)_(Parameter)_(ClassOrInterfaceType1)", "name1NodeId": 6, "name2NodeId": 10, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "string", "shortPath": "GenericClass1|Prm|Mth|Bk|Foreach|VDE|Cls0", "path": "(GenericClass1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(ClassOrInterfaceType0)", "name1NodeId": 6, "name2NodeId": 14, "name1TokenNum": 0, "name2TokenNum": 2 }, { "name1": "set", "name2": "entry", "shortPath": "GenericClass1|Prm|Mth|Bk|Foreach|VDE|VD|VDID0", "path": "(GenericClass1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(VariableDeclarator)_(VariableDeclaratorId0)", "name1NodeId": 6, "name2NodeId": 16, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "set", "shortPath": "GenericClass1|Prm|Mth|Bk|Foreach|Nm1", "path": "(GenericClass1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(NameExpr1)", "name1NodeId": 6, "name2NodeId": 17, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "entry", "shortPath": "GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0", "path": "(GenericClass1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr0)", "name1NodeId": 6, "name2NodeId": 21, "name1TokenNum": 0, "name2TokenNum": 2 }, { "name1": "set", "name2": "value", "shortPath": "GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2", "path": "(GenericClass1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr2)", "name1NodeId": 6, "name2NodeId": 23, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "set", "name2": "equals|ignore|case", "shortPath": "GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3", "path": "(GenericClass1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr3)", "name1NodeId": 6, "name2NodeId": 24, 
"name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "set", "name2": "false", "shortPath": "GenericClass1|Prm|Mth|Bk|Ret|BoolEx0", "path": "(GenericClass1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 6, "name2NodeId": 29, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "string", "name2": "value", "shortPath": "Cls0|GenericClass|Prm|Mth|Prm|VDID0", "path": "(ClassOrInterfaceType0)^(GenericClass)^(Parameter)^(MethodDeclaration)_(Parameter)_(VariableDeclaratorId0)", "name1NodeId": 7, "name2NodeId": 9, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "string", "name2": "string", "shortPath": "Cls0|GenericClass|Prm|Mth|Prm|Cls1", "path": "(ClassOrInterfaceType0)^(GenericClass)^(Parameter)^(MethodDeclaration)_(Parameter)_(ClassOrInterfaceType1)", "name1NodeId": 7, "name2NodeId": 10, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "string", "name2": "string", "shortPath": "Cls0|GenericClass|Prm|Mth|Bk|Foreach|VDE|Cls0", "path": "(ClassOrInterfaceType0)^(GenericClass)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(ClassOrInterfaceType0)", "name1NodeId": 7, "name2NodeId": 14, "name1TokenNum": 0, "name2TokenNum": 2 }, { "name1": "string", "name2": "entry", "shortPath": "Cls0|GenericClass|Prm|Mth|Bk|Foreach|VDE|VD|VDID0", "path": "(ClassOrInterfaceType0)^(GenericClass)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(VariableDeclarator)_(VariableDeclaratorId0)", "name1NodeId": 7, "name2NodeId": 16, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "string", "name2": "set", "shortPath": "Cls0|GenericClass|Prm|Mth|Bk|Foreach|Nm1", "path": "(ClassOrInterfaceType0)^(GenericClass)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(NameExpr1)", "name1NodeId": 7, "name2NodeId": 17, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "string", "name2": "false", "shortPath": "Cls0|GenericClass|Prm|Mth|Bk|Ret|BoolEx0", "path": 
"(ClassOrInterfaceType0)^(GenericClass)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 7, "name2NodeId": 29, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "value", "name2": "string", "shortPath": "VDID0|Prm|Cls1", "path": "(VariableDeclaratorId0)^(Parameter)_(ClassOrInterfaceType1)", "name1NodeId": 9, "name2NodeId": 10, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "value", "name2": "string", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|VDE|Cls0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(ClassOrInterfaceType0)", "name1NodeId": 9, "name2NodeId": 14, "name1TokenNum": 0, "name2TokenNum": 2 }, { "name1": "value", "name2": "entry", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|VDE|VD|VDID0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(VariableDeclarator)_(VariableDeclaratorId0)", "name1NodeId": 9, "name2NodeId": 16, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "value", "name2": "set", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|Nm1", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(NameExpr1)", "name1NodeId": 9, "name2NodeId": 17, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "value", "name2": "entry", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr0)", "name1NodeId": 9, "name2NodeId": 21, "name1TokenNum": 0, "name2TokenNum": 2 }, { "name1": "value", "name2": "value", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr2)", "name1NodeId": 9, "name2NodeId": 23, "name1TokenNum": 0, "name2TokenNum": 1 }, { "name1": "value", "name2": 
"equals|ignore|case", "shortPath": "VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr3)", "name1NodeId": 9, "name2NodeId": 24, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "value", "name2": "false", "shortPath": "VDID0|Prm|Mth|Bk|Ret|BoolEx0", "path": "(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 9, "name2NodeId": 29, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "string", "name2": "string", "shortPath": "Cls1|Prm|Mth|Bk|Foreach|VDE|Cls0", "path": "(ClassOrInterfaceType1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(ClassOrInterfaceType0)", "name1NodeId": 10, "name2NodeId": 14, "name1TokenNum": 1, "name2TokenNum": 2 }, { "name1": "string", "name2": "entry", "shortPath": "Cls1|Prm|Mth|Bk|Foreach|VDE|VD|VDID0", "path": "(ClassOrInterfaceType1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(VariableDeclarationExpr)_(VariableDeclarator)_(VariableDeclaratorId0)", "name1NodeId": 10, "name2NodeId": 16, "name1TokenNum": 1, "name2TokenNum": 1 }, { "name1": "string", "name2": "set", "shortPath": "Cls1|Prm|Mth|Bk|Foreach|Nm1", "path": "(ClassOrInterfaceType1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(NameExpr1)", "name1NodeId": 10, "name2NodeId": 17, "name1TokenNum": 1, "name2TokenNum": 1 }, { "name1": "string", "name2": "entry", "shortPath": "Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0", "path": "(ClassOrInterfaceType1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr0)", "name1NodeId": 10, "name2NodeId": 21, "name1TokenNum": 1, "name2TokenNum": 2 }, { "name1": "string", "name2": "value", "shortPath": "Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2", "path": 
"(ClassOrInterfaceType1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr2)", "name1NodeId": 10, "name2NodeId": 23, "name1TokenNum": 1, "name2TokenNum": 1 }, { "name1": "string", "name2": "equals|ignore|case", "shortPath": "Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3", "path": "(ClassOrInterfaceType1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr3)", "name1NodeId": 10, "name2NodeId": 24, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "string", "name2": "false", "shortPath": "Cls1|Prm|Mth|Bk|Ret|BoolEx0", "path": "(ClassOrInterfaceType1)^(Parameter)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 10, "name2NodeId": 29, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "string", "name2": "entry", "shortPath": "Cls0|VDE|VD|VDID0", "path": "(ClassOrInterfaceType0)^(VariableDeclarationExpr)_(VariableDeclarator)_(VariableDeclaratorId0)", "name1NodeId": 14, "name2NodeId": 16, "name1TokenNum": 2, "name2TokenNum": 1 }, { "name1": "string", "name2": "set", "shortPath": "Cls0|VDE|Foreach|Nm1", "path": "(ClassOrInterfaceType0)^(VariableDeclarationExpr)^(ForeachStmt)_(NameExpr1)", "name1NodeId": 14, "name2NodeId": 17, "name1TokenNum": 2, "name2TokenNum": 1 }, { "name1": "string", "name2": "entry", "shortPath": "Cls0|VDE|Foreach|Bk|If|Cal0|Nm0", "path": "(ClassOrInterfaceType0)^(VariableDeclarationExpr)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr0)", "name1NodeId": 14, "name2NodeId": 21, "name1TokenNum": 2, "name2TokenNum": 2 }, { "name1": "string", "name2": "value", "shortPath": "Cls0|VDE|Foreach|Bk|If|Cal0|Nm2", "path": "(ClassOrInterfaceType0)^(VariableDeclarationExpr)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr2)", "name1NodeId": 14, "name2NodeId": 23, "name1TokenNum": 2, "name2TokenNum": 1 }, { "name1": "string", "name2": "equals|ignore|case", "shortPath": 
"Cls0|VDE|Foreach|Bk|If|Cal0|Nm3", "path": "(ClassOrInterfaceType0)^(VariableDeclarationExpr)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr3)", "name1NodeId": 14, "name2NodeId": 24, "name1TokenNum": 2, "name2TokenNum": 0 }, { "name1": "string", "name2": "true", "shortPath": "Cls0|VDE|Foreach|Bk|If|Bk|Ret|BoolEx0", "path": "(ClassOrInterfaceType0)^(VariableDeclarationExpr)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 14, "name2NodeId": 27, "name1TokenNum": 2, "name2TokenNum": 0 }, { "name1": "string", "name2": "false", "shortPath": "Cls0|VDE|Foreach|Bk|Ret|BoolEx0", "path": "(ClassOrInterfaceType0)^(VariableDeclarationExpr)^(ForeachStmt)^(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 14, "name2NodeId": 29, "name1TokenNum": 2, "name2TokenNum": 0 }, { "name1": "entry", "name2": "set", "shortPath": "VDID0|VD|VDE|Foreach|Nm1", "path": "(VariableDeclaratorId0)^(VariableDeclarator)^(VariableDeclarationExpr)^(ForeachStmt)_(NameExpr1)", "name1NodeId": 16, "name2NodeId": 17, "name1TokenNum": 1, "name2TokenNum": 1 }, { "name1": "entry", "name2": "entry", "shortPath": "VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm0", "path": "(VariableDeclaratorId0)^(VariableDeclarator)^(VariableDeclarationExpr)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr0)", "name1NodeId": 16, "name2NodeId": 21, "name1TokenNum": 1, "name2TokenNum": 2 }, { "name1": "entry", "name2": "value", "shortPath": "VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm2", "path": "(VariableDeclaratorId0)^(VariableDeclarator)^(VariableDeclarationExpr)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr2)", "name1NodeId": 16, "name2NodeId": 23, "name1TokenNum": 1, "name2TokenNum": 1 }, { "name1": "entry", "name2": "equals|ignore|case", "shortPath": "VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm3", "path": "(VariableDeclaratorId0)^(VariableDeclarator)^(VariableDeclarationExpr)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr3)", 
"name1NodeId": 16, "name2NodeId": 24, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "entry", "name2": "true", "shortPath": "VDID0|VD|VDE|Foreach|Bk|If|Bk|Ret|BoolEx0", "path": "(VariableDeclaratorId0)^(VariableDeclarator)^(VariableDeclarationExpr)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 16, "name2NodeId": 27, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "entry", "name2": "false", "shortPath": "VDID0|VD|VDE|Foreach|Bk|Ret|BoolEx0", "path": "(VariableDeclaratorId0)^(VariableDeclarator)^(VariableDeclarationExpr)^(ForeachStmt)^(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 16, "name2NodeId": 29, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "set", "name2": "entry", "shortPath": "Nm1|Foreach|Bk|If|Cal0|Nm0", "path": "(NameExpr1)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr0)", "name1NodeId": 17, "name2NodeId": 21, "name1TokenNum": 1, "name2TokenNum": 2 }, { "name1": "set", "name2": "value", "shortPath": "Nm1|Foreach|Bk|If|Cal0|Nm2", "path": "(NameExpr1)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr2)", "name1NodeId": 17, "name2NodeId": 23, "name1TokenNum": 1, "name2TokenNum": 1 }, { "name1": "set", "name2": "equals|ignore|case", "shortPath": "Nm1|Foreach|Bk|If|Cal0|Nm3", "path": "(NameExpr1)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(MethodCallExpr0)_(NameExpr3)", "name1NodeId": 17, "name2NodeId": 24, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "set", "name2": "true", "shortPath": "Nm1|Foreach|Bk|If|Bk|Ret|BoolEx0", "path": "(NameExpr1)^(ForeachStmt)_(BlockStmt)_(IfStmt)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 17, "name2NodeId": 27, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "set", "name2": "false", "shortPath": "Nm1|Foreach|Bk|Ret|BoolEx0", "path": "(NameExpr1)^(ForeachStmt)^(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 17, "name2NodeId": 29, "name1TokenNum": 1, "name2TokenNum": 0 }, { 
"name1": "entry", "name2": "value", "shortPath": "Nm0|Cal|Nm2", "path": "(NameExpr0)^(MethodCallExpr)_(NameExpr2)", "name1NodeId": 21, "name2NodeId": 23, "name1TokenNum": 2, "name2TokenNum": 1 }, { "name1": "entry", "name2": "true", "shortPath": "Nm0|Cal|If|Bk|Ret|BoolEx0", "path": "(NameExpr0)^(MethodCallExpr)^(IfStmt)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 21, "name2NodeId": 27, "name1TokenNum": 2, "name2TokenNum": 0 }, { "name1": "entry", "name2": "false", "shortPath": "Nm0|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0", "path": "(NameExpr0)^(MethodCallExpr)^(IfStmt)^(BlockStmt)^(ForeachStmt)^(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 21, "name2NodeId": 29, "name1TokenNum": 2, "name2TokenNum": 0 }, { "name1": "value", "name2": "equals|ignore|case", "shortPath": "Nm2|Cal|Nm3", "path": "(NameExpr2)^(MethodCallExpr)_(NameExpr3)", "name1NodeId": 23, "name2NodeId": 24, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "value", "name2": "true", "shortPath": "Nm2|Cal|If|Bk|Ret|BoolEx0", "path": "(NameExpr2)^(MethodCallExpr)^(IfStmt)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 23, "name2NodeId": 27, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "value", "name2": "false", "shortPath": "Nm2|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0", "path": "(NameExpr2)^(MethodCallExpr)^(IfStmt)^(BlockStmt)^(ForeachStmt)^(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 23, "name2NodeId": 29, "name1TokenNum": 1, "name2TokenNum": 0 }, { "name1": "equals|ignore|case", "name2": "true", "shortPath": "Nm3|Cal|If|Bk|Ret|BoolEx0", "path": "(NameExpr3)^(MethodCallExpr)^(IfStmt)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 24, "name2NodeId": 27, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "equals|ignore|case", "name2": "false", "shortPath": "Nm3|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0", "path": "(NameExpr3)^(MethodCallExpr)^(IfStmt)^(BlockStmt)^(ForeachStmt)^(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", 
"name1NodeId": 24, "name2NodeId": 29, "name1TokenNum": 0, "name2TokenNum": 0 }, { "name1": "true", "name2": "false", "shortPath": "BoolEx0|Ret|Bk|If|Bk|Foreach|Bk|Ret|BoolEx0", "path": "(BooleanLiteralExpr0)^(ReturnStmt)^(BlockStmt)^(IfStmt)^(BlockStmt)^(ForeachStmt)^(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)", "name1NodeId": 27, "name2NodeId": 29, "name1TokenNum": 0, "name2TokenNum": 0 } ], "ast": { "id": 0, "children": [ { "id": 1, "type": "MethodDeclaration", "children": [ { "id": 2, "name": "boolean", "type": "PrimitiveType" }, { "id": 3, "name": "f", "type": "NameExpr" }, { "id": 4, "type": "Parameter", "children": [ { "id": 6, "name": "Set", "type": "ClassOrInterfaceType", "children": [ { "id": 7, "name": "String", "type": "ClassOrInterfaceType" } ] }, { "id": 5, "name": "set", "type": "VariableDeclaratorId" } ] }, { "id": 8, "type": "Parameter", "children": [ { "id": 10, "name": "String", "type": "ClassOrInterfaceType" }, { "id": 9, "name": "value", "type": "VariableDeclaratorId" } ] }, { "id": 11, "type": "BlockStmt", "children": [ { "id": 12, "type": "ForeachStmt", "children": [ { "id": 13, "type": "VariableDeclarationExpr", "children": [ { "id": 14, "name": "String", "type": "ClassOrInterfaceType" }, { "id": 15, "type": "VariableDeclarator", "children": [ { "id": 16, "name": "entry", "type": "VariableDeclaratorId" } ] } ] }, { "id": 17, "name": "set", "type": "NameExpr" }, { "id": 18, "type": "BlockStmt", "children": [ { "id": 19, "type": "IfStmt", "children": [ { "id": 20, "type": "MethodCallExpr", "children": [ { "id": 21, "name": "entry", "type": "NameExpr" }, { "id": 24, "name": "equalsIgnoreCase", "type": "NameExpr" }, { "id": 23, "name": "value", "type": "NameExpr" } ] }, { "id": 25, "type": "BlockStmt", "children": [ { "id": 26, "type": "ReturnStmt", "children": [ { "id": 27, "name": "true", "type": "BooleanLiteralExpr" } ] } ] } ] } ] } ] }, { "id": 28, "type": "ReturnStmt", "children": [ { "id": 29, "name": "false", "type": 
"BooleanLiteralExpr" } ] } ] } ] } ] } } ] ```

2) I run JavaExtractor with the following params:

java -cp JavaExtractor/JPredict/target/JavaExtractor-0.0.2-SNAPSHOT.jar JavaExtractor.App --max_path_length 9 --max_path_width 2 --file Input.java

I get the following response:

Response of JavaExtractor ``` f boolean,Prim0|Mth|Nm1,METHOD_NAME boolean,Prim0|Mth|Prm|VDID0,set boolean,Prim0|Mth|Prm|Cls|Cls0,string METHOD_NAME,Nm1|Mth|Prm|VDID0,set METHOD_NAME,Nm1|Mth|Prm|Cls|Cls0,string METHOD_NAME,Nm1|Mth|Prm|VDID0,value METHOD_NAME,Nm1|Mth|Prm|Cls1,string set,VDID0|Prm|Cls|Cls0,string set,VDID0|Prm|Mth|Prm|VDID0,value set,VDID0|Prm|Mth|Prm|Cls1,string set,VDID0|Prm|Mth|Bk|Foreach|VDE|Cls0,string set,VDID0|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry set,VDID0|Prm|Mth|Bk|Foreach|Nm1,set set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case set,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true set,VDID0|Prm|Mth|Bk|Ret|BoolEx0,false string,Cls0|Cls|Prm|Mth|Prm|VDID0,value string,Cls0|Cls|Prm|Mth|Prm|Cls1,string string,Cls0|Cls|Prm|Mth|Bk|Foreach|VDE|Cls0,string string,Cls0|Cls|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry string,Cls0|Cls|Prm|Mth|Bk|Foreach|Nm1,set string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value string,Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case string,Cls0|Cls|Prm|Mth|Bk|Ret|BoolEx0,false value,VDID0|Prm|Cls1,string value,VDID0|Prm|Mth|Bk|Foreach|VDE|Cls0,string value,VDID0|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry value,VDID0|Prm|Mth|Bk|Foreach|Nm1,set value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case value,VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true value,VDID0|Prm|Mth|Bk|Ret|BoolEx0,false string,Cls1|Prm|Mth|Bk|Foreach|VDE|Cls0,string string,Cls1|Prm|Mth|Bk|Foreach|VDE|VD|VDID0,entry string,Cls1|Prm|Mth|Bk|Foreach|Nm1,set string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0,entry string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2,value string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case 
string,Cls1|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0,true string,Cls1|Prm|Mth|Bk|Ret|BoolEx0,false string,Cls0|VDE|VD|VDID0,entry string,Cls0|VDE|Foreach|Nm1,set string,Cls0|VDE|Foreach|Bk|If|Cal0|Nm0,entry string,Cls0|VDE|Foreach|Bk|If|Cal0|Nm2,value string,Cls0|VDE|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case string,Cls0|VDE|Foreach|Bk|If|Bk|Ret|BoolEx0,true string,Cls0|VDE|Foreach|Bk|Ret|BoolEx0,false entry,VDID0|VD|VDE|Foreach|Nm1,set entry,VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm0,entry entry,VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm2,value entry,VDID0|VD|VDE|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case entry,VDID0|VD|VDE|Foreach|Bk|If|Bk|Ret|BoolEx0,true entry,VDID0|VD|VDE|Foreach|Bk|Ret|BoolEx0,false set,Nm1|Foreach|Bk|If|Cal0|Nm0,entry set,Nm1|Foreach|Bk|If|Cal0|Nm2,value set,Nm1|Foreach|Bk|If|Cal0|Nm3,equals|ignore|case set,Nm1|Foreach|Bk|If|Bk|Ret|BoolEx0,true set,Nm1|Foreach|Bk|Ret|BoolEx0,false entry,Nm0|Cal|Nm2,value entry,Nm0|Cal|If|Bk|Ret|BoolEx0,true entry,Nm0|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0,false value,Nm2|Cal|Nm3,equals|ignore|case value,Nm2|Cal|If|Bk|Ret|BoolEx0,true value,Nm2|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0,false equals|ignore|case,Nm3|Cal|If|Bk|Ret|BoolEx0,true equals|ignore|case,Nm3|Cal|If|Bk|Foreach|Bk|Ret|BoolEx0,false true,BoolEx0|Ret|Bk|If|Bk|Foreach|Bk|Ret|BoolEx0,false\n' ```

3) The difference is shown below. These values are in the HTTP response but missing from the JavaExtractor result:

set GenericClass1|Prm|Mth|Bk|Foreach|VDE|VD|VDID0 entry
METHOD_NAME Nm1|Mth|Prm|GenericClass1 set
set GenericClass1|Prm|Mth|Bk|Foreach|VDE|Cls0 string
set VDID0|Prm|GenericClass|Cls0 string
string Cls0|GenericClass|Prm|Mth|Bk|Foreach|VDE|VD|VDID0 entry
string Cls0|GenericClass|Prm|Mth|Prm|Cls1 string
set GenericClass|Cls0 string
set GenericClass1|Prm|Mth|Bk|Foreach|Nm1 set
string Cls0|GenericClass|Prm|Mth|Bk|Foreach|Nm1 set
boolean Prim0|Mth|Prm|GenericClass|Cls0 string
METHOD_NAME Nm1|Mth|Prm|GenericClass|Cls0 string
set GenericClass1|Prm|Mth|Prm|VDID0 value
set GenericClass1|Prm|Mth|Bk|Ret|BoolEx0 false
set GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3 equals|ignore|case
boolean Prim0|Mth|Prm|GenericClass1 set
string Cls0|GenericClass|Prm|Mth|Prm|VDID0 value
set GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2 value
set GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0 entry
set VDID0|Prm|GenericClass1 set
string Cls0|GenericClass|Prm|Mth|Bk|Ret|BoolEx0 false
string Cls0|GenericClass|Prm|Mth|Bk|Foreach|VDE|Cls0 string
set GenericClass1|Prm|Mth|Prm|Cls1 string

As you can see, the paths containing GenericClass appear only in the HTTP response.

4) The inverted difference is shown below. These values are in the JavaExtractor result but missing from the HTTP response:

string Cls0|Cls|Prm|Mth|Prm|VDID0 value
string Cls0|Cls|Prm|Mth|Prm|Cls1 string
METHOD_NAME Nm1|Mth|Prm|Cls|Cls0 string
string Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2 value
set VDID0|Prm|Cls|Cls0 string
value VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0 true
boolean Prim0|Mth|Prm|Cls|Cls0 string
string Cls0|Cls|Prm|Mth|Bk|Foreach|VDE|Cls0 string
string Cls1|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0 true
string Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0 entry
set VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0 true
string Cls0|Cls|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3 equals|ignore|case
string Cls0|Cls|Prm|Mth|Bk|Ret|BoolEx0 false
string Cls0|Cls|Prm|Mth|Bk|Foreach|Nm1 set
string Cls0|Cls|Prm|Mth|Bk|Foreach|VDE|VD|VDID0 entry
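
A comparison like the one in steps 3 and 4 can be reproduced with a short script. This is only a minimal sketch: it assumes the JavaExtractor line is the method name followed by space-separated `name1,path,name2` contexts (as in the output above), and that the HTTP response has already been parsed into a list of objects with `name1`, `shortPath` and `name2` fields. The helper names and the two-entry sample data are hypothetical and were picked from the lists above for illustration.

```python
def parse_extractor_line(line):
    """Turn one JavaExtractor output line into a set of (name1, path, name2)."""
    tokens = line.strip().split(" ")
    # tokens[0] is the method name; the rest are "name1,path,name2" contexts.
    return {tuple(tok.split(",")) for tok in tokens[1:] if tok}


def parse_http_paths(path_objects):
    """Turn the path objects of the HTTP response into the same tuple form."""
    return {(p["name1"], p["shortPath"], p["name2"]) for p in path_objects}


# Tiny hand-made sample (assumption: real inputs are much longer).
extractor_line = "f set,VDID0|Prm|Cls|Cls0,string entry,Nm0|Cal|Nm2,value"
http_paths = [
    {"name1": "set", "shortPath": "VDID0|Prm|GenericClass|Cls0", "name2": "string"},
    {"name1": "entry", "shortPath": "Nm0|Cal|Nm2", "name2": "value"},
]

local = parse_extractor_line(extractor_line)
remote = parse_http_paths(http_paths)

only_remote = remote - local  # in the HTTP response, missing from JavaExtractor
only_local = local - remote   # in JavaExtractor, missing from the HTTP response
```

With this sample, `only_remote` holds the `GenericClass` variant and `only_local` holds the `Cls` variant of the same context.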

urialon commented 2 years ago

I see. Good catch!

I will try to debug this. In the meantime, you can use the JavaExtractor because this is the extractor that the model was trained with. The cases of "inverted difference" might be due to the differences in parameters: you used --max_path_length 9 instead of --max_path_length 8.
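
A rough way to check the path-length hypothesis is sketched below. It assumes that the length of a path is the number of `|` separators in the short path, which may not match exactly how JavaExtractor counts; the function names are hypothetical, and the two sample contexts are copied from the "inverted difference" list above.

```python
def path_length(short_path):
    """Assumed length measure: number of edges, i.e. '|' separators."""
    return short_path.count("|")


def exceeds_limit(contexts, max_path_length):
    """Return the (name1, path, name2) contexts longer than the limit."""
    return [c for c in contexts if path_length(c[1]) > max_path_length]


# Two entries taken from the "inverted difference" list.
contexts = [
    ("value", "VDID0|Prm|Mth|Bk|Foreach|Bk|If|Bk|Ret|BoolEx0", "true"),
    ("set", "VDID0|Prm|Cls|Cls0", "string"),
]

# Contexts that would be dropped with --max_path_length 8 under this assumption.
too_long = exceeds_limit(contexts, max_path_length=8)
```

Under this assumption the first context has length 9, so a limit of 8 would exclude it, while the second differs only in `Cls` versus `GenericClass` and is not explained by the length limit.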

Uri

AUZQ commented 2 years ago

Hi @lyriccoder, could you please tell me how you run JavaExtractor locally? I found that your code2seq repository still uses the API to extract code, but in this issue it seems that you have already run it locally. My goal is to extract code locally.

lyriccoder commented 2 years ago

java -cp JavaExtractor/JPredict/target/JavaExtractor-0.0.2-SNAPSHOT.jar JavaExtractor.App --max_path_length=8 --max_path_width=2 --file filename

There is a hardcoded filename inside extractor.py or interactive_predict.py. The script overwrites Input.java with the code and passes that path as the filename argument in the command above.
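
For reference, the same call can be wrapped in Python. This is a minimal sketch, not the repository's own code: `build_command` and `extract_paths` are hypothetical helpers, and the jar path, flags and `Input.java` name are simply the ones used in this thread. It assumes `java` is on the PATH and the jar has already been built.

```python
import subprocess
from pathlib import Path

# Jar location as used in this thread (assumption: run from the repository root).
JAR = "JavaExtractor/JPredict/target/JavaExtractor-0.0.2-SNAPSHOT.jar"


def build_command(java_file, max_path_length=8, max_path_width=2, jar=JAR):
    """Build the argument list for the JavaExtractor command line."""
    return [
        "java", "-cp", jar, "JavaExtractor.App",
        "--max_path_length", str(max_path_length),
        "--max_path_width", str(max_path_width),
        "--file", str(java_file),
    ]


def extract_paths(code, java_file="Input.java"):
    """Write the code to a file, run JavaExtractor on it, and return stdout."""
    Path(java_file).write_text(code, encoding="utf-8")
    result = subprocess.run(
        build_command(java_file),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output as str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Calling `extract_paths("public boolean f() { return true; }")` would then return the extractor output as a string, provided the jar exists at the path above.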

AUZQ commented 2 years ago

Thanks!!!! I made it!