tech-srl / code2vec

TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
https://code2vec.org
MIT License
1.1k stars 286 forks

The results of local run and website run are inconsistent #95

Closed iamchenxin-coder closed 3 years ago

iamchenxin-coder commented 4 years ago

Hi, sir. I ran your code2vec project to do prediction, but the output and the results on your website are inconsistent. Could you please tell me what I am doing wrong and what I should do? Website (https://code2vec.org/) result: [image]

My local run steps and results: **0 model:** java14m_model.tar.gz

**1 Input.java:** `int f(int n) { if (n == 0) { return 1; } else { return n * f(n-1); } }`

**2 result:**

Modify the file: "Input.java" and press any key when ready, or "q" / "quit" / "exit" to exit

Original name: f
(0.472304) predicted: ['m']
(0.174315) predicted: ['fact']
(0.101178) predicted: ['factorial']
(0.095780) predicted: ['faculty']
(0.067391) predicted: ['get', 'n']
(0.038753) predicted: ['get', 'result', 'score']
(0.013598) predicted: ['is', 'power', 'of']
(0.013113) predicted: ['digit', 'count']
(0.012808) predicted: ['comb']
(0.010760) predicted: ['get', 'in']

Attention:
0.258571 context: n,(NameExpr0)^(BinaryExpr:times)(MethodCallExpr1)(BinaryExpr:minus)(NameExpr0),n
0.159237 context: 1,(IntegerLiteralExpr1)^(BinaryExpr:minus1)^(MethodCallExpr)(NameExpr2),f
0.102514 context: int,(PrimitiveType0)^(MethodDeclaration)_(NameExpr1),METHODNAME
0.082340 context: 0,(IntegerLiteralExpr1)^(BinaryExpr:equals)^(IfStmt)(BlockStmt)(ReturnStmt)(IntegerLiteralExpr0),1
0.060176 context: METHODNAME,(NameExpr1)^(MethodDeclaration)(BlockStmt)(IfStmt)(BlockStmt)(ReturnStmt)(BinaryExpr:times)(NameExpr0),n
0.059877 context: int,(PrimitiveType1)^(Parameter)^(MethodDeclaration)(BlockStmt)(IfStmt)(BlockStmt)(ReturnStmt)(IntegerLiteralExpr0),1
0.042954 context: int,(PrimitiveType0)^(MethodDeclaration)(Parameter)(VariableDeclaratorId0),n
0.041031 context: n,(VariableDeclaratorId0)^(Parameter)^(MethodDeclaration)(BlockStmt)(IfStmt)(BlockStmt)(ReturnStmt)(BinaryExpr:times)(NameExpr0),n
0.033209 context: int,(PrimitiveType0)^(MethodDeclaration)(Parameter)(PrimitiveType1),int
0.029109 context: n,(VariableDeclaratorId0)^(Parameter)_(PrimitiveType1),int

urialon commented 4 years ago

Hi @iamchenxin-coder, thank you for your interest in code2vec (and code2seq :-))!

The model at code2vec.org is not identical to the model in this repository. It was trained using almost the same code and on the same data; the differences arise only because these are different training runs and because we have fixed some issues since the website launch.

The reason is that we first used our original model on the code2vec.org website, so that the examples in the paper and the examples on the website would produce the same results. Later, when we open-sourced the code, we refactored it to make it easier to read and modify, and fixed a few minor bugs. As a result, the original model could no longer be loaded with the new code, so we had to train the model again to make it available.

But these two models are almost the same. The model that you can download from this repository is actually better than the one on code2vec.org: if you run it on the test set, you'll get an F1 score about 1 point higher than the scores that we originally reported in the paper.
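For readers unsure what an F1 comparison like the one above measures, here is a minimal sketch. It assumes F1 is computed over method-name subtokens, case-insensitively and micro-averaged across methods. The function name `subtoken_f1` and the example pairs are hypothetical and are not taken from this repository's evaluation code.

```python
from collections import Counter


def subtoken_f1(pairs):
    """Return (precision, recall, f1) over (predicted, actual) subtoken lists.

    Hypothetical helper: counts are pooled across all methods (micro-average).
    """
    true_pos = false_pos = false_neg = 0
    for predicted, actual in pairs:
        pred = Counter(token.lower() for token in predicted)
        gold = Counter(token.lower() for token in actual)
        # Multiset intersection: subtokens that appear in both names.
        overlap = sum((pred & gold).values())
        true_pos += overlap
        false_pos += sum(pred.values()) - overlap
        false_neg += sum(gold.values()) - overlap

    predicted_total = true_pos + false_pos
    actual_total = true_pos + false_neg
    precision = true_pos / predicted_total if predicted_total else 0.0
    recall = true_pos / actual_total if actual_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example pairs: (predicted subtokens, actual subtokens).
example_pairs = [
    (["get", "n"], ["get", "count"]),
    (["factorial"], ["factorial"]),
]
precision, recall, f1 = subtoken_f1(example_pairs)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under this scheme a prediction gets partial credit for each matching subtoken, so two separately trained models can disagree on the top prediction for a single method and still score within about a point of each other on the whole test set.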

Best, Uri

urialon commented 3 years ago

Closing due to inactivity, but feel free to re-open if you have additional questions.