tech-srl / code2vec

TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
https://code2vec.org
MIT License
1.11k stars, 286 forks

Question about JavaExtractor #126

Open DJjjjhao opened 3 years ago

DJjjjhao commented 3 years ago

I ran JavaExtractor to get the AST paths and got output like this: "assign|message|group boolean,600457869,METHOD_NAME". I'm confused by the output. Is it split by spaces, and what does the number "600457869" mean? Thank you very much!

urialon commented 3 years ago

Hi @DJjjjhao, sorry for the late response.

The output is split by spaces, where:

  1. The first "word" is the target label (assign|message|group).
  2. Every following "word" is a comma-separated 3-tuple: a token (boolean), the hash of the AST path connecting the two tokens (600457869), and a second token (METHOD_NAME).
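For illustration, here is a minimal Python sketch of how one such output line could be parsed. It is not part of the code2vec repository; the names `PathContext` and `parse_extractor_line` are hypothetical, and it assumes the label's sub-parts are separated by "|" as in the example above. The path hash is kept as a string rather than converted to a number.

```python
from typing import List, NamedTuple, Tuple


class PathContext(NamedTuple):
    """One 3-tuple from the extractor output (hypothetical helper type)."""
    first_token: str
    path_hash: str   # kept as a string; no assumption about its numeric range
    second_token: str


def parse_extractor_line(line: str) -> Tuple[List[str], List[PathContext]]:
    """Split one output line into its label parts and its 3-tuples."""
    fields = line.split()            # fields are separated by whitespace
    if not fields:
        raise ValueError("empty line")
    label_parts = fields[0].split("|")   # e.g. assign|message|group
    contexts = []
    for field in fields[1:]:
        # Each remaining field is "token,path_hash,token"
        first, path_hash, second = field.split(",")
        contexts.append(PathContext(first, path_hash, second))
    return label_parts, contexts


label, contexts = parse_extractor_line(
    "assign|message|group boolean,600457869,METHOD_NAME"
)
print(label)      # ['assign', 'message', 'group']
print(contexts)
```

A real extractor line usually carries many such 3-tuples after the label; the loop handles any number of them.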

See additional details here: https://github.com/tech-srl/code2vec#extending-to-other-languages

Best, Uri