Open DanielEggelmann opened 5 years ago
Please use an approach or a combination of approaches that we have read about.
Approach: Tokens & Word2Vec Another approach would be to use tokens in combination with word2vec. Every node of the AST would be taken as token and converted to an integer serving as index for the embedding layer. To flatten the tree it would be traversed preorder. Each type of token should be assigned an integer. Variables can dealt with in two ways, a preset set of variables could be used, with a hard coded index for each. Another implementation could be reserving a certain range of indices for variables and then assigning the indices dynmically to variables in the order they appear. These indices are then fed to an embedding layer which converts the indices to vectors, which can be processed by the neural network. Embedding layers are already part of Keras, therefore no additional software components are needed.
Develop a concept to encode AST for the use in neural networks.