microsoft / tf-gnn-samples

TensorFlow implementations of Graph Neural Networks
MIT License

Question on building the subtoken nodes #21

Open shangqing-liu opened 2 years ago

shangqing-liu commented 2 years ago

Hello

I have been reading the VarMisuse implementation in varmisuse_task.py, and I have a question about how the subtoken nodes are built. See the following line: https://github.com/microsoft/tf-gnn-samples/blob/73e2c950736ac7f662fa88c03c9c0c45fe29d65f/tasks/varmisuse_task.py#L51. It seems that the code is meant to skip AST nodes and punctuation based on the node label. However, when I check unsplittable_node_names, which is read from c_sharp.txt via dpu_utils.codeutils.get_language_keywords, the AST node names are not included in it. As a result, the constructed subtoken edges also involve AST nodes. As I understand it, AST nodes should not take part in building the subtoken edges. Is that correct?
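
To make the concern concrete, here is a minimal, self-contained sketch of the behaviour I mean. It is not the repository's code: `split_label`, `subtoken_users`, the node labels, and the tiny keyword set are all hypothetical stand-ins I made up for illustration, and the splitter is a simple regex rather than the one from dpu_utils. The sketch only assumes that labels found in the unsplittable set are skipped and that every other label is split into subtokens.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def split_label(label: str) -> List[str]:
    """Hypothetical stand-in splitter: camelCase/PascalCase label -> lowercase parts."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", label)
    return [p.lower() for p in parts]


def subtoken_users(node_labels: Dict[int, str], unsplittable: Set[str]) -> Dict[str, Set[int]]:
    """Map each subtoken to the ids of the nodes whose label contains it."""
    users = defaultdict(set)
    for node_id, label in node_labels.items():
        # Only labels listed in the unsplittable set are skipped here.
        if label in unsplittable:
            continue
        for subtoken in split_label(label):
            users[subtoken].add(node_id)
    return dict(users)


# Made-up graph: node 0 is an AST node, node 2 is a language keyword,
# node 4 is punctuation, nodes 1 and 3 are identifiers.
node_labels = {0: "MethodDeclaration", 1: "fooBar", 2: "return", 3: "barCount", 4: ";"}

# Assumption: the set contains language keywords only, no AST node names.
keywords_only = {"return"}
result = subtoken_users(node_labels, keywords_only)

# Adding the AST node name to the set keeps node 0 out of the subtoken map.
with_ast_names = subtoken_users(node_labels, keywords_only | {"MethodDeclaration"})
```

With the keyword-only set, the AST node label is split like an identifier, so node 0 ends up attached to the subtokens "method" and "declaration". Once the AST node name is added to the set, that no longer happens, which is the behaviour I expected.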

Thanks