oleges1 / code-completion

pytorch version of code completion with neural attention and pointer networks
MIT License

Segments within each program have no interactions #1

Open Asichurter opened 3 years ago

Asichurter commented 3 years ago

The original paper says that each program is divided into short segments (length 50) that are fed into the model one at a time, and that segments interact because the hidden and memory states from the previous segment are reused as the initial states of the next one. However, two points in this implementation do not fit this description:

  1. Segments from the same program are scattered across the data list, because the list is sorted by length during data loading (to reduce EOF padding, I guess?), so they may not even end up in the same batch during the forward pass. How can they interact with each other if they are not in the same batch?
  2. The initial values of hs and hc in MixturePointer are always set to their defaults (ones and None respectively). The previous states are never used as initial values, so there is no interaction between segments of the same program.

This is just my own question about this implementation. Thanks for any explanations or replies.

oleges1 commented 3 years ago

Hi! Thank you for raising this issue! I didn't find this in the original code or in the paper when I worked on this repo, but the authors do mention it: "We divide each program into segments consisting of 50 consecutive AST nodes, with the last segment being padded with EOF if it is not full. The LSTM hidden state and memory state are initialized with h0, c0, which are two trainable vectors. The last hidden and memory states from the previous LSTM segment are fed into the next one as initial states if both segments belong to the same program. Otherwise, the hidden and memory states are reset to h0, c0."
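A minimal sketch of the data-preparation side of that description, in plain Python. It is not code from this repo or from the paper's authors. The names `split_into_segments` and `lane_batches`, the `"EOF"` token string, and the idea of keeping each program in a fixed batch row ("lane") are all assumptions made for illustration. The point is that row i of consecutive batches keeps following the same program, and a flag marks where a new program starts, instead of sorting segments by length.

```python
SEG_LEN = 50   # segment length quoted from the paper
EOF = "EOF"    # hypothetical padding token


def split_into_segments(nodes, seg_len=SEG_LEN, pad=EOF):
    """Cut one program into fixed-length segments, padding only the last one."""
    segments = []
    for start in range(0, len(nodes), seg_len):
        seg = list(nodes[start:start + seg_len])
        seg += [pad] * (seg_len - len(seg))
        segments.append(seg)
    return segments


def lane_batches(programs, batch_size, seg_len=SEG_LEN, pad=EOF):
    """Yield (batch, is_new) pairs.

    Row i of consecutive batches continues the same program until that
    program runs out of segments. is_new[i] is True when row i starts a
    new program (or is idle padding), i.e. when its state should be reset.
    """
    queue = [split_into_segments(p, seg_len, pad) for p in programs if len(p) > 0]
    queue.reverse()                          # so pop() takes programs in order
    lanes = [[] for _ in range(batch_size)]  # remaining segments per batch row
    while True:
        batch, is_new, any_real = [], [], False
        for i in range(batch_size):
            started = False
            if not lanes[i] and queue:
                lanes[i] = queue.pop()       # assign the next program to this row
                started = True
            if lanes[i]:
                batch.append(lanes[i].pop(0))
                any_real = True
            else:
                batch.append([pad] * seg_len)  # idle row: padding only
                started = True
            is_new.append(started)
        if not any_real:
            return
        yield batch, is_new
```

With this ordering, padding is higher than with length sorting, so it is a trade-off rather than a free improvement.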

In my implementation, h0 and c0 are always set to default values (ones, as far as I remember).

I'm not sure whether this would improve performance, but you can try to fix it. You would need to pay attention to both the data preparation and the training code.
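On the training side, a rough PyTorch sketch of the state handling could look like the following. `SegmentLSTM` is a hypothetical stand-in, not the repo's MixturePointer, and it assumes the data loader keeps each program in the same batch row across consecutive batches and supplies a boolean `is_new` flag per row. Detaching the carried states (truncated backpropagation) and initializing h0, c0 to zeros are also assumptions; the quoted passage only says that h0 and c0 are trainable.

```python
import torch
import torch.nn as nn


class SegmentLSTM(nn.Module):
    """Hypothetical LSTM wrapper that carries states across segments."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        # Trainable initial states h0, c0 (zero init is an assumption).
        self.h0 = nn.Parameter(torch.zeros(1, 1, hidden_size))
        self.c0 = nn.Parameter(torch.zeros(1, 1, hidden_size))

    def forward(self, x, prev_state=None, is_new=None):
        batch = x.size(0)
        h0 = self.h0.expand(1, batch, -1)
        c0 = self.c0.expand(1, batch, -1)
        if prev_state is None:
            h, c = h0, c0                      # first batch: every row starts fresh
        else:
            # Assumption: cut the gradient at segment boundaries.
            h_prev, c_prev = (s.detach() for s in prev_state)
            reset = is_new.view(1, batch, 1)   # bool mask, True = new program
            h = torch.where(reset, h0, h_prev)
            c = torch.where(reset, c0, c_prev)
        out, state = self.lstm(x, (h.contiguous(), c.contiguous()))
        return out, state
```

In a training loop, the `state` returned for one batch would be passed as `prev_state` for the next batch, together with that batch's `is_new` tensor.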

It would be great if you could make a pull request with a fix.