tech-srl / code2seq

Code for the model presented in the paper: "code2seq: Generating Sequences from Structured Representations of Code"
http://code2seq.org
MIT License

how to extract context vector? #96

Closed: konL closed this issue 3 years ago

konL commented 3 years ago

Hi! @urialon Really impressed by this wonderful work!

In my work, I hope to extract a vector that represents the semantic meaning of a code fragment, so that I can calculate the similarity between fragments. The Z vector mentioned in the paper is a good representation. For a method body, the vector I want to extract should be the average of all z_i.
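A minimal sketch of that average in NumPy, for illustration only. The array shapes, the padding mask, and the names masked_average, contexts, and valid_mask are assumptions made for this example; they are not taken from the code2seq source.

```python
import numpy as np

def masked_average(contexts, valid_mask):
    """Average the context vectors z_i of each example, ignoring padded slots.

    contexts:   float array of shape (batch, max_contexts, dim)
    valid_mask: array of shape (batch, max_contexts); 1 = real context, 0 = padding
    """
    mask = valid_mask.astype(contexts.dtype)[..., np.newaxis]  # (batch, max_contexts, 1)
    summed = (contexts * mask).sum(axis=1)                     # (batch, dim)
    counts = np.maximum(mask.sum(axis=1), 1.0)                 # (batch, 1), avoids dividing by zero
    return summed / counts

# Toy input (hypothetical values): 2 methods, up to 3 contexts each, 2-dimensional vectors.
contexts = np.array([[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],   # third slot is padding
                     [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]]])
valid_mask = np.array([[1, 1, 0],
                       [1, 1, 1]])

code_vectors = masked_average(contexts, valid_mask)
print(code_vectors)  # one averaged vector per method
```

The mask matters because methods have different numbers of contexts; a plain mean over the padded axis would pull the short method's vector toward the padding values.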

As mentioned in #81, contexts_average is the vector I want. Could you be more specific about how to extract it?

Thanks again!

urialon commented 3 years ago

Hi @konL, thank you for your interest in code2seq and for your kind words!

Yes, this https://github.com/tech-srl/code2seq/blob/master/model.py#L414 is the vector that you need. You can save this variable as a field (self.contexts_average = contexts_average) and then evaluate it to a numpy array here: https://github.com/tech-srl/code2seq/blob/master/model.py#L96 by including it in the arguments of sess.run. For example:

_, batch_loss, contexts_average = self.sess.run([optimizer, train_loss, self.contexts_average])

Or add it to this https://github.com/tech-srl/code2seq/blob/master/model.py#L186 sess.run call if you are in "evaluate" (test) mode, or to this https://github.com/tech-srl/code2seq/blob/master/model.py#L614 sess.run call if you are in manual prediction mode.

After calling sess.run with self.contexts_average as an argument, you'll get a contexts_average variable that is a numpy array of shape (batch_size, dim * 2 + rnn_size) (I think).
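Once the per-batch arrays are available, the similarity computation asked about in the original question could look roughly like the sketch below. It assumes only that each sess.run call returned a 2-D numpy array with one row per method; the toy arrays stand in for real model output, and cosine_similarity_matrix is a hypothetical helper, not part of code2seq.

```python
import numpy as np

def cosine_similarity_matrix(vectors, eps=1e-12):
    """Return the pairwise cosine similarities between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)  # (n, 1)
    unit = vectors / np.maximum(norms, eps)                 # guard against zero-length rows
    return unit @ unit.T                                    # (n, n)

# Stand-ins for the arrays returned by successive sess.run calls (hypothetical values).
batches = [
    np.array([[1.0, 0.0], [0.0, 2.0]]),
    np.array([[3.0, 3.0]]),
]

# Stack all batches into one (num_methods, vector_size) matrix.
all_vectors = np.concatenate(batches, axis=0)

similarities = cosine_similarity_matrix(all_vectors)
print(similarities.shape)  # (num_methods, num_methods)
```

Cosine similarity is used here only as a common choice for comparing embedding vectors; the thread does not say which similarity measure was eventually used.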

Best, Uri

konL commented 3 years ago

Yes, it works! Thank you for your kind help to a new student🙏! Now I will continue my work and close the issue. Thanks again!

urialon commented 3 years ago

Great, I'm glad to hear it :-)

brcsnt commented 2 years ago

Hello,

Thanks for the Code2Seq work you shared.

I'm also trying to find the code vector for a given piece of Java code. I tried to do as @urialon described, but I could not get a complete result. @konL, would it be possible for you to help me?

Thanks again,