Closed Avv22 closed 2 years ago
Hi @Avra2,
What exactly do you mean by "extract path context"? Do you want the paths? (They are in the raw dataset.) The representations of these paths? (Do you really want ~200 vectors for every example?) The aggregation of these 200 vectors? (This is the "code vector".)
Uri
On Sun, Dec 5, 2021 at 3:44 PM Avra @.***> wrote:
Hello,
Given that both of your models, code2seq and code2vec, were originally built to predict a method name from a method body represented as path contexts, can you please explain how to extract the path context? We are just looking for a representation of the source code.
Thank you.
It seems @Avra2 wants to get the paths.
Try compiling the Java application inside the project, or just find the compiled jar file.
You can run it with
java -cp JavaExtractor/JPredict/target/JavaExtractor-0.0.2-SNAPSHOT.jar JavaExtractor.App --max_path_length=8 --max_path_width=2 --file filename
where filename is a file containing a Java function.
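For scripting this step, the same command can be assembled from Python. This is a minimal sketch and not part of the code2seq repository: the helper names `build_extractor_command` and `extract_paths` are hypothetical, and only the jar path and flags quoted above come from this thread. It assumes `java` is on the PATH and the jar has already been built.

```python
import subprocess

# Jar path and flags are copied from the command quoted in this thread.
JAR = "JavaExtractor/JPredict/target/JavaExtractor-0.0.2-SNAPSHOT.jar"


def build_extractor_command(filename, max_path_length=8, max_path_width=2, jar=JAR):
    """Return the argument list for the extractor (hypothetical helper)."""
    return [
        "java", "-cp", jar, "JavaExtractor.App",
        f"--max_path_length={max_path_length}",
        f"--max_path_width={max_path_width}",
        "--file", filename,
    ]


def extract_paths(filename):
    """Run the extractor and return its standard output as text.

    Assumes `java` is on PATH and the jar exists at the path above.
    """
    result = subprocess.run(
        build_extractor_command(filename),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the file name contains spaces.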
Suppose you have this code:
public boolean f(Set<String> set, String value)
{
    for (String entry : set)
    {
        if (entry.equalsIgnoreCase(value))
        {
            return true;
        }
    }
    return false;
}
So, the code will be translated into the following list of paths:
set GenericClass1|Prm|Mth|Bk|Foreach|VDE|VD|VDID0 entry
METHOD_NAME Nm1|Mth|Prm|GenericClass1 set
set GenericClass1|Prm|Mth|Bk|Foreach|VDE|Cls0 string
set VDID0|Prm|GenericClass|Cls0 string
string Cls0|GenericClass|Prm|Mth|Bk|Foreach|VDE|VD|VDID0 entry
string Cls0|GenericClass|Prm|Mth|Prm|Cls1 string
set GenericClass|Cls0 string
set GenericClass1|Prm|Mth|Bk|Foreach|Nm1 set
string Cls0|GenericClass|Prm|Mth|Bk|Foreach|Nm1 set
boolean Prim0|Mth|Prm|GenericClass|Cls0 string
METHOD_NAME Nm1|Mth|Prm|GenericClass|Cls0 string
set GenericClass1|Prm|Mth|Prm|VDID0 value
set GenericClass1|Prm|Mth|Bk|Ret|BoolEx0 false
set GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3 equals|ignore|case
boolean Prim0|Mth|Prm|GenericClass1 set
string Cls0|GenericClass|Prm|Mth|Prm|VDID0 value
set GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm2 value
set GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm0 entry
set VDID0|Prm|GenericClass1 set
string Cls0|GenericClass|Prm|Mth|Bk|Ret|BoolEx0 false
string Cls0|GenericClass|Prm|Mth|Bk|Foreach|VDE|Cls0 string
set GenericClass1|Prm|Mth|Prm|Cls1 string
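Each line above reads as a start token, a path of AST node labels joined by `|`, and an end token, with multi-word tokens also split by `|` (for example `equals|ignore|case`). A minimal Python sketch for splitting such lines follows. It assumes the space-separated layout exactly as printed in this comment; the extractor's raw output may be delimited differently, so the split may need adjusting. `PathContext` and `parse_path_context` are names chosen for illustration only.

```python
from typing import List, NamedTuple


class PathContext(NamedTuple):
    source: List[str]   # subtokens of the start terminal
    path: List[str]     # AST node labels between the two terminals
    target: List[str]   # subtokens of the end terminal


def parse_path_context(line: str) -> PathContext:
    """Split one 'source path target' line into its three parts."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    source, path, target = parts
    # Subtokens and node labels are both joined with '|' in the listing.
    return PathContext(source.split("|"), path.split("|"), target.split("|"))


example = "set GenericClass1|Prm|Mth|Bk|Foreach|Bk|If|Cal0|Nm3 equals|ignore|case"
ctx = parse_path_context(example)
print(ctx.source, len(ctx.path), ctx.target)
```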
Did you want those lists of paths (what you called "path context")?
@lyriccoder @urialon. Thanks for the response. By path context, I don't mean the paths extracted by the parser, but the aggregated path representation that your model learns with the help of the attention mechanism, since this should be the part that contributes most to the method name. Can we use this vector as a representation (embedding) of the whole file, or is it only useful for your task, method name prediction? If this vector can be used for various tasks, can you please show how to extract it from your network during training?
Thanks.
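To make the question concrete: attention-based aggregation scores each path-context vector, normalizes the scores with a softmax, and takes the weighted sum as a single vector. The sketch below illustrates only that general idea in plain Python. It is not the code2seq or code2vec implementation; the function names, the toy vectors, and the fixed `query` vector (which a real model would learn) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors: Sequence[Sequence[float]], query: Sequence[float]) -> List[float]:
    """Weighted sum of `vectors`, weighted by softmax(dot(vector, query))."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


# Toy data: three 2-dimensional "path-context vectors" and a query vector.
path_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]  # all scores are equal, so the result is the plain mean
code_vector = attention_pool(path_vectors, query)
```

With a non-zero query, vectors that align with it receive larger weights and dominate the pooled result, which is the sense in which attention picks out the contexts that "contribute most".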
Did you try this? https://github.com/tech-srl/code2seq#step-4-manual-examination-of-a-trained-model
@urialon. Thank you.