Closed: Avv22 closed this issue 3 years ago
Hi Avra, Thank you for your interest in this work! Sorry for the delayed response.
Have you seen this section of the README? https://github.com/tech-srl/code2vec#exporting-the-code-vectors-for-the-given-code-examples
@urialon Thank you. So this has to be done either after I train the model myself, or with an already-trained model that I only use for prediction?
Correct, the model needs to be trained. Otherwise, the vectors are meaningless.
Thank you. We have 20k Java source files stored in .java format, and we would like to produce one embedding per file for all 20k files, keeping the files in order (the order is important). Could you please give us directions on how to do that with your trained model? I guess you have already published a trained model for Java, so there is no need to train the model again?
Right, you do not need to retrain the model.
Have you seen this section of the README? https://github.com/tech-srl/code2vec#exporting-the-code-vectors-for-the-given-code-examples
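As a rough illustration of the "one embedding per file, in order" request: code2vec represents individual methods, so a file with several methods yields several vectors. The sketch below is not part of code2vec. It assumes the export step described in the README section linked above produces a text file with one space-separated vector per line (please verify this against the README), and it assumes you keep your own list saying which source file each exported line came from. Averaging the method vectors of a file is just one simple aggregation choice, not the author's method. The names parse_vector_lines and mean_vectors_per_file are hypothetical.

```python
from typing import Dict, Iterable, List


def parse_vector_lines(lines: Iterable[str]) -> List[List[float]]:
    """Parse one space-separated vector per non-empty line (assumed export format)."""
    return [[float(x) for x in line.split()] for line in lines if line.strip()]


def mean_vectors_per_file(
    vectors: List[List[float]], owners: List[str]
) -> Dict[str, List[float]]:
    """Average the method vectors that belong to the same source file.

    owners[i] is the file that produced vectors[i]; this mapping is something
    you have to record yourself (an assumption of this sketch).
    The result keeps files in first-seen order (dict order, Python 3.7+).
    """
    if len(vectors) != len(owners):
        raise ValueError("need exactly one owner per vector")
    grouped: Dict[str, List[List[float]]] = {}
    for vec, owner in zip(vectors, owners):
        grouped.setdefault(owner, []).append(vec)
    return {
        owner: [sum(column) / len(vecs) for column in zip(*vecs)]
        for owner, vecs in grouped.items()
    }


# Toy data: three method vectors, two of them from the same file.
example_lines = ["1.0 2.0", "3.0 4.0", "5.0 6.0"]
example_owners = ["A.java", "A.java", "B.java"]
per_file = mean_vectors_per_file(parse_vector_lines(example_lines), example_owners)
```

With real data, the lines would come from the exported vectors file and the owners list from whatever bookkeeping you did while extracting paths, so that the resulting rows follow your original file order.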
Thank you. I ran train.sh with two datasets, train.c2s and test.c2s, but I got the following error during training:
FileNotFoundError: [Errno 2] No such file or directory: 'data/name.dict.c2v'
This file is created during preprocessing. Did you run preprocessing?
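A quick way to confirm whether preprocessing produced its outputs before starting training is to check for the files explicitly. This is a minimal sketch, not part of the repository: the only path used is the one from the error message above, and the helper name missing_files is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def missing_files(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


# Path taken from the FileNotFoundError above; extend the list with any
# other files your preprocessing step is expected to write.
required = ["data/name.dict.c2v"]

missing = missing_files(required)
if missing:
    print("Preprocessing outputs not found:", ", ".join(missing))
else:
    print("All expected preprocessing outputs are present.")
```

If the file is reported missing, the preprocessing step either was not run or wrote its output under a different name or directory than the one the training script expects.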
Hello Doctor,
We did not get that file from the Python preprocessing via the astminer tool. We opened an issue here, and hopefully you can help us with that. On Java, we ran the trained model, but as you told us before, we have to use the astminer tool to extract AST paths. We did that, but the tool does not produce dict.c2v. Please have a look at our issue above.
OK, so I'm closing this issue and will answer at https://github.com/tech-srl/code2vec/issues/137
Hello,
Thanks for this work. I am trying to get embeddings for 100 Java source code files, similar to how we use word2vec to get embeddings for documents. I would like each source file to be represented by, let us say, a 100-dimensional embedding vector, so for 100 source files we would have a 100x100 embedding matrix. How can I do that with your model?
Thanks.