Closed shreyasingh closed 4 years ago
Hi @shreyasingh, thank you for your interest in code2vec!
You can git clone as much data as you wish from public repositories.
Best, Uri
Hi @shreyasingh, I'm closing this due to inactivity. Feel free to reopen it, or open another issue, if you have any further questions.
Good luck with your internship! Uri
Hi, I'm working on a Neural Code Search prototype and am currently doing a literature survey on code embeddings. I came across your code2vec paper, and I must say it is very well written. I enjoyed reading it!
I was going through the GitHub code for code2vec (https://github.com/tech-srl/code2vec) and would like to train the model on C# files. Could you let me know the following:
1. Is there an existing dataset of C# source files, like the one you've published for Java on GitHub? The dataset could be unprocessed, and I could process it with preprocess_csharp.sh.
2. I also saw that you've published the token and method name embeddings for download. Is there a similar token and method name embedding file for C# that I could load and use directly? That way I would not need to train the model.
Your advice and guidance on the above two items would be highly appreciated.