Open Avv22 opened 1 year ago
Hey @Avv22, thank you for your interest in our work!
Our repository supports only Java and C#. We have a newer model, PolyCoder, that supports all languages. Loading it takes only a few lines of code with the Hugging Face Transformers library. See:
https://arxiv.org/pdf/2202.13169.pdf https://github.com/VHellendoorn/Code-LMs#october-2022---polycoder-is-available-on-huggingface
Best, Uri
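A minimal sketch of the "few lines of code" mentioned above, assuming the `transformers` and `torch` packages are installed. The checkpoint name `NinedayWang/PolyCoder-2.7B` is an assumption based on the linked README and should be checked there; `load_polycoder` and `complete` are hypothetical helper names, not part of any library.

```python
def load_polycoder(checkpoint: str = "NinedayWang/PolyCoder-2.7B"):
    """Return (tokenizer, model); the checkpoint name is an assumption."""
    # Imported here so that defining the helpers does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    return tokenizer, model


def complete(prompt: str, tokenizer, model, max_new_tokens: int = 32) -> str:
    """Continue a code prompt and return the decoded text."""
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage would look like `tokenizer, model = load_polycoder()` followed by `complete("def binary_search(arr, target):", tokenizer, model)`. The first call downloads the weights, so it needs network access and several gigabytes of disk space.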
Thank you. I mean, if we have a large software system written in Java, what would be your strategy for decomposing it and giving it to your tool?
Sorry, I don't understand your question. What is your goal? What are you trying to do?
Thank you for your quick reply. I just meant: if we have a complete system, how can we decompose it and pass it to your model so that it predicts the names of the blocks inside the system? Do you suggest decomposing the system method-wise and then trying to predict a name for each method?
I was trying to use your tool to generate a script (a name that tells what the software does).
Do you suggest decomposing the system method-wise and then trying to predict a name for each method?
Yes, this is basically what our preprocessing pipeline does automatically.
Thank you very much. You split the code by method, but can you please point to where you do that in your code? Did you implement it yourself, or did you use a tool for that?
First, our code goes through all files in the directory: https://github.com/tech-srl/code2vec/blob/master/JavaExtractor/JPredict/src/main/java/JavaExtractor/App.java#L43-L47
Then, I used JavaParser to parse each file in the project, traverse the resulting AST, and extract the "method nodes": https://github.com/tech-srl/code2vec/blob/master/JavaExtractor/JPredict/src/main/java/JavaExtractor/FeatureExtractor.java#L39-L49
But that's a very Java-specific pipeline; I wouldn't use the same code for JavaScript.
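The two steps described above (walk every file in the directory, then parse each file and collect its method nodes) can be sketched for Python sources with the standard-library `ast` module. This is only an analogy to the Java pipeline, not the code2vec or code2seq extractor; the function names are hypothetical, and it assumes the files are UTF-8 and parse under the running Python version.

```python
import ast
from pathlib import Path
from typing import Iterator, Tuple


def iter_source_files(root: str, suffix: str = ".py") -> Iterator[Path]:
    """Step 1: yield every source file under the project directory."""
    yield from sorted(Path(root).rglob(f"*{suffix}"))


def extract_methods(source: str) -> Iterator[Tuple[str, str]]:
    """Step 2: parse one file and yield (name, source text) per function node."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            yield node.name, ast.get_source_segment(source, node)


def extract_project(root: str) -> Iterator[Tuple[Path, str, str]]:
    """Yield (file, method name, method source) for the whole project."""
    for path in iter_source_files(root):
        try:
            source = path.read_text(encoding="utf-8")
            for name, body in extract_methods(source):
                yield path, name, body
        except (SyntaxError, UnicodeDecodeError):
            # Skip files that cannot be read or parsed instead of aborting.
            continue
```

Each yielded method can then be handed to a name-prediction model on its own. Note that `ast.walk` does not guarantee source order, so sort the results if order matters.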
@urialon, thank you, I appreciate it. Would you recommend a similar parser for Python to extract the method nodes, if possible?
Yes, our newer project, code2seq, has a Python extractor, and the model itself is also much better.
Best, Uri
Thanks. Does the Python extractor you developed work similarly to JavaParser, by extracting method nodes from the AST of the Python source code?
It was contributed by the community, so it might be a little different. I think that, by default, it was designed to process a specific dataset. However, its logic is the same and its output fits the code2seq model.
Best, Uri
Hello Code2Vec team,
Could you please give some hints on how to apply your tool to a whole software system whose code is written in different programming languages?
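One possible first step for a mixed-language project is to group the source files by language, so that each group can go to an extractor that supports it. The sketch below is only an illustration under that assumption: the extension table and the function name are hypothetical, and which extractor handles each group is left to the reader.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Hypothetical mapping from file extension to language; extend as needed.
EXTENSION_TO_LANGUAGE = {".java": "java", ".cs": "csharp", ".py": "python"}


def group_files_by_language(root: str) -> Dict[str, List[Path]]:
    """Walk the project and bucket recognised source files by language."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        language = EXTENSION_TO_LANGUAGE.get(path.suffix.lower())
        if path.is_file() and language is not None:
            groups[language].append(path)
    return dict(groups)
```

Files with unrecognised extensions are ignored, which keeps build artifacts and documentation out of the extraction step.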