darkestfloyd / PNet

Data #3

Closed darkestfloyd closed 6 years ago

darkestfloyd commented 6 years ago

Look for potential data to be used for training.

darkestfloyd commented 6 years ago

Paper: Learning to Generate Pseudo-code from Source Code using Statistical Machine Translation. The paper describes an SMT-based technique for generating pseudo-code from source code. The reported results are not very promising, but we could potentially use the technique to generate more training data.

darkestfloyd commented 6 years ago

Paper: A parallel corpus of Python functions and documentation strings for automated code documentation and code generation. Like the previous paper, it describes ways to generate docstrings from source code.

varun-sundar-rabindranath commented 6 years ago

https://github.com/EdinburghNLP/code-docstring-corpus

Open-source dataset of Python code paired with comments (docstrings).
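
If we go with a parallel corpus like this, loading it for training could look roughly like the sketch below. It assumes the data comes as two line-aligned text files, one function per line in the first and the matching docstring on the same line of the second. That layout is an assumption and should be checked against the repository; `read_parallel_pairs` and the file names in the usage note are hypothetical.

```python
from itertools import zip_longest
from pathlib import Path
from typing import Iterator, Tuple


def read_parallel_pairs(code_path: str, doc_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (code, docstring) pairs from two line-aligned text files.

    Assumption: line i of code_path corresponds to line i of doc_path.
    """
    with Path(code_path).open(encoding="utf-8") as code_file, \
         Path(doc_path).open(encoding="utf-8") as doc_file:
        pairs = zip_longest(code_file, doc_file)
        for line_no, (code, doc) in enumerate(pairs, start=1):
            # zip_longest pads the shorter file with None, which exposes misalignment.
            if code is None or doc is None:
                raise ValueError(f"files differ in length at line {line_no}")
            code, doc = code.rstrip("\n"), doc.rstrip("\n")
            # Skip pairs where either side is empty; they are useless for training.
            if code and doc:
                yield code, doc
```

Usage would be something like `pairs = list(read_parallel_pairs("train.code", "train.doc"))`, with both paths replaced by whatever the dataset actually ships.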