Closed bubundas17 closed 2 years ago
Hello. Thanks for your interest in our work. flax-community/gpt-neo-1.3B-apps-all
is not the model you are looking for. It is a version of GPT-Neo fine-tuned on the APPS dataset, a competitive-programming-style code dataset used for evaluation. We fine-tuned it to build the demo for the event.
The majority of our work, however, revolved around scraping code from GitHub. Please try gpt-code-clippy-125M-1024-f
, which was trained with a causal language-modeling objective on the dataset we scraped from GitHub.
Thanks again for pointing this out. We'll update our README.md
to point readers to the appropriate models.
Feel free to reproduce the experiment here: https://colab.research.google.com/drive/1SEvl7xR48FdDdn75cbS9FiRF6Gd0QgXg?usp=sharing
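As a side note, the causal objective mentioned above is plain next-token prediction. A minimal sketch of how training pairs are formed (illustrative only, not the project's actual training code):

```python
# Minimal sketch of the causal language-modeling objective: each position in a
# token sequence is predicted from all the tokens before it.
# Illustrative only; this is NOT the project's training code.
def causal_lm_pairs(tokens):
    """Return (context, next_token) training pairs for one token sequence."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

pairs = causal_lm_pairs(["def", "add", "(", "a", ",", "b", ")", ":"])
for context, target in pairs:
    print(context, "->", target)
```

At training time the model is optimized to maximize the probability of each `next_token` given its `context`, which is why it can later complete code left-to-right from a prompt.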
Yes, you are right. I got flax-community/gpt-neo-1.3B-apps-all from the demo app; the model was commented out in the demo app's source code.
Is there a 1.3B version of gpt-code-clippy-125M-1024-f? I'd like to try that out too.
And what are the future plans for this project? Will it stay a research paper, or are you planning to publish a competitive product like GitHub Copilot?
Some thoughts: running the 1.3B version at usable speeds requires a good amount of processing power. Will it be viable to build a GitHub Copilot-like service on this dataset?
@bubundas17 another thing you could try, if you want to use the fine-tuned 1.3B model, is to modify your prompt to be more in line with its training data. For your example, you could use this helper function, which we use in our demo to format the code correctly:
def format_input(question, starter_code=""):
    answer_type = (
        "\nUse Call-Based format\n" if starter_code else "\nUse Standard Input format\n"
    )
    return f"\nQUESTION:\n{question}\n{starter_code}\n{answer_type}\nANSWER:\n"
where the question
parameter is your docstring and starter_code
is the start of your method definition.
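For example (the helper is reproduced so the snippet runs standalone; the question and starter code below are made up):

```python
# The demo's prompt-formatting helper, copied from above.
def format_input(question, starter_code=""):
    answer_type = (
        "\nUse Call-Based format\n" if starter_code else "\nUse Standard Input format\n"
    )
    return f"\nQUESTION:\n{question}\n{starter_code}\n{answer_type}\nANSWER:\n"

# With starter code present, the prompt asks for the Call-Based format;
# the question and signature here are invented for illustration.
prompt = format_input(
    "Given a list of integers, return their sum.",
    "def sum_list(nums):",
)
print(prompt)
```

The resulting string is what you feed to the model as the prompt; omitting `starter_code` switches the prompt to the Standard Input format instead.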
Yes, I was already using this function. I picked up the code from the web demo.
Ah okay, I think I now understand why it was generating such nonsense for you. The APPS model was trained purely on Python data, so feeding it your JavaScript code will cause it to behave strangely. In that case, definitely try our 125M model, or stick with EleutherAI's 1.3B for now until we get around to fine-tuning a model of that size on pure GitHub data. You can also try EleutherAI's GPT-J, which has 6B parameters: https://github.com/kingoflolz/mesh-transformer-jax#gpt-j-6b. It does an even better job at code generation.
Ahh 😅😅
Guys, another thing I am curious about: if you train the model purely on GitHub data, I imagine it won't have much understanding of the English language.
Then how will it understand context (i.e. the commented text above a function)?
I guess the answer lies in the nature of the modelling:
1) The base model is GPT-Neo, which has already seen a lot of natural-language text alongside some code (see the Pile dataset).
2) While the fine-tuning dataset is GitHub code, the model has also seen the comments alongside that code. That's why a clean prompt design that is coherent with the training data gives a better outcome.
3) Given our GitHub filtering criteria, the probability of encountering good comments is high. One such choice was to filter for repositories with many stars, the hypothesis being that popular repositories have high-quality comments.
I hope this answers your question. Thanks.
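To make point 2 concrete, a prompt that is coherent with this training data simply puts the English description in a comment directly above the code to be completed. The helper below is hypothetical, just to illustrate the shape of such a prompt:

```python
def make_comment_prompt(description, signature):
    """Hypothetical helper: format an English description as '#' comments
    placed directly above a function signature, mirroring how comments
    sit next to code in scraped GitHub data."""
    comment = "\n".join(f"# {line}" for line in description.splitlines())
    return f"{comment}\n{signature}\n"

prompt = make_comment_prompt(
    "Return the n-th Fibonacci number.\nUses iteration, not recursion.",
    "def fib(n):",
)
print(prompt)
```

A model fine-tuned on commented GitHub code has seen many such comment-then-code sequences, so a prompt shaped like this plays to what it learned.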
Closing this issue for now. If you'd like to discuss this more, feel free to reopen, but a better forum for in-depth discussion would be our Discord!
Hi, you guys are doing a great job with this.
I have tried your flax-community/gpt-neo-1.3B-apps-all model, and the generated code is kind of hit or miss.
This was generated using flax-community/gpt-neo-1.3B-apps-all
and this was generated using EleutherAI/gpt-neo-1.3B
As far as I know, EleutherAI/gpt-neo-1.3B was trained on more general text, not necessarily code.
So why is flax-community/gpt-neo-1.3B-apps-all performing much worse than EleutherAI/gpt-neo-1.3B?