Closed sryu1 closed 1 year ago
@sryu1 thanks for the suggestion.
I doubt this would be possible, as it would probably be extremely hard and quite different from Open-Assistant
Could you please elaborate on this?
I'm going to classify this as data. I think the task framework that's being developed will account for this. The way to solve this is to find some viable prompt + response datasets that cover coding-type interactions. The datasets channel in Discord is a good place to discuss this.
@fozziethebeat Agreed. If the model and the data (for that task) are large enough, scaling papers say that the model will learn to differentiate tasks well from the prompts alone. That being said, I have seen from experience that if the training framework is set up with special tokens for particular tasks (code gen here), the results are phenomenally better.
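To make the special-token idea concrete, here is a minimal sketch of how training examples could be tagged per task. The token strings (`<|code|>`, `<|chat|>`) and the formatting function are illustrative assumptions, not Open-Assistant's actual training format:

```python
# Hedged sketch: prefixing training examples with task-specific special
# tokens so the model learns to condition on the task. Token names and
# the example format are assumptions for illustration only.

TASK_TOKENS = {"code_generation": "<|code|>", "chat": "<|chat|>"}

def format_example(task: str, prompt: str, response: str) -> str:
    """Prefix the prompt with a special token identifying the task."""
    token = TASK_TOKENS[task]
    return f"{token} {prompt}\n{response}"

example = format_example(
    "code_generation",
    "Write a function that reverses a string.",
    "def reverse(s):\n    return s[::-1]",
)
print(example.startswith("<|code|>"))  # True
```

In practice these token strings would also be registered with the tokenizer as dedicated vocabulary entries so they are not split into subwords.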
P.S. The LaMDA paper supports this as well.
If you believe their website, then OpenAI's new davinci models are actually first trained on code, therefore the final model is quite proficient. I'd advocate that we also start from a code model, maybe something like CodeGen
https://github.com/fauxpilot/fauxpilot FauxPilot runs on CodeGen + NVIDIA Triton for fast inference (it uses FasterTransformer to convert CodeGen to TensorRT model format).
Here is a 10,000-foot view of how Copilot works, which likely applies similarly to ChatGPT on this topic:
https://thakkarparth007.github.io/copilot-explorer/posts/copilot-internals.html
Also, it seems that ChatGPT generates multiple code candidates (around 10), checks via JIT compilation whether the code is likely to be runnable, and elects a candidate.
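The candidate-selection idea above can be sketched with Python's built-in `compile()` as a cheap "does it even parse" filter. The candidate list is made up for illustration; a real system would sample candidates from the model:

```python
# Hedged sketch: filtering generated code candidates by whether they at
# least compile, as speculated above. Candidates here are hard-coded
# stand-ins for model samples.

def compiles(source: str) -> bool:
    """Return True if the source parses/compiles as Python."""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

candidates = [
    "def add(a, b): return a + b",   # valid
    "def add(a, b) return a + b",    # missing colon
    "print('hello'",                 # unbalanced paren
]

runnable = [c for c in candidates if compiles(c)]
print(len(runnable))  # 1
```

Compiling only catches syntax errors, of course; ranking by actual execution or test results would be a stronger signal.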
Hey all, excited to start contributing to this project. +1 to starting with code models. It seems like code models would enable better understanding of long-range logical dependencies. I haven't seen any concrete literature on this, it's just a conjecture, but it does make sense.
@dhruv2601 instead of a special tag, we could use an answer prefix that primes the output to be code: "Here is some code to do XYZ", where XYZ was the request. Literally providing prefix text such as "Here is some code to do" or variations like that. It would be interesting to see if this helps performance in a multi-task fashion at all, especially since we are using much smaller models than InstructGPT.
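A minimal sketch of the answer-prefix idea: instead of a special token, prime the model's continuation with natural-language text so the output is likely to be code. The prefix templates below are assumptions, not a tested format:

```python
# Hedged sketch: priming generation with an answer prefix so the model's
# continuation is biased toward code. Templates are illustrative only.

PREFIXES = [
    "Here is some code to {task}:",
    "The following code will {task}:",
]

def primed_prompt(request: str, template: str = PREFIXES[0]) -> str:
    """Append an answer prefix; the model then continues after the colon."""
    task = request.rstrip(".").lower()
    return f"{request}\n\n{template.format(task=task)}"

print(primed_prompt("Sort a list of numbers"))
```

Whether this beats special tokens is an empirical question; the prefix approach needs no tokenizer changes, which makes it cheap to A/B test.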
I think training on code should be included in this project as a task. The model behind ChatGPT is also based on code-davinci-002, which is trained on code generation as well as text. There is some evidence showing that reasoning and chain-of-thought abilities may result from training on code.
I suspect that an absolutely amazing dataset could be created by building an open-source Copilot alternative, similar to www.codeium.com.
The dataset doesn't need to be people's code, instead it just needs to be whether or not the user accepted or rejected the completion that was provided to them.
Not sure how best to do this in a privacy-preserving way. But I suspect this is such a ginormous gold mine for an underlying training dataset that any ChatGPT clone not using a method like this will struggle to match other versions in their capacity to seamlessly talk to the APIs of the world, and in other tasks that depend on the AI being able to write accurate software.
If anything... In my mind... This is a dataset creation pipeline that almost needs to be created first...
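To make the accept/reject idea concrete, here is a sketch of what one record in such a dataset might look like: it stores the context, the suggested completion, and the acceptance signal, not the user's own code. The field names are illustrative assumptions:

```python
# Hedged sketch: a record in the acceptance-signal dataset suggested
# above. Field names and structure are assumptions for illustration.

from dataclasses import dataclass, asdict

@dataclass
class CompletionFeedback:
    context: str      # code/prompt preceding the suggestion
    completion: str   # what the assistant suggested
    accepted: bool    # did the user keep the suggestion?

record = CompletionFeedback(
    context="def fib(n):",
    completion="\n    return n if n < 2 else fib(n - 1) + fib(n - 2)",
    accepted=True,
)
print(asdict(record)["accepted"])  # True
```

Even this minimal schema still contains the user's surrounding code in `context`, so the privacy question raised above applies to the context field too, not just the completion.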
Today I thought of a cognitive architecture with a code generator model, where the generated code is then tested to see if it runs (code from the current Open Assistant mostly doesn't), plus code review models that e.g. check whether Python code follows PEP 8 and PEP 20.
This can be used for 'RL from AI Feedback' (RLAIF) to improve the model.
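The generate/test/review loop above could be collapsed into a scalar reward for RLAIF. The checks below are deliberately crude stand-ins: `compile()` for "does it run at all" and a line-length check for a tiny slice of PEP 8; a real pipeline would execute tests in a sandbox and run a full linter:

```python
# Hedged sketch: a toy reward function for RLAIF on code, as described
# above. compile() approximates "runs"; the 79-char limit approximates
# one PEP 8 rule. Both are simplifications for illustration.

def code_reward(source: str, max_line_len: int = 79) -> float:
    """Score a code sample: 0 if it doesn't compile, else 0.5 plus a
    0.5 bonus if every line fits the PEP 8 length limit."""
    try:
        compile(source, "<sample>", "exec")
    except SyntaxError:
        return 0.0
    reward = 0.5
    if all(len(line) <= max_line_len for line in source.splitlines()):
        reward += 0.5
    return reward

print(code_reward("def f(x):\n    return x + 1"))  # 1.0
print(code_reward("def f(x) return x"))            # 0.0
```

The interesting design question is weighting: "runs" is probably worth far more than style conformance, and actually passing unit tests more still.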
I wonder, if it was trained on all of the code on GitHub, whether it would produce good code or garbage... Maybe a subset of well-known repositories would be better.
Just a suggestion that I thought of yesterday: maybe we could have a sister project to this one that is mostly trained on code, so it can respond with good code. (A bit like asking ChatGPT to code for us, but better hahaha.) Just a suggestion, I doubt this would be possible as it would probably be extremely hard and quite different from Open-Assistant.