Feature: `--plan` flag

swyxio commented 1 year ago

this is an easy one

https://news.ycombinator.com/item?id=35970653
Idea: Self-planning Code Generation with Large Language Model https://arxiv.org/pdf/2303.06689.pdf
https://news.ycombinator.com/item?id=35735375
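
A rough sketch of how a plan flag could work, assuming the two-phase flow from the self-planning paper: ask the model for a short numbered plan first, then ask it to implement that plan. Everything here is hypothetical (`build_plan_prompt`, `build_code_prompt`, `generate`, `parse_args`, and the prompt wording are made up for illustration and are not part of smol developer). The model call is passed in as a plain function so it can be swapped or stubbed.

```python
import argparse
from typing import Callable, List, Optional


def build_plan_prompt(task: str) -> str:
    # Phase 1 prompt (hypothetical wording): ask for steps only, no code.
    return f"Write a short numbered plan (no code) for this task:\n{task}"


def build_code_prompt(task: str, plan: str) -> str:
    # Phase 2 prompt (hypothetical wording): implement the plan from phase 1.
    return (
        f"Task:\n{task}\n\n"
        f"Follow these steps in order:\n{plan}\n\n"
        "Now write the code."
    )


def generate(task: str, call_llm: Callable[[str], str], plan: bool = False) -> str:
    """Return generated code; with plan=True, make two model calls instead of one."""
    if not plan:
        return call_llm(f"Task:\n{task}\n\nWrite the code.")
    steps = call_llm(build_plan_prompt(task))
    return call_llm(build_code_prompt(task, steps))


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # Assumed CLI shape: one positional task plus a boolean --plan switch.
    parser = argparse.ArgumentParser()
    parser.add_argument("task")
    parser.add_argument("--plan", action="store_true",
                        help="ask the model for a plan before generating code")
    return parser.parse_args(argv)
```

With the flag off this is a single call, so the default behavior would not change; the plan step only adds one extra round trip.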

swyxio commented 1 year ago

The discovery that only GPT-4 can self-improve, while weaker models cannot, is very intriguing, indicating a new type of emergent ability (i.e. to improve upon natural language feedback) may only exist when the model is "mature" (large and well-aligned) enough https://twitter.com/Francis_YAO_/status/1670618013089820674

Large Language Models (LLMs) have shown remarkable aptitude in code generation but still struggle on challenging programming tasks. Self-repair -- in which the model debugs and fixes mistakes in its own code -- has recently become a popular way to boost performance in these settings. However, only very limited studies on how and when self-repair works effectively exist in the literature, and one might wonder to what extent a model is really capable of providing accurate feedback on why the code is wrong when that code was generated by the same model. In this paper, we analyze GPT-3.5 and GPT-4's ability to perform self-repair on APPS, a challenging dataset consisting of diverse coding challenges.
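
For reference, the self-repair loop the abstract describes can be sketched roughly as: generate code, run it against a check, and on failure ask the model for feedback and then for a fixed version. This is only an illustration under assumptions, not the paper's exact protocol: `run_check`, `self_repair`, the prompt wording, and `max_rounds` are all hypothetical, and `exec` is used purely to keep the example short (it is not safe for untrusted model output).

```python
from typing import Callable, Optional


def run_check(code: str, check: str) -> Optional[str]:
    """Run candidate code followed by a check snippet.

    Returns None on success, or a short error string on failure.
    NOTE: exec() on model output is unsafe; a real tool would sandbox this.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
        exec(check, namespace)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def self_repair(task: str, call_llm: Callable[[str], str], check: str,
                max_rounds: int = 3) -> Optional[str]:
    """Return code that passes the check, or None if all rounds fail."""
    code = call_llm(f"Task:\n{task}\n\nWrite the code.")
    for _ in range(max_rounds):
        error = run_check(code, check)
        if error is None:
            return code
        # Feedback stage: the same model explains why its own code failed.
        feedback = call_llm(
            f"This code failed with:\n{error}\n\nCode:\n{code}\n\n"
            "Explain briefly what is wrong."
        )
        # Repair stage: regenerate using that feedback.
        code = call_llm(
            f"Task:\n{task}\n\nPrevious attempt:\n{code}\n\n"
            f"Feedback:\n{feedback}\n\nWrite a fixed version."
        )
    return code if run_check(code, check) is None else None
```

Keeping feedback and repair as separate calls makes it possible to swap in a stronger model (or a human) for the feedback step only.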

smol-ai / developer

Feature: `--plan` flag #12