gpt-engineer-org / gpt-engineer

Specify what you want it to build, the AI asks for clarification, and then builds it.
MIT License
51.34k stars 6.68k forks source link

Constantly Generates Incomplete Vulnerable Code #950

Closed hackdefendr closed 6 months ago

hackdefendr commented 6 months ago

Expected Behavior

Based on the description of GPT Engineer I expected complete code that actually runs. Additionally, I would expect GPT Engineer to not use old vulnerable code libraries when developing the app. This part is actually described in my prompt.

Current Behavior

GPT Engineer has not in 10 tests written a single piece of working code for me yet. Sure the backend web server starts, but it is running code versions that have Critical CVE's associated.

Failure Information

The crash only happens at the end of an IMPROVE execution using the -i switch. Basically saying it can't parse the files that it created.

│   145 │                                                                                          │
│   146 │   for line in chat.split("\n"):                                                          │
│   147 │   │   if line.startswith("```") and in_fence:                                            │
│ ❱ 148 │   │   │   edits.append(parse_one_edit(current_edit))                                     │
│   149 │   │   │   current_edit = []                                                              │
│   150 │   │   │   in_fence = False                                                               │
│   151 │   │   │   continue                                                                       │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │           chat = 'I apologize for the confusion. Since the request is to develop an entire   │ │
│ │                  applica'+1758                                                               │ │
│ │   current_edit = []                                                                          │ │
│ │          edits = []                                                                          │ │
│ │       in_fence = True                                                                        │ │
│ │           line = '```'                                                                       │ │
│ │ parse_one_edit = <function parse_edits.<locals>.parse_one_edit at 0x103ef2980>               │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ /opt/homebrew/lib/python3.11/site-packages/gpt_engineer/core/chat_to_files.py:134 in             │
│ parse_one_edit                                                                                   │
│                                                                                                  │
│   131 │   │   text = "\n".join(lines)                                                            │
│   132 │   │   splits = text.split(DIVIDER)                                                       │
│   133 │   │   if len(splits) != 2:                                                               │
│ ❱ 134 │   │   │   raise ValueError(f"Could not parse following text as code edit: \n{text}")     │
│   135 │   │   before, after = splits                                                             │
│   136 │   │                                                                                      │
│   137 │   │   before = before.replace(HEAD, "").strip()                                          │
│                                                                                                  │
│ ╭─────────────────────── locals ───────────────────────╮                                         │
│ │  DIVIDER = '\n=======\n'                             │                                         │
│ │ filename = 'new file: project/html/card-editor.html' │                                         │
│ │     HEAD = '<<<<<<< HEAD'                            │                                         │
│ │    lines = []                                        │                                         │
│ │   splits = ['']                                      │                                         │
│ │     text = ''                                        │                                         │
│ │   UPDATE = '>>>>>>> updated'                         │                                         │
│ ╰──────────────────────────────────────────────────────╯                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Could not parse following text as code edit:

Steps to Reproduce

  1. Clone repository
  2. Install with Poetry
  3. Create project file and detailed prompt.
  4. Run: gpte projects/

Additional Notes

I'm trying to build a functional web app and all I get is incomplete files with a lot of comments for code placeholders. Additionally, GPT Engineer seems to like old vulnerable code when developing. If I ever published anything built so far with GPT Engineer I'd be hacked in seconds. I'm not bashing GPTE, I really think there is potential here, but as a security expert I cannot promote this yet.

ATheorell commented 6 months ago

Partly, read my answer to #946 Secondly, for web-stuff, you may be luckier with https://gptengineer.app/, though I know the waiting list is long atm.

hackdefendr commented 6 months ago

What about the constant usage of old vulnerable code libraries?

That isn't ChatGPT, because I tested it side by side with the same prompt, same model, and got different results with GPT Engineer always defaulting to using old and very dangerous versions of code even when told not to. When I specified to use the newest code versions with no publish vulnerabilities to ChatGPT I got the latest versions of code and links to where it verified vulnerabilities.

Something isn't working in GPT Engineer.

ATheorell commented 6 months ago

gpt-engineer gets all written code from chat-gpt. Before asking chat-gpt for the code it adds some pre-prompts, asking chat-gpt to make the implementation as complete and functional as possible. It is possible that, when doing so, chat-gpt down prioritizes using the newest version, but this is just speculation on my part. How the prompt to chat-gpt affects safety awareness is for sure incredibly complex. You can try adding something like "make sure to use the newest library versions to avoid safety vulnerabilities" to the prompt.

hackdefendr commented 6 months ago

I guess you missed the part where I said I already added commands in my prompt to use new software with no vulnerabilities, but it doesn't work in GPT Engineer and does work on ChatGPT. I understand the desire to defend your code, but, seriously if GPT Engineer is choosing to ignore my prompt and then pulls down old vulnerable code despite what its told, then this isn't a ChatGPT issue.

ATheorell commented 6 months ago

Sorry for overlooking that you already tested my suggestion.

I'm not primarily defending gpt-engineer, but giving you context on what gpt-engineer is doing, so that you can judge its behavior better. I repeat: the only thing gpt-engineer does different from using chatgpt directly, is adding additional instructions to the chatgpt user prompt. These instructions are probably the reason why your reported problem occurs, which I was open about in my answer. How to change these instructions to be more security aware, without harming perfomance in less security critical applications is an open problem. If you have a suggestion on how to do this, your help is more than welcome.