Open lpietrobon opened 1 year ago
I would add to that and propose creating something like a .filename_extension file for large files, containing a short description of the file or what it does, which would take precedence over reading the actual file. We could also create a .ignore file listing things that are not worth reading by the LLM.
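To make the proposal concrete, here is a minimal sketch of how such a lookup could work. Nothing here is mentat's actual behavior: the `.summary` sidecar suffix, the ignore-file format (one glob pattern per line, `#` for comments), and both function names are assumptions made up for illustration.

```python
import fnmatch
from pathlib import Path
from typing import List, Optional


def read_ignore_patterns(ignore_file: Path) -> List[str]:
    """Return the non-empty, non-comment lines of a hypothetical ignore file."""
    if not ignore_file.exists():
        return []
    stripped = (line.strip() for line in ignore_file.read_text().splitlines())
    return [line for line in stripped if line and not line.startswith("#")]


def text_for_llm(path: Path, root: Path, patterns: List[str]) -> Optional[str]:
    """Return what would be sent to the LLM for `path`, or None if it is ignored.

    A sidecar file named `<filename>.summary` (hypothetical convention) takes
    precedence over the real file contents.
    """
    rel = path.relative_to(root).as_posix()
    if any(fnmatch.fnmatch(rel, pattern) for pattern in patterns):
        return None  # listed in the ignore file: not worth reading
    sidecar = path.with_name(path.name + ".summary")
    if sidecar.exists():
        return sidecar.read_text()  # short description wins over full contents
    return path.read_text()
```

With this shape, a large generated file could ship with a two-line summary next to it, and build output could be excluded entirely with a pattern such as `build/*`.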
I'm not sure whether to include this here or split it out into another issue... large numbers of files in the tree also cause problems separate from whether the LLM can hold their contents:
tl;dr:
OSError: [Errno 7] Argument list too long: 'git'
Full call stack:
Traceback (most recent call last):
  File ".../envs/Mentat/bin/mentat", line 8, in <module>
    sys.exit(run_cli())
             ^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/app.py", line 24, in run_cli
    run(paths)
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/app.py", line 35, in run
    loop(paths, cost_tracker)
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/app.py", line 46, in loop
    code_file_manager = CodeFileManager(paths, user_input_manager, config)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/code_file_manager.py", line 181, in __init__
    self._set_file_paths(paths)
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/code_file_manager.py", line 221, in _set_file_paths
    repo.ignored(*all_files)
  File ".../envs/Mentat/lib/python3.11/site-packages/git/repo/base.py", line 877, in ignored
    proc: str = self.git.check_ignore(*paths)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/git/cmd.py", line 741, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/git/cmd.py", line 1315, in _call_process
    return self.execute(call, **exec_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/git/cmd.py", line 985, in execute
    proc = Popen(
           ^^^^^^
  File ".../envs/Don-Mentat/lib/python3.11/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File ".../envs/Don-Mentat/lib/python3.11/subprocess.py", line 1950, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 7] Argument list too long: 'git'
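The traceback shows every path being passed to `repo.ignored(*all_files)` in a single call, which becomes one very long `git check-ignore` command line. One possible way around the limit, sketched below, is to pass the paths in batches. This is only an illustration of the idea, not necessarily how mentat fixed it; `ignored_in_batches` and the batch size of 500 are assumptions, and a safe batch size really depends on the operating system's argument-length limit and on how long the paths are.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def ignored_in_batches(
    check_ignored: Callable[..., List[str]],
    paths: Sequence[str],
    batch_size: int = 500,  # arbitrary; tune to the platform's limit
) -> List[str]:
    """Call `check_ignored(*batch)` once per batch and merge the results.

    `check_ignored` is expected to behave like GitPython's `Repo.ignored`:
    it takes paths as positional arguments and returns the ignored ones.
    """
    ignored: List[str] = []
    for batch in chunked(paths, batch_size):
        ignored.extend(check_ignored(*batch))
    return ignored
```

With GitPython this would be called as `ignored_in_batches(repo.ignored, all_files)`, so that no single `git` invocation receives the whole file list.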
Good news: If you point mentat on launch at a folder within a larger project as a way of reducing the token and file count, it's able to understand the set of files in the folder and propose diffs to implement your requests.
Bad news: After proposing the changes, mentat is not able to apply the diffs, possibly because mentat thinks the root of the git project is the folder it was launched in but git treats the ancestor containing the .git folder as the project root?
% mentat path/to/folder/deep/inside/repo
>>> make some changes
Apply these changes? 'Y/n/i' or provide feedback.
Y
Traceback (most recent call last):
  File ".../envs/Mentat/bin/mentat", line 8, in <module>
    sys.exit(run_cli())
             ^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/app.py", line 24, in run_cli
    run(paths)
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/app.py", line 35, in run
    loop(paths, cost_tracker)
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/app.py", line 61, in loop
    need_user_request = get_user_feedback_on_changes(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/app.py", line 144, in get_user_feedback_on_changes
    code_file_manager.write_changes_to_files(code_changes_to_apply)
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/code_file_manager.py", line 353, in write_changes_to_files
    new_code_lines = self._get_new_code_lines(changes)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/code_file_manager.py", line 312, in _get_new_code_lines
    if new_code_lines != self._read_file(rel_path):
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../envs/Mentat/lib/python3.11/site-packages/mentat/code_file_manager.py", line 249, in _read_file
    with open(abs_path, "r") as f:
         ^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'local/path/from/root/of/repo/to/subdir/modified-file.foo'
Workaround: Manually applying the diffs in a text editor seems to work for now. Not ideal, but I'm still super excited to be using mentat.
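If the guess above is right (paths are relative to the git root but get opened relative to the launch folder), one way to make the two agree is to locate the repository root by walking up from the launch folder and to resolve every repo-relative path against it. The sketch below only illustrates that idea under that assumption; `find_git_root` and `resolve_repo_path` are hypothetical helpers, not mentat functions.

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of `start` (or `start` itself) containing `.git`."""
    current = start.resolve()
    for candidate in [current, *current.parents]:
        if (candidate / ".git").exists():
            return candidate
    return None


def resolve_repo_path(rel_path: str, launch_dir: Path) -> Path:
    """Turn a path relative to the git root into an absolute path.

    Falls back to the launch directory when no `.git` is found above it.
    """
    root = find_git_root(launch_dir) or launch_dir.resolve()
    return root / rel_path
```

Launched from `path/to/folder/deep/inside/repo`, a path such as `local/path/from/root/of/repo/to/subdir/modified-file.foo` would then be opened under the ancestor that holds `.git` rather than under the launch folder.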
@Luke-in-the-sky, @alexut those are both great ideas. We are thinking about the best ways to work with bigger codebases. Make sure you join the Discord, where most of that discussion will take place, as this is a big feature.
> OSError: [Errno 7] Argument list too long: 'git'
@yoDon ah yes, we'll fix this bug asap
> After proposing the changes, mentat is not able to apply the diffs, possibly because mentat thinks the root of the git project is the folder it was launched in but git treats the ancestor containing the .git folder as the project root?
@yoDon ah, this is probably another bug. This is what your PR addresses, I guess?
Hi @biobootloader, https://github.com/biobootloader/mentat/pull/8 should help with the large project issues @Luke-in-the-sky and others were having, and https://github.com/biobootloader/mentat/pull/7 fixed the pathing issue I was seeing.
If we include the entire codebase in the prompt sent to the LLM, we might exceed the LLM's maximum input length. So if the codebase is large, maybe we want to add a retrieval step that identifies the pieces of code most relevant to the user's instructions and adds only those to the LLM input.
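A minimal sketch of what such a retrieval step could look like: score each file by how many words it shares with the user's request, then add files in score order until a size budget is used up. Everything here is an assumption for illustration (`select_relevant`, the keyword-overlap score, and the character budget standing in for a token limit); a real implementation would more likely use embeddings and an actual token count.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of word-like tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def select_relevant(files: Dict[str, str], request: str, max_chars: int) -> List[str]:
    """Pick file names to include in the prompt, most relevant first.

    Relevance is the number of distinct request words that also appear in the
    file name or contents. Files with no overlap are never included, and a
    file is skipped if adding it would exceed `max_chars`.
    """
    query = tokenize(request)
    scored = []
    for name, content in files.items():
        score = len(query & tokenize(name + " " + content))
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken by name for a deterministic order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))

    chosen: List[str] = []
    used = 0
    for _, name in scored:
        size = len(files[name])
        if used + size <= max_chars:
            chosen.append(name)
            used += size
    return chosen
```

For example, calling `select_relevant(files, "fix the login password check", max_chars=8000)` would return only the files that mention some of those words and that fit within the budget.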