Aider-AI / aider

aider is AI pair programming in your terminal
https://aider.chat/
Apache License 2.0

Sonnet 3.5 is using a lot of output tokens, hitting 4k output token limit #705

Closed: vessenes closed this issue 4 months ago

vessenes commented 4 months ago

Issue

Asking for a large-scale change with sonnet, I see this output:

Model claude-3-5-sonnet-20240620 has hit a token limit!

Input tokens: 4902 of 200000
Output tokens: 3220 of 4096
Total tokens: 8122 of 200000

For more info: https://aider.chat/docs/troubleshooting/token-limits.html

None of these numbers is over the stated limit. However, here is what my anthropic API console returns:

Jun 21, 2024 at 7:56 PM claude-3-5-sonnet-20240620 39.10 5068 4096 sse None

The "None" is in the "Error" column. The 4096 is tokens generated. So, it looks like we actually got 4096 tokens out of Anthropic, but either we're using the wrong tokenizer for aider, or ... ?

Version and model info

Aider v0.39.0, called with aider --sonnet

vessenes commented 4 months ago

p.s. I note that aider's input token count is wrong as well.

paul-gauthier commented 4 months ago

Thanks for trying aider and filing this issue. Others have reported similar issues recently. Your confirmation that sonnet did indeed return 4k tokens is very helpful info.

NeuralAIM commented 4 months ago

Same with DeepSeek Coder

Emasoft commented 4 months ago

I can confirm this. I can't work with Sonnet 3.5 because it stops after a while and prints a token limit error, even though the limit was not reached.

paul-gauthier commented 4 months ago

It is reaching the output limit. The token counts being shown aren't accurate because Anthropic hasn't released a tokenizer for these models. But Anthropic itself is returning the token limit error.

paul-gauthier commented 4 months ago

4k tokens is like ~hundreds of lines of code. Are you guys routinely asking for a single change that involves hundreds of lines of code?

I understand that a refactor might tend in this direction. I can certainly concoct a situation to force this to happen.

But I'd really love if folks could share some example output from .aider.chat.history.md that shows this problem happening naturally "in the wild".

Yona544 commented 4 months ago

 =======
 procedure TfrmMasterDetail.SaveSQL1Click(Sender: TObject);
 var
   DBGrid: TDBGrid;
   MenuItem: TMenuItem;
   Query: TFDQuery;
   JSONFile: string;
   JSONArray,

Model claude-3-5-sonnet-20240620 has hit a token limit!

Input tokens: 9790 of 200000
Output tokens: 3037 of 4096
Total tokens: 12827 of 200000

For more info: https://aider.chat/docs/troubleshooting/token-limits.html

Did not ask for hundreds of lines of code.

see file .aider.chat.history.md

paul-gauthier commented 4 months ago

Thanks @Yona544, I really appreciate that. Unfortunately it appears that the model output doesn't get saved to .aider.chat.history.md when this error occurs! I have updated the main branch to improve logging for this.

Emasoft commented 4 months ago

@paul-gauthier

4k tokens is like ~hundreds of lines of code. Are you guys routinely asking for a single change that involves hundreds of lines of code?

I don't understand your surprise at hearing that we reach the 4k limit so often. If a project is not just a small script, I usually reach that limit every 3-4 exchanges with the AI. And yes, a single change usually needs to modify hundreds of lines of code.

For example, yesterday I found a bug caused by a function that needed 4 parameters to work correctly but was only accepting 2 arguments (the mistake was assuming those 2 additional parameters were constants, when they were not). So aider changed the function definition from 2 to 4 arguments. But (and this is the main point) it also had to change all the places in the code that called that function, ensuring that each time 4 variables were correctly instantiated and passed instead of only 2. Since that function was used very often in the program, the code changes were easily above 4k tokens, and probably much more than that. That scenario is a common occurrence when fixing bugs.

Anyway, I found a TEMPORARY WORKAROUND until this is fixed.

I created a new custom configuration file called sonnet_cfg.json with a reduced token limit (3072 instead of 4096). After many tests, I found that this is the maximum value that compensates for aider's estimation error (a buffer of 1024 tokens). This is the full content of the file:

 {
   "claude-3-5-sonnet-20240620": {
        "max_tokens": 3072,
        "max_input_tokens": 200000,
        "max_output_tokens": 3072,
        "input_cost_per_token": 0.000003,
        "output_cost_per_token": 0.000015,
        "litellm_provider": "anthropic",
        "mode": "chat",
        "supports_function_calling": true,
        "supports_vision": true,
        "tool_use_system_prompt_tokens": 159
    },
   "claude-3-haiku-20240307": {
        "max_tokens": 4096,
        "max_input_tokens": 200000,
        "max_output_tokens": 4096,
        "input_cost_per_token": 0.00000025,
        "output_cost_per_token": 0.00000125,
        "litellm_provider": "anthropic",
        "mode": "chat",
        "supports_function_calling": true,
        "supports_vision": true,
        "tool_use_system_prompt_tokens": 264
    }
}

All you need to do is to create the above file, and then launch Aider with this command:

aider --model-metadata-file sonnet_cfg.json --model claude-3-5-sonnet-20240620 --weak-model claude-3-haiku-20240307 --edit-format diff --map-tokens 1024 --restore-chat-history

Note that you MUST use the 'diff' edit format for this trick to be effective. It is a temporary solution, but it works for me. I never got the token limit error again.

PS: You can remove "--restore-chat-history" from the command if you want to save tokens (it doesn't work anyway...)

paul-gauthier commented 4 months ago

Modifying the model metadata doesn't affect whether a token limit is hit. Aider doesn't use the max_output_tokens value for anything (except printing it). Aider doesn't enforce the token limit. Anthropic's API server returns the token limit error because the underlying LLM has errored out.

All the OpenAI models and Opus have the same 4k output token limit, yet no one reported this output token limit issue until Sonnet launched.

I have confirmed that Sonnet is really "chatty" with its SEARCH/REPLACE blocks. It often includes a ton of unchanging code in both the search and replace sections.

For example, I made a code request to change 2 lines spread apart in a large source file. Sonnet made a SEARCH/REPLACE block that included all the intervening lines!

I've updated the prompts to dissuade Sonnet from this behavior. The change is available in the main branch.

python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git

I would really appreciate feedback from anyone who is able to try this improved version.

Emasoft commented 4 months ago

Modifying the model metadata doesn't affect whether a token limit is hit. Aider doesn't use the max_output_tokens value for anything (except printing it). Aider doesn't enforce the token limit. Anthropic's API server returns the token limit error because the underlying LLM has errored out.

Ok, but no matter what you say, my trick works. If I remove it and revert to the 4096 value, I get the token limit error again. Why?

IMO all aider needs to do for my trick to work is to pass the custom model configuration to LiteLLM, and it does. That is all we need to avoid the limit error, because LiteLLM will then use the max_tokens value when calling Anthropic's completion API, which results in fewer tokens coming back in the response. Maybe Sonnet 'regulates' itself by answering with 3072 tokens instead of 4096, I don't know. That is enough to prevent LiteLLM or aider from (over)estimating (due to the wrong tokenizer, or whatever the issue behind this bug is) a response length greater than 4096.

Here is Aider passing the custom model to Litellm with register_model():

https://github.com/paul-gauthier/aider/blob/7be08c782c91b5ba5491f40eafde131689edf9a7/aider/models.py#L508

...and here is LiteLLM register_model() overriding the max_tokens original parameter:

https://github.com/BerriAI/litellm/blob/769c60e4f6e6e98e91e0619b5c20906d1c43cb20/litellm/utils.py#L1927
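The override being described can be illustrated without LiteLLM itself. The sketch below is illustrative only: the registry dict and merge function are stand-ins for LiteLLM's internals, not its actual code. The point is simply that registering user metadata keyed by model name replaces the shipped defaults, so anything downstream that reads max_tokens sees the reduced value.

```python
import copy
import json

# Default metadata shipped for the model (values taken from this thread).
DEFAULT_MODEL_COST = {
    "claude-3-5-sonnet-20240620": {"max_tokens": 4096, "max_input_tokens": 200000},
}

def register_model(registry, overrides):
    """Merge user-supplied metadata over the defaults, mimicking the
    override behavior of litellm.register_model (illustrative only)."""
    for model, meta in overrides.items():
        registry.setdefault(model, {}).update(meta)
    return registry

# The sonnet_cfg.json workaround from this thread, reduced to the key field.
sonnet_cfg = json.loads('{"claude-3-5-sonnet-20240620": {"max_tokens": 3072}}')

registry = register_model(copy.deepcopy(DEFAULT_MODEL_COST), sonnet_cfg)

# A completion call that reads this registry would now cap output at 3072,
# leaving ~1024 tokens of headroom below the hard 4096 limit.
print(registry["claude-3-5-sonnet-20240620"]["max_tokens"])  # 3072
```

Whether this actually prevents the error depends on the caller passing that max_tokens value through to the API request, which is the point of disagreement in this thread.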

I would really appreciate feedback from anyone who is able to try this improved version.

I will test it as soon as possible.

Yona544 commented 4 months ago

I'm using the new version and got

Model claude-3-5-sonnet-20240620 has hit a token limit!
Token counts below are approximate.

Input tokens: ~15,227 of 200,000
Output tokens: ~2,991 of 4,096 -- possibly exceeded output limit!
Total tokens: ~18,218 of 200,000

For more info: https://aider.chat/docs/troubleshooting/token-limits.html

.aider.chat.history.md

Hope this is helpful in resolving issue

In a case like this, I go back and ask aider to break down the task and do one part at a time. Maybe if aider can detect this token limit hit, it could automatically re-prompt to generate a list of actions and feed one action at a time until done.

Emasoft commented 4 months ago

I also tested the new version (Aider v0.39.1-dev), and the problem persists:

Model claude-3-5-sonnet-20240620 has hit a token limit!  
Token counts below are approximate.  

Input tokens: ~24,889 of 200,000  
Output tokens: ~3,218 of 4,096 -- possibly exceeded output limit!  
Total tokens: ~28,107 of 200,000  

For more info: https://aider.chat/docs/troubleshooting/token-limits.html  

This is the chat transcript:

Aider.chat.history.zip

Hope this helps.

Yona544 commented 4 months ago

@Emasoft, when using the sonnet_cfg.json file as you described, I'm getting:

(aider) Y:\Projects\posexport>aider --model-metadata-file sonnet_cfg.json --model claude-3-5-sonnet-20240620 --weak-model claude-3-haiku-20240307 --edit-format diff --map-tokens 1024 --restore-chat-history

Loaded 1 litellm model file(s)

summarizer unexpectedly failed for claude-3-haiku-20240307

Emasoft commented 4 months ago

@Yona544 No idea. Works perfectly fine for me. Have you tried omitting --restore-chat-history? If your previous model configuration was much different, it may not be restored correctly.

paul-gauthier commented 4 months ago

I just released v0.40.0 which has even stronger prompting to stop Sonnet from wasting so many tokens on unchanging chunks of code. If you have a chance to upgrade and try it, please let me know how it works for you.

thiswillbeyourgithub commented 4 months ago

I don't know if this is helpful but : although anthropic has not released a tokenizer, they still gave access to a token counting function.

As per their chatbot:

import anthropic

client = anthropic.Anthropic()
# count_tokens() takes a single string (not a list) and returns an int
total_tokens = client.count_tokens("Sample text")

(I'm thinking this can be used to reverse engineer the tokenizer so maybe look for that on github)

Emasoft commented 4 months ago

@paul-gauthier I tested the 0.40.0 and this is the result:

Model claude-3-5-sonnet-20240620 has hit a token limit!  
Token counts below are approximate.  

Input tokens: ~28,911 of 200,000  
Output tokens: ~3,277 of 4,096 -- possibly exceeded output limit!  
Total tokens: ~32,188 of 200,000  

For more info: https://aider.chat/docs/troubleshooting/token-limits.html  

aider.chat.history40.md.zip

It seems that the issue was not resolved. Hope this helps.

cngarrison commented 4 months ago

I'm running 0.40.0 and got a token limit error today.

I'd like to propose a different solution, which has worked for me in other cases (automated code) where I hit the output token limit. Get Claude to "keep speaking". Send another request, with the last message an "assistant" role with contents of previous response. Claude will pick up where he left off from before. I use this technique to get complete XML blocks that are well over 4k tokens. Claude handles it very well.

Here is a simple example: [screenshot omitted]

The docs discuss prefilling the assistant response as a supported technique, but for different use-cases: [link omitted]

So, getting Claude to "keep speaking" would allow for code search/replace blocks well in excess of the 4k token limit.
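Prefilling works because the Messages API accepts a conversation whose last turn is an assistant message; the model then continues that text rather than starting over. A minimal sketch of the request shape (pure data, no API call; the field names follow Anthropic's Messages API, the model name and 4096 cap are the ones from this thread, and the helper function is hypothetical):

```python
def build_continuation_request(user_prompt, partial_reply):
    """Build the follow-up request that asks the model to keep speaking.
    The last message is an *assistant* turn holding the truncated reply,
    so the model resumes mid-stream instead of restarting."""
    return {
        "model": "claude-3-5-sonnet-20240620",
        "max_tokens": 4096,
        "messages": [
            {"role": "user", "content": user_prompt},
            # Prefill: the partial reply becomes the assistant's last turn.
            {"role": "assistant", "content": partial_reply},
        ],
    }

req = build_continuation_request(
    "Refactor src/config.rs as discussed.",
    "<<<<<<< SEARCH\npub struct Config {",  # reply cut off at the 4k cap
)
print(req["messages"][-1]["role"])  # assistant
```

The caller then concatenates the new completion onto the partial reply and repeats until the response stops for a reason other than the output limit.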

ackoi commented 4 months ago

I'm running 0.40.0 and got a token limit error today.

I'd like to propose a different solution, which has worked for me in other cases (automated code) where I hit the output token limit. Get Claude to "keep speaking". Send another request, with the last message an "assistant" role with contents of previous response. Claude will pick up where he left off from before. I use this technique to get complete XML blocks that are well over 4k tokens. Claude handles it very well.

Here is a simple example: [screenshot omitted]

The docs discuss prefilling the assistant response as a supported technique, but for different use-cases: [link omitted]

So, getting Claude to "keep speaking" would allow for code search/replace blocks well in excess of the 4k token limit.

I agree with this, this is what I recommended yesterday as well: https://github.com/paul-gauthier/aider/issues/716

paul-gauthier commented 4 months ago

It may indeed be useful to add some "continue" functionality. But the root cause of the problem is that Sonnet has been wasting thousands of tokens by outputting tons of unchanging code. It's important to address that root cause first, before assessing the need for additional workarounds.

Yona544 commented 4 months ago

Model claude-3-5-sonnet-20240620 has hit a token limit!
Token counts below are approximate.

Input tokens: ~24,884 of 200,000
Output tokens: ~3,027 of 4,096 -- possibly exceeded output limit!
Total tokens: ~27,911 of 200,000

Aider v0.40.1
Models: claude-3-5-sonnet-20240620 with diff edit format, weak model claude-3-haiku-20240307
Git repo: \\192.168.0.35\YonaVmDataShare\Projects\posexport.git with 28 files
Repo-map: using 1024 tokens

.aider.chat.history.md

Emasoft commented 4 months ago

I don't think that any prompt would make Sonnet output less code. Or at least not without negative consequences. Maybe the reason it's good at coding is the same reason why it outputs more code. Perhaps rewriting part of the code helps it be more accurate when predicting the next token. Token prediction is based on what precedes it after all. Maybe the best solution would be to implement the "continue" trick.

vadi2 commented 4 months ago

I have to agree. When using Sonnet outside of aider, it's a breath of fresh air when it returns the entire snippet of code instead of just giving you a diff.

cngarrison commented 4 months ago

It may indeed be useful to add some "continue" functionality. But the root cause of the problem is that Sonnet has been wasting thousands of tokens by outputting tons of unchanging code. It's important to address that root cause first, before assessing the need for additional workarounds.

If you can get Sonnet to be more terse, that's a good quick win.

But the problem of exceeding the output token limit will always be there, even if it happens less frequently (e.g. I asked aider for a new file today, and there was no diffing to make it smaller). Getting Claude to continue speaking will solve that.

If you can suggest which file you think the "loop & continue" logic belongs in, I can create a PR for it. (Python isn't my native language.)

unphased commented 4 months ago

I'm hitting the token limit because I'm trying to do too many things in one go, but usually the issue is that Sonnet includes too much context.

Example I just hit:

src/config.rs

 <<<<<<< SEARCH
 pub struct Config {
     pub available_terminal_width: usize,
     pub background_color_extends_to_terminal_width: bool,
     pub blame_code_style: Option<Style>,
     pub blame_format: String,
     pub blame_separator_format: BlameLineNumbers,
     pub blame_palette: Vec<String>,
     pub blame_separator_style: Option<Style>,
     pub blame_timestamp_format: String,
     pub blame_timestamp_output_format: Option<String>,
     pub color_only: bool,
     pub commit_regex: Regex,
     pub commit_style: Style,
     pub cwd_of_delta_process: Option<PathBuf>,
     pub cwd_of_user_shell_process: Option<PathBuf>,
     pub cwd_relative_to_repo_root: Option<String>,
     pub decorations_width: cli::Width,
     pub default_language: String,
     pub diff_stat_align_width: usize,
     pub error_exit_code: i32,
     pub file_added_label: String,
     pub file_copied_label: String,
     pub file_modified_label: String,
     pub file_removed_label: String,
     pub file_renamed_label: String,
     pub file_regex_replacement: Option<RegexReplacement>,
     pub right_arrow: String,
     pub file_style: Style,
     pub git_config: Option<GitConfig>,
     pub git_minus_style: Style,
     pub git_plus_style: Style,
     pub grep_context_line_style: Style,
     pub grep_file_style: Style,
     pub classic_grep_header_file_style: Style,
     pub classic_grep_header_style: Style,
     pub ripgrep_header_style: Style,
     pub grep_line_number_style: Style,
     pub grep_match_line_style: Style,
     pub grep_match_word_style: Style,
     pub grep_output_type: Option<GrepType>,
     pub grep_separator_symbol: String,
     pub handle_merge_conflicts: bool,
     pub hunk_header_file_style: Style,
     pub hunk_header_line_number_style: Style,
     pub hunk_header_style_include_file_path: HunkHeaderIncludeFilePath,
     pub hunk_header_style_include_line_number: HunkHeaderIncludeLineNumber,
     pub hunk_header_style_include_code_fragment: HunkHeaderIncludeCodeFragment,
     pub hunk_header_style: Style,
     pub hunk_label: String,
     pub hyperlinks_commit_link_format: Option<String>,
     pub hyperlinks_file_link_format: String,
     pub hyperlinks: bool,
     pub inline_hint_style: Style,
     pub inspect_raw_lines: cli::InspectRawLines,
     pub keep_plus_minus_markers: bool,
     pub line_buffer_size: usize,
     pub line_fill_method: BgFillMethod,
     pub line_numbers_format: LeftRight<String>,
     pub line_numbers_style_leftright: LeftRight<Style>,
     pub line_numbers_style_minusplus: MinusPlus<Style>,
     pub line_numbers_zero_style: Style,
     pub line_numbers: bool,
     pub styles_map: Option<HashMap<style::AnsiTermStyleEqualityKey, Style>>,
     pub max_line_distance_for_naively_paired_lines: f64,
     pub max_line_distance: f64,
     pub max_line_length: usize,
     pub merge_conflict_begin_symbol: String,
     pub merge_conflict_ours_diff_header_style: Style,
     pub merge_conflict_theirs_diff_header_style: Style,
     pub merge_conflict_end_symbol: String,
     pub minus_emph_style: Style,
     pub minus_empty_line_marker_style: Style,
     pub minus_file: Option<PathBuf>,
     pub minus_non_emph_style: Style,
     pub minus_style: Style,
     pub navigate_regex: Option<String>,
     pub navigate: bool,
     pub null_style: Style,
     pub null_syntect_style: SyntectStyle,
     pub pager: Option<String>,
     pub paging_mode: PagingMode,
     pub plus_emph_style: Style,
     pub plus_empty_line_marker_style: Style,
     pub plus_file: Option<PathBuf>,
     pub plus_non_emph_style: Style,
     pub plus_style: Style,
     pub relative_paths: bool,
     pub show_themes: bool,
     pub side_by_side_data: side_by_side::SideBySideData,
     pub side_by_side: bool,
     pub syntax_set: SyntaxSet,
     pub syntax_theme: Option<SyntaxTheme>,
     pub tab_cfg: utils::tabs::TabCfg,
     pub tokenization_regex: Regex,
     pub true_color: bool,
     pub truncation_symbol: String,
     pub whitespace_error_style: Style,
     pub wrap_config: WrapConfig,
     pub zero_style: Style,
 }
 =======
 pub struct Config {
     pub available_terminal_width: usize,
     pub background_color_extends_to_terminal_width: bool,
     pub blame_code_style: Option<Style>,
     pub blame_format: String,
     pub blame_separator_format: BlameLineNumbers,
     pub blame_palette: Vec<String>,
     pub blame_separator_style: Option<Style>,
     pub blame_timestamp_format: String,
     pub blame_timestamp_output_format: Option<String>,
     pub color_only: bool,
     pub commit_regex: Regex,
     pub commit_style: Style,
     pub cwd_of_delta_process: Option<PathBuf>,
     pub cwd_of_user_shell_process: Option<PathBuf>,
     pub cwd_relative_to_repo_root: Option<String>,
     pub decorations_width: cli::Width,
     pub default_language: String,
     pub diff_stat_align_width: usize,
     pub error_exit_code: i32,
     pub file_added_label: String,
     pub file_copied_label: String,
     pub file_modified_label: String,
     pub file_removed_label: String,
     pub file_renamed_label: String,
     pub file_regex_replacement: Option<RegexReplacement>,
     pub right_arrow: String,
     pub file_style: Style,
     pub git_config: Option<GitConfig>,
     pub git_minus_style: Style,
     pub git_plus_style: Style,
     pub grep_context_line_style: Style,
     pub grep_file_style: Style,
     pub classic_grep_header_file_style: Style,
     pub classic_grep_header_style: Style,
     pub ripgrep_header_style: Style,
     pub grep_line_number_style: Style,
     pub grep_match_line_style: Style,
     pub grep_match_word_style: Style,
     pub grep_output_type: Option<GrepType>,
     pub grep_separator_symbol: String,
     pub handle_merge_conflicts: bool,
     pub hunk_header_file_style: Style,
     pub hunk_header_line_number_style: Style,
     pub hunk_header_style_include_file_path: HunkHeaderIncludeFilePath,
     pub hunk_header_style_include_line_number: HunkHeaderIncludeLineNumber,
     pub hunk_header_style_include_code_fragment: HunkHeaderIncludeCodeFragment,
     pub hunk_header_style: Style,
     pub hunk_label: String,
     pub hyperlinks_commit_link_format: Option<String>,
     pub hyperlinks_file_link_format: String,
     pub hyperlinks: bool,
     pub inline_hint_style: Style,
     pub inspect_raw_lines: cli::InspectRawLines,
     pub keep_plus_minus_markers: bool,
     pub line_buffer_size: usize,
     pub line_fill_method: BgFillMethod,
     pub line_numbers_format: LeftRight<String>,
     pub line_numbers_style_leftright: LeftRight<Style>,
     pub line_numbers_style_minusplus: MinusPlus<Style>,
     pub line_numbers_zero_style: Style,
     pub line_numbers: bool,
     pub styles_map: Option<HashMap<style::AnsiTermStyleEqualityKey, Style>>,
     pub max_line_distance_for_naively_paired_lines: f64,
     pub max_line_distance: f64,
     pub max_line_length: usize,
     pub merge_conflict_begin_symbol: String,
     pub merge_conflict_ours_diff_header_style: Style,
     pub merge_conflict_theirs_diff_header_style: Style,
     pub merge_conflict_end_symbol: String,
     pub minus_emph_style: Style,
     pub minus_empty_line_marker_style: Style,
     pub minus_file: Option<PathBuf>,
     pub minus_non_emph_style: Style,
     pub minus_style: Style,
     pub navigate_regex: Option<String>,
     pub navigate: bool,
     pub null_style: Style,
     pub null_syntect_style: SyntectStyle,
     pub pager: Option<String>,
     pub paging_mode: PagingMode,
     pub plus_emph_style: Style,
     pub plus_empty_line_marker_style: Style,
     pub plus_file: Option<PathBuf>,
     pub plus_non_emph_style: Style,
     pub plus_style: Style,
     pub relative_paths: bool,
     pub show_themes: bool,
     pub side_by_side_data: side_by_side::SideBySideData,
     pub side_by_side: bool,
     pub syntax_set: SyntaxSet,
     pub syntax_theme: Option<SyntaxTheme>,
     pub tab_cfg: utils::tabs::TabCfg,
     pub tokenization_regex: Regex,
     pub true_color: bool,
     pub truncation_symbol: String,
     pub whitespace_error_style: Style,
     pub wrap_config: WrapConfig,
     pub zero_style: Style,
     pub file_decorator_filter: Option<String>,
     pub file_decorator_hyperlink_format: Option<String>,
 }
 >>>>>>> REPLACE

(code is from the repo https://github.com/dandavison/delta)

I applied some natural language coaxing as follows and you can see it was successful.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
src/config.rs src/delta.rs src/handlers/diff_header.rs
> You should be able to see that you wrote too much, exceeding the token limit. PLEASE DO NOT PLACE EXCESSIVE CONTEXT IN THE SEARCH REPLACE BLOCKS. YOU SHOULD TRUNCATE that block for config so that we don't waste thousands of tokens on unchanging code. KEEP THE SEARCH REPLACE BLOCKS TERSE !!!PLEASE!!!. That will allow us to do larger tasks in one go. Trust that the diff/patch algorithm will find the appropriate code you're referencing. Let's try this again.

I apologize for the previous verbose response. You're right, I should keep the SEARCH/REPLACE blocks concise. Let's implement the requested changes step by step.

First, let's add a new configuration option in src/config.rs:

src/config.rs

 <<<<<<< SEARCH
 pub struct Config {
     pub available_terminal_width: usize,
     pub background_color_extends_to_terminal_width: bool,
     pub blame_code_style: Option<Style>,
     pub blame_format: String,
 =======
 pub struct Config {
     pub available_terminal_width: usize,
     pub background_color_extends_to_terminal_width: bool,
     pub blame_code_style: Option<Style>,
     pub blame_format: String,
     pub file_decorator_filter: Option<String>,
 >>>>>>> REPLACE

Now, let's modify src/delta.rs to use this new configuration option:

src/delta.rs

 <<<<<<< SEARCH
 impl<'a> StateMachine<'a> {
     pub fn new(writer: &'a mut dyn Write, config: &'a Config) -> Self {
         Self {
             line: "".to_string(),
             raw_line: "".to_string(),
             state: State::Unknown,
 =======
 impl<'a> StateMachine<'a> {
     pub fn new(writer: &'a mut dyn Write, config: &'a Config) -> Self {
         Self {
             line: "".to_string(),
             raw_line: "".to_string(),
             state: State::Unknown,
             file_decorator_filter: config.file_decorator_filter.clone(),
 >>>>>>> REPLACE

Finally, let's update src/handlers/diff_header.rs to use the filter:

src/handlers/diff_header.rs

 <<<<<<< SEARCH
 pub fn write_generic_diff_header_header_line(
     line: &str,
     raw_line: &str,
     painter: &mut Painter,
     mode_info: &mut String,
     config: &Config,
 ) -> std::io::Result<()> {
     // If file_style is "omit", we'll skip the process and print nothing.
     // However in the case of color_only mode,
     // we won't skip because we can't change raw_line structure.
     if config.file_style.is_omitted && !config.color_only {
         return Ok(());
     }
 =======
 pub fn write_generic_diff_header_header_line(
     line: &str,
     raw_line: &str,
     painter: &mut Painter,
     mode_info: &mut String,
     config: &Config,
 ) -> std::io::Result<()> {
     if config.file_style.is_omitted && !config.color_only {
         return Ok(());
     }
     if let Some(filter) = &config.file_decorator_filter {
         if !line.contains(filter) {
             return Ok(());
         }
     }
 >>>>>>> REPLACE

These changes add a new configuration option for filtering file decorators and implement basic filtering logic. You may need to add more complex
filtering or hyperlink generation based on your specific requirements.

Applied edit to src/handlers/diff_header.rs
Applied edit to src/delta.rs
Applied edit to src/config.rs
Commit a44b40f Implemented file decorator filter in config and delta state machine.
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
src/config.rs src/delta.rs src/handlers/diff_header.rs
>
paul-gauthier commented 4 months ago

The main branch has experimental support for continuing Sonnet responses that hit the 4k output limit. It should happen seamlessly without any user intervention required. You can install the latest like this:

python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git

If you have a chance to try it, please let me know if it works better for you.
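For readers wondering what "seamlessly" means mechanically, the usual shape is: inspect why the response stopped and, if it hit the output cap, re-send with the partial answer prefilled as an assistant turn, looping until completion. The sketch below is not aider's actual implementation; the client is a fake stand-in, though the stop_reason values ("end_turn" / "max_tokens") match Anthropic's Messages API.

```python
from dataclasses import dataclass

@dataclass
class FakeResponse:
    text: str
    stop_reason: str  # "end_turn" or "max_tokens", as in Anthropic's API

class FakeClient:
    """Stand-in for an API client that can emit at most 8 chars per call,
    like a miniature version of the 4096-token output cap."""
    def __init__(self, full_answer):
        self.full_answer = full_answer

    def create(self, messages):
        # Resume from whatever the prefilled assistant turn already contains.
        already = messages[-1]["content"] if messages[-1]["role"] == "assistant" else ""
        chunk = self.full_answer[len(already):len(already) + 8]
        done = len(already) + len(chunk) >= len(self.full_answer)
        return FakeResponse(chunk, "end_turn" if done else "max_tokens")

def complete_with_continuation(client, prompt):
    """Loop until the model stops for a reason other than the output cap,
    stitching the chunks into one answer."""
    messages = [{"role": "user", "content": prompt}]
    answer = ""
    while True:
        resp = client.create(messages)
        answer += resp.text
        if resp.stop_reason != "max_tokens":
            return answer
        messages = [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": answer},  # prefill partial reply
        ]

client = FakeClient("def hello():\n    print('hi')\n")
print(complete_with_continuation(client, "write hello"))  # prints the full stitched answer
```

With a real client, the same loop keys off the response's stop_reason and the stitched text is then parsed for SEARCH/REPLACE blocks as usual.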

cngarrison commented 4 months ago

If you have a chance to try it, please let me know if it works better for you.

I asked Claude to add some swagger doc comments to all endpoints, lots of tokens - it worked a treat.

The terminal output got "messed up" when Claude was first asked to keep speaking; I wasn't paying close enough attention to describe the symptoms, but it was in the middle of a diff and the diff formatting disappeared when speaking resumed.

The diffs applied cleanly though. (One of the diffs needed to be regenerated due to errors, but that worked normally.)

So aside from some UI glitches, the "keep speaking" feature is working great for me. 👍

Thanks heaps for this feature - it's a real game changer. I can be more ambitious when asking Claude for larger features.

paul-gauthier commented 4 months ago

The main branch has fixes for the rendering glitches that have been happening when aider asks Sonnet to continue its response.

Emasoft commented 4 months ago

Tested 0.40.7. Works like a charm. It is a monster: it refactors files of any size like it is nothing. The continue trick with Sonnet is truly the holy grail. Aider beats GitHub Copilot and Cursor hands down. I'm going to cancel both subscriptions.

ackoi commented 4 months ago

The main branch has experimental support for continuing Sonnet responses that hit the 4k output limit. It should happen seamlessly without any user intervention required. You can install the latest like this:

python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git

If you have a chance to try it, please let me know if it works better for you.

Much appreciated, giving it a try today!

paul-gauthier commented 4 months ago

This all went out in v0.41.0 today.

I'm going to close this issue for now, but feel free to add a comment here and I will re-open or file a new issue any time.