x3n0cr4735 / celi

CELI: A Framework for Controller-Embedded Language Interactions
MIT License
13 stars 5 forks source link

Sections prematurely terminated from repeated messages #74

Closed jon-holman closed 4 months ago

jon-holman commented 4 months ago

I have a few circumstances where CELI is identifying repeated messages and terminating a section prematurely. Message I'm getting:

celi_framework.core.section_processor WARNING - Identified a loop in 1. Identical messages are repeating. pop_context and moving on to the next section.

For some context, my project is to draft a document, and for one of the sections of that document I want to update a section based on information from one new source document at a time. So in my job description, I have a short series of tasks I loop through until this process has been done for all source documents. I've included a counter of the total iterations required, current iteration, and remaining iterations to ensure progress is being made. The loop length is well-defined, so progress is relatively easy to track.

1) Sometimes I get a blank message return, after which CELI reorients itself and continues with the tasks as instructed. This isn't a problem itself, but if it happens 3 times across the entire section I'll get the mentioned message and the section will be terminated. This is by far the most common case where my loop gets broken. *One fix for this might be to look at the last n (I believe 3 based on the current setup) messages instead of through the entire message history

2) I have some tasks as part of the loop that are very simple, for example doing a simple check or calling a function. The tasks are simple enough that sometimes the output message is exactly the same. This can also trigger the loop identification logic and terminate the section. *A possibility to fix this may be to look for repeated consecutive sequences of messages. That could still capture unwanted loops while leaving loops with a counter or some other progress metric to run.

DaveDeCaprio commented 4 months ago

This makes sense. I just did a PR that will only check the recent messages for duplicates. That should address this. We’ll get a new version out hopefully this weekend with the fixes.

From: jon-holman @.> Sent: Friday, June 21, 2024 3:01 PM To: x3n0cr4735/celi @.> Cc: Subscribed @.***> Subject: [x3n0cr4735/celi] Sections prematurely terminated from repeated messages (Issue #74)

I have a few circumstances where CELI is identifying repeated messages and terminating a section prematurely. Message I'm getting:

celi_framework.core.section_processor WARNING - Identified a loop in 1. Identical messages are repeating. pop_context and moving on to the next section.

For some context, my project is to draft a document, and for one of the sections of that document I want to update a section based on information from one new source document at a time. So in my job description, I have a short series of tasks I loop through until this process has been done for all source documents. I've included a counter of the total iterations required, current iteration, and remaining iterations to ensure progress is being made. The loop length is well-defined, so progress is relatively easy to track.

  1. Sometimes I get a blank message return, after which CELI reorients itself and continues with the tasks as instructed. This isn't a problem itself, but if it happens 3 times across the entire section I'll get the mentioned message and the section will be terminated. This is by far the most common case where my loop gets broken. *One fix for this might be to look at the last n (I believe 3 based on the current setup) messages instead of through the entire message history
  2. I have some tasks as part of the loop that are very simple, for example doing a simple check or calling a function. The tasks are simple enough that sometimes the output message is exactly the same. This can also trigger the loop identification logic and terminate the section. *A possibility to fix this may be to look for repeated consecutive sequences of messages. That could still capture unwanted loops while leaving loops with a counter or some other progress metric to run.

— Reply to this email directly, view it on GitHub https://github.com/x3n0cr4735/celi/issues/74 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGNLOXNQFDCXDNGTICBMC3ZISA6FAVCNFSM6AAAAABJWVNC5GVHI2DSMVQWIX3LMV43ASLTON2WKOZSGM3DOMJWHE3DAMQ . You are receiving this because you are subscribed to this thread. https://github.com/notifications/beacon/AAGNLOQ7G2N5QVTD6VFHZSTZISA6FA5CNFSM6AAAAABJWVNC5GWGG33NNVSW45C7OR4XAZNFJFZXG5LFVJRW63LNMVXHIX3JMTHI2GBEII.gif Message ID: @. @.> >

jon-holman commented 4 months ago

Two pain points I'm still experiencing here:

DaveDeCaprio commented 4 months ago

Hi Jon, I just pushed a duplicates-2 branch that improves the duplicate handling a bit. I raised the limit to 5 duplicates in a row and added a second check explicitly for alternating duplicates.

On the improper function calls, I think the explicit check for alternating duplicates should fix that as the function calls will have different parameters.

Let me know if those fixes work for you. Also, do you need a pip release for this or do you work directly off the source?

Dave

From: jon-holman @.> Sent: Tuesday, June 25, 2024 8:54 AM To: x3n0cr4735/celi @.> Cc: David DeCaprio @.>; State change @.> Subject: Re: [x3n0cr4735/celi] Sections prematurely terminated from repeated messages (Issue #74)

Two pain points I'm still experiencing here:

— Reply to this email directly, view it on GitHub https://github.com/x3n0cr4735/celi/issues/74#issuecomment-2189025539 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGNLORR5UPNMDJZS6JVW43ZJFY5VAVCNFSM6AAAAABJWVNC5GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCOBZGAZDKNJTHE . You are receiving this because you modified the open/close state. https://github.com/notifications/beacon/AAGNLOTIWY6WW4D4BNQXXMDZJFY5VA5CNFSM6AAAAABJWVNC5GWGG33NNVSW45C7OR4XAZNMJFZXG5LFINXW23LFNZ2KUY3PNVWWK3TUL5UWJTUCPHQQG.gif Message ID: @. @.> >

jon-holman commented 4 months ago

Great, thanks! I agree this should solve both issues. I'm working directly off of source so I'll work off of that branch for now.

Should the loop be checking previous_message instead of last_message to increment the alternating duplicates since last_message is being checked for regular duplicates? Right now I think we're just doing a similar count of duplicates of the most recent message rather than the second to last message.

So the sequence ['A', 'A', 'A', 'A', 'A', 'B', 'A'] will pass the first check (4 A's and 1 B in the 5 messages before the most recent message) but fail the second (4 A's in the 5 messages before the most recent message and 5 A's in the 5 messages before the 2nd most recent message).

def check_for_duplicates(
        self, ongoing_chat: List[Dict[str, str] | Tuple[str, str]]
    ):
        if len(ongoing_chat) == 0:
            return False
        last_message = ongoing_chat[-1]
        duplicates = 0
        for message in ongoing_chat[-6:-1]:
            if message == last_message:
                duplicates += 1
        if duplicates >= 5:
            return True
        # Check for cycles of 2 repeating messages
        if len(ongoing_chat) > 2:
            previous_message = ongoing_chat[-2]
            alternating_duplicates = 0
            if previous_message != last_message:
                for message in ongoing_chat[-7:-2]:
                    if message == **last_message**: #Should this be comparing to previous_message i.e. the 2nd most recent message?
                        alternating_duplicates += 1
                    if duplicates >= 3 and alternating_duplicates >= 3:
                        return True
        return False
jon-holman commented 4 months ago

As I look at it again, even with the change I suggested a true alternating sequence won't be handled properly. So consider ['A', 'B', 'A', 'B', 'A', 'B', 'A'].

Our initial check for duplicates will find only 2 more A's, so that will pass. But then if we look for 3 B's in [-7:-2] we will only find 2 and that will pass as well. (If we leave the code as it is we will still pass since duplicates will be equal to 2 although alternating_duplicates will be 3.)

I think the fix here could be to either:

One other comment: a lot of my "repeated" messages are actually blank assistant messages while it debugs a function call. Whatever comes in that function call section isn't part of the duplicate check. I think that output can be informative for calling out that we're not in a true loop, so if we could merge the assistant and function outputs for the purposes of checking duplicates that could do a lot of the filtering for us. For example, here's something I get commonly:

Assistant:

Function: Call to function foo with arguments returned Error: Unable to parse arguments Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

Which might be followed by

Assistant:

Function: Call to function foo with arguments {'bar': '1'} returned baz

Still can be considered as 2 sequential duplicate messages although I would argue that's not intended.

DaveDeCaprio commented 4 months ago

The function messages will get included in the messages history that is getting checked.

This just needs some good unit tests. I’ll put those in and get it working correctly this morning.

Dave

From: jon-holman @.> Sent: Thursday, June 27, 2024 1:49 PM To: x3n0cr4735/celi @.> Cc: David DeCaprio @.>; State change @.> Subject: Re: [x3n0cr4735/celi] Sections prematurely terminated from repeated messages (Issue #74)

As I look at it again, even with the change I suggested a true alternating sequence won't be handled properly. So consider ['A', 'B', 'A', 'B', 'A', 'B', 'A'].

Our initial check for duplicates will find only 2 more A's, so that will pass. But then if we look for 3 B's in [-7:-2] we will only find 2 and that will pass as well. (If we leave the code as it is we will still pass since duplicates will be equal to 2 although alternating_duplicates will be 3.)

I think the fix here could be to either:

One other comment: a lot of my "repeated" messages are actually blank assistant messages while it debugs a function call. Whatever comes in that function call section isn't part of the duplicate check. I think that output can be informative for calling out that we're not in a true loop, so if we could merge the assistant and function outputs for the purposes of checking duplicates that could do a lot of the filtering for us. For example, here's something I get commonly:

Assistant:

Function: Call to function foo with arguments returned Error: Unable to parse arguments Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

Which might be followed by

Assistant:

Function: Call to function foo with arguments {'bar': '1'} returned baz

Still can be considered as 2 sequential duplicate messages although I would argue that's not intended.

— Reply to this email directly, view it on GitHub https://github.com/x3n0cr4735/celi/issues/74#issuecomment-2195455350 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGNLOWZRMNP47XV77JMLT3ZJRNCFAVCNFSM6AAAAABJWVNC5GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCOJVGQ2TKMZVGA . You are receiving this because you modified the open/close state. https://github.com/notifications/beacon/AAGNLORYQTCDDX5JZBJM2HLZJRNCFA5CNFSM6AAAAABJWVNC5GWGG33NNVSW45C7OR4XAZNMJFZXG5LFINXW23LFNZ2KUY3PNVWWK3TUL5UWJTUC3P6XM.gif Message ID: @. @.> >

DaveDeCaprio commented 4 months ago

I added some tests and updated the logic. I changed the alternate to look for 3 total copies (which is 2 duplicates). I think that should be sufficient. Let me know if it isn't.