Data results are included in full in the LLM messages, which easily overflows the token limit.
Investigation showed that even without data results, tokens build up to a substantial size when cycling through the state machine. It makes sense to limit the messages sent to the LLM within the state machine itself, not just at the conversation-history level: within a single run, LLM calls do not need to see everything from the beginning of the state-machine run.
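One way to bound token growth per LLM call is to trim the message list before each call: keep the system prompt, truncate oversized data results, and retain only the most recent messages. The sketch below is a minimal, hypothetical illustration (the function name, message shape, and limits are assumptions, not the actual implementation):

```python
def trim_messages(messages, max_messages=6, max_result_chars=500):
    """Bound the messages sent on each LLM call within a state-machine run.

    - The system prompt is always kept.
    - Oversized tool/data results are truncated to max_result_chars.
    - Only the last max_messages non-system messages are retained.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    trimmed = []
    for m in rest:
        content = m["content"]
        if m["role"] == "tool" and len(content) > max_result_chars:
            # Replace the bulk of a large data result with a marker.
            content = content[:max_result_chars] + " [truncated]"
        trimmed.append({**m, "content": content})

    return system + trimmed[-max_messages:]
```

A state machine would apply this right before each LLM call, so each cycle sees a bounded window of recent context rather than the full run history.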