Describe the solution you'd like
When using a custom bot, the maximum context window is reduced to 2048 tokens because the bot is fully managed by Cohere. Why not use Claude 3 to condense the input question to at most 2048 tokens, so that client-side errors are avoided when the user submits a large input, and then send the condensed question to Cohere?
I have already discussed this with an AWS Solutions Architect in France, who agreed with the approach.
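The pre-processing step could look roughly like the sketch below. All names here (`estimate_tokens`, `prepare_question`, the `condense` callable) are hypothetical, not part of the project's actual API: if the question fits within the 2048-token limit it is forwarded unchanged; otherwise a Claude 3 summarization call (represented by the injected `condense` callable) shortens it first.

```python
# Hedged sketch of the proposed flow; names are illustrative only.
MAX_TOKENS = 2048

def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token for English text);
    # a real implementation would use the model's tokenizer.
    return len(text) // 4

def prepare_question(question: str, condense) -> str:
    """Return the question unchanged if it fits the limit,
    otherwise condense it first.

    `condense` stands in for a Claude 3 summarization call
    (e.g. via Amazon Bedrock) expected to return text
    under MAX_TOKENS.
    """
    if estimate_tokens(question) <= MAX_TOKENS:
        return question
    return condense(question)
```

In practice, `condense` would invoke Claude 3 through Amazon Bedrock with a prompt asking it to summarize the question within the token budget before the result is sent to the Cohere-managed bot.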
Why the solution is needed
Avoid errors when using RAG for users who submit inputs larger than the context window.
Additional context
Implementation feasibility
Are you willing to discuss the solution with us, decide on the approach, and assist with the implementation?