AbanteAI / mentat

Mentat - The AI Coding Assistant
https://mentat.ai
Apache License 2.0
2.42k stars 226 forks source link

Anthropic message normalization #537

Closed jakethekoenig closed 4 months ago

jakethekoenig commented 4 months ago

System messages are converted to annotated user messages and repeated messages are concatenated.

This is sort of an ugly PR. I'd rather convert system messages to user messages natively and then have no difference between how gpt/claude are handled. I think a better philosophy than using system messages to inject information from the environment would be to tell the llm its talking to a bot that is relaying information from the end user and the environment and then have all system messages besides the parser prompt be user messages annotated with "Code message" and annotate the actual user messages as "End user message:".

But I kind of want to get this out quickly because it seems clear claude 3 is better than gpt-4 and I want to make it possible for other people to use it.

Pull Request Checklist

mentatbot[bot] commented 4 months ago

MENTAT CODE REVIEW IN ACTIVE DEVELOPMENT. Only in use on mentat and internal repos. Please Reply with feedback.

The pull request introduces important functionality for supporting alternative models, specifically Anthropic models, which is a valuable addition. However, there are areas where the documentation could be more explicit or detailed to help users better understand how to use these models with Mentat. Additionally, the implementation makes assumptions about model names that could potentially lead to unintended behavior. It's also recommended to consider the configurability of hardcoded values to accommodate a wider range of use cases.