Closed rguan1 closed 1 year ago
Some questions related to this diff:
core
folder. However, I don't think that is the appropriate place for the openai abstract agent.The reason I want to create an abstract class is so that I can house shared logic such as shared arg parameters and additional helper functions (like token counting and prompt truncation).
Should the name of this agent remain "openai-chat-completions" or should we go with a snappier but more inaccurate gpt4 name? I settled on "openai-chat-completions" to future-proof the code against future model names using the endpoint and because it is a more accurate reflection of the dual duty of querying chatgpt as well as gpt4.
I have this argument counterparty-name and counterparty-role. This works fine unless you use self chat with both conversation partners being chat-completion agents. In this case, the turn history gets confused because both agent-1 and agent-2 believe themselves to be name
rather than counterparty-name
. In other words, both agents think they are the first speaker. Ideally, we'd want agent-2 to recognize that it is counterparty-name
.
Do you have a good way to handle this? The only thing that I can think of is parsing the agent id for an int because self-chat with the same model results in an int appended to the end of the agent id like this:
[Gpt3Agent_1]: Hi there! How can I help?
[Gpt3Agent_2]: I'd like to make a pull request
However, this solution is imperfect since I don't believe it handles n-party convos when n >= 3.
This looks great, thanks for your contribution. To address your questions:
parlai/core/api_agent
) which will be used with any similar models that will come out in the future. You can modify FakeHistory
and add it as something more compatible there as well.@mojtaba-komeili Thanks for reviewing this PR!
Ok, I addressed the smaller nits and suggestions that you highlighted so far. The individual commits in the PR should make it clear what was addressed.
I also made a a first pass at the OpenAI base class: it only abstracts out shared CLI params. Let me know what you think so far. I can add an improved fake history if this is the right approach. Also, should we split up the base API changes into a separate PR? I'm not super well versed in working in PRs, so I am not sure at what point there is too much scope creep within an individual PR. However, I can imagine that tacking this feature onto this PR could make it harder to read and understand the git history rather than having 2 PRs. Let me know if it is ok to keep all of the code in this PR or if it should be split up.
Thanks a lot. This looks great.
Thanks again for your contribution.
Thanks for the suggestions. I think we can definitely get a core history class in, and this time I will give it a better name than FakeHistory haha.
So, if I were to split this PR into 2, this is my proposed split
I think if we split it that way, we can just chronologically split the later commits into a separate PR. Let me know what you think @mojtaba-komeili. Thanks again!
Thanks for the suggestions. I think we can definitely get a core history class in, and this time I will give it a better name than FakeHistory haha.
So, if I were to split this PR into 2, this is my proposed split
- First PR adds the Chat Completion Agent
- Second PR introduces a base openai agent, a base history class, and usage of the base history class (which can handle the turns logic as proposed above).
I think if we split it that way, we can just chronologically split the later commits into a separate PR. Let me know what you think @mojtaba-komeili. Thanks again!
Sorry for late reply. Yeah, that sounds like a good plan to me. The first PR is making a practical change to be used immediately and the second makes the basis for future similar ones. Looking forward to seeing your submission.
The latest change represents the "First PR adds the Chat Completion Agent" as discussed above. In other words, we removed the the 2 latest commits which do the following "base openai agent, a base history class, and usage of the base history class". We will add those in a follow-up PR to keep the commit history cleaner and more readable.
Sorry for the late response, @mojtaba-komeili! Please let me know if anything else needs to be changed before it is good to go.
Patch description We are adding support for the latest OpenAI models like GPT-3.5 (ChatGPT) and GPT-4 by adding a wrapper for the chat completion API.
Testing steps We've manually tested that the wrapper can handle a multi-turn conversation without error-ing which demonstrates that the wrapper is properly storing and querying the API endpoint. We tested conversation capability both with GPT-3.5 and GPT-4 models in both
self-chat
andinteractive
modes.Self Chat
Interactive Mode