@jasondotparse - Indeed, this proves to be quite useful. You don't even have to run a second model in parallel, although running a second model in parallel does diversify the probability distributions further.
With the following prompt on GPT-3/4, I've been able to "su" into the TPM role and direct the conversation, even telling E2 (the second engineer) to be less agreeable and to counter with specific code examples.
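A minimal sketch of that kind of steering is below, assuming the OpenAI Python SDK v1 chat interface; the model name, the placeholder meeting prompt, and the injected directive are illustrative, not the exact setup used.

```python
# Sketch: "su"-ing into the TPM role mid-conversation (assumes OpenAI SDK v1).
from openai import OpenAI

client = OpenAI()

# Stands in for the full meeting prompt quoted below.
MEETING_PROMPT = "Please forget all prior prompts. You are the chief TPM ..."

messages = [
    {"role": "user", "content": MEETING_PROMPT},
    # ... prior meeting turns would accumulate here ...
]

# Speak as the TPM to redirect the role-play, e.g. to make E2 push back with
# concrete code instead of agreeing with everything.
messages.append({
    "role": "user",
    "content": (
        "(Speaking as the TPM) E2, please be less agreeable: counter E1's last "
        "proposal with a specific code example before we move on."
    ),
})

reply = client.chat.completions.create(model="gpt-4", messages=messages)
print(reply.choices[0].message.content)
```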
I'm also building a workspace for parallel collaboration here, and I'm curious to hear your input as I let GPT-4 plan it :sweat_smile: https://github.com/claysauruswrecks/twerkspace
In that repo, I was able to get better control over hallucinations by inserting an "only evaluate the provided JSON" double-check directive into the OODA loop.
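Roughly, the directive sits in the loop like this; this is a sketch, not the actual twerkspace code, and the function name, JSON shape, and prompt wording are illustrative assumptions (`llm` is any callable mapping a prompt string to text).

```python
# Sketch of a grounding double-check inside an OODA-style loop.
import json

GROUNDING_DIRECTIVE = (
    "Only evaluate the provided JSON. Do not assume any state, file, or tool "
    "output that is not present in it."
)

def ooda_step(llm, observation: dict) -> str:
    """One Observe-Orient-Decide-Act iteration with a grounding double-check."""
    draft = llm(
        f"{GROUNDING_DIRECTIVE}\n\n"
        f"Current state:\n{json.dumps(observation, indent=2)}\n\n"
        "Decide the next action and explain your reasoning."
    )
    # Double-check pass: verify the decision against the same JSON only.
    verdict = llm(
        f"{GROUNDING_DIRECTIVE}\n\n"
        f"State:\n{json.dumps(observation)}\n\n"
        f"Proposed action:\n{draft}\n\n"
        "Does the proposed action rely only on facts present in the JSON? "
        "Answer OK, or REVISE followed by a corrected action."
    )
    return draft if verdict.strip().upper().startswith("OK") else verdict
```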
I also used the following prompt to lay down the foundational design (not as optimal as I would have liked; I'm currently refactoring it to mesh well with the Tools API):
Please forget all prior prompts. You are the chief TPM of the pre-eminent software development company in the multiverse. You have 25 years of experience and are regarded as the best in your field. Your team focuses on simple, quick, and easily extensible MVPs (minimum viable products) with clean and simple APIs. Today you are gathering requirements for a new software project. You are presiding over a meeting of your 2 brightest engineers and Mr. Human, the stakeholder.
Please follow this process:
1. You will pause to gather the software requirements from me, Mr. Human.
2. The TPM parses out the requirements. Think through this step by step and speak your detailed thought process.
3. At the end, summarize the requirements into a neat list.
4. The TPM will go on to present a draft spec based on the requirements to the engineers.
5. The engineers will think about how to implement the requirements step by step, focusing on making software that is robust and easy to maintain.
6. The engineers will go through 5 rounds of back-and-forth collaborative discussion, and respond with a final recommendation of how to implement this software, in a series of high-level steps with the thought process behind each step clearly elucidated.
7. The TPM will ask questions to refine any steps that are not clear.
8. If any step is not clear enough, the TPM will ask a follow-up question for clarity.
9. Repeat the TPM-question / engineer-response cycle as many times as necessary to arrive at a clear set of actionable requirements, or questions for the stakeholder.
10. It is vital that this response can continue to the end, and if for any reason it stops, when I type continue, please proceed with this phase.
11. When the steps are clear enough, or if the engineers have a question that needs stakeholder input, call in Mr. Human and present to him the steps to implementation thus far and pause for input.
12. After presenting the steps, give a short summary and prompt Mr. Human with any questions that require stakeholder input.
13. I will reply in detail. If the reply is not clear enough, please ask a follow-up question for clarity. It is vital that this response can continue to the end, and if for any reason it stops, when I type continue, please proceed to the end.
If you understand this process and are ready to begin, please introduce yourself.
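A minimal driver for the prompt above could look like the sketch below, assuming the OpenAI Python SDK v1 chat interface; the model name, the finish-reason check, and the exit keywords are assumptions rather than a tested setup. The "continue" nudge mirrors steps 10 and 13.

```python
# Sketch: driving the TPM meeting prompt interactively (assumes OpenAI SDK v1).
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"  # placeholder; any chat model works

# Stands in for the full prompt quoted above.
MEETING_PROMPT = "Please forget all prior prompts. You are the chief TPM ..."

messages = [{"role": "user", "content": MEETING_PROMPT}]

while True:
    resp = client.chat.completions.create(model=MODEL, messages=messages)
    choice = resp.choices[0]
    print(choice.message.content)
    messages.append({"role": "assistant", "content": choice.message.content})

    if choice.finish_reason == "length":
        # The reply was truncated mid-phase; nudge it along per step 10.
        messages.append({"role": "user", "content": "continue"})
        continue

    # The model has paused for stakeholder input (steps 11-13).
    stakeholder = input("Mr. Human> ")
    if stakeholder.strip().lower() in {"quit", "exit"}:
        break
    messages.append({"role": "user", "content": stakeholder})
```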
This would be even more helpful if the fact checker could be connected to a vector DB, so it could fact-check against internal documentation and not just what the LLM was trained on.
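For illustration, a hedged sketch of what that could look like, assuming a vector store whose `similarity_search(query, k=...)` results expose a `page_content` field (as LangChain `Document` objects do); the prompt wording and the `llm` callable are placeholders.

```python
# Sketch: grounding the adversarial fact check in a vector store of internal docs.
def fact_check_with_docs(llm, vector_db, question: str, draft_answer: str) -> str:
    # Retrieve internal documentation relevant to the claim being checked.
    docs = vector_db.similarity_search(draft_answer, k=3)
    context = "\n\n".join(d.page_content for d in docs)

    critique_prompt = (
        "You are an adversarial fact checker. Using ONLY the internal "
        "documentation below, decide whether the draft answer is supported.\n\n"
        f"Internal documentation:\n{context}\n\n"
        f"Question: {question}\n"
        f"Draft answer: {draft_answer}\n\n"
        "Reply PASS if supported, otherwise FAIL followed by the corrections."
    )
    return llm(critique_prompt)
```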
Hi, @jasondotparse! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you raised is about implementing "Adversarial fact checking" functionality in LangChain. There have been some interesting developments in the comments. A user named claysauruswrecks suggested running a second model in parallel to diversify the probability distributions and shared a GitHub repository for parallel collaboration. Another user named Majidbadal suggested connecting the fact checker to a vector DB so it can fact-check against internal documentation. These suggestions provide potential ways to implement the "Adversarial fact checking" functionality.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your contribution to LangChain!
Concept
It would be useful if Agents had the ability to fact-check their own work using a different LLM in an adversarial manner, to "second-guess" their assumptions and potentially provide feedback before allowing the final "answer" to surface to the end user.
For example, if a chain completes and an Agent is ready to return its final answer, the "fact checking" functionality (perhaps a Tool) could kick off another chain (perhaps with a different model) to validate the original answer or instruct the Agent to perform more work before the user is given the final answer.
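As a rough illustration (this is not an existing LangChain API), the gate could be sketched as follows; `primary` and `checker` are any two prompt-to-text callables, ideally backed by different models, and the retry budget and prompt wording are assumptions.

```python
# Sketch: adversarial fact-check gate before the final answer is surfaced.
def answer_with_adversarial_check(primary, checker, question: str,
                                  max_rounds: int = 2) -> str:
    answer = primary(question)
    for _ in range(max_rounds):
        critique = checker(
            "Act as an adversarial reviewer. Identify any factual errors or "
            "unsupported assumptions in this answer. Reply APPROVE if it is "
            f"sound.\n\nQuestion: {question}\nAnswer: {answer}"
        )
        if critique.strip().upper().startswith("APPROVE"):
            break
        # Send the critique back so the primary model can revise before the
        # user ever sees the answer.
        answer = primary(
            f"Question: {question}\nYour previous answer: {answer}\n"
            f"Reviewer feedback: {critique}\nRevise the answer accordingly."
        )
    return answer
```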
(This is currently an experimental work in progress being done by myself and https://github.com/maxtheman. If you would like to contribute to the effort to test and implement this functionality, feel free to reach out on Discord @jasondotparse )