Closed 0ptim closed 1 year ago
I just committed an implementation to gather all governance proposals. The implemented function lists only the proposal ID and the proposal title. The unimportant parts can be left out or queried via the get governance proposal tool.
But there is a problem: the context window is still too small.
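The trimming described above could look roughly like this. This is a minimal sketch; the field names `proposalId` and `title` are assumptions for illustration, not the actual DefichainPython schema:

```python
def summarize_proposals(proposals):
    """Reduce full proposal records to just ID and title to save context tokens.

    The remaining fields (votes, amounts, status, ...) can still be fetched
    per proposal via the separate get-governance-proposal tool.
    """
    return [{"proposalId": p["proposalId"], "title": p["title"]} for p in proposals]
```

The tool would then return this reduced list to the model instead of the full records.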
I can think of two solutions for this problem:
Implementing solution two would make the application future-proof for even larger data volumes. But we can also assume that context windows will grow over time with technical progress.
What do you think @0ptim?
Hey @eric-volz
I'll soon test with the new OpenAI model `gpt-3.5-turbo-16k-0613`, which has a 16k token window. Maybe this could solve our problems in the meantime.
But even then, we have to think about our approach in general.
What happens in the future when there are even more proposals in total which are added over time?
Also, it's not very nice to return huge outputs from the tools.
Either you or I would need to make a proof of concept.
Let's assume the user asks: "What was the outcome of the proposal about this podcast called 'in the market'?" The tool input would then be something like "podcast in the market".
Option 1: String comparison
We just use all the words from the tool input and perform a normal search over all proposals, returning only the ones that match. This method is nice because it's easy to do, is fast, and does not require an API call to OpenAI. But it may often fail to find the searched proposal.
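A sketch of Option 1 could be as simple as word overlap between the tool input and each proposal title, ranked by the number of shared words. All names here are hypothetical:

```python
def keyword_search(query, proposals, min_overlap=1):
    """Rank proposals by how many query words appear in their title."""
    query_words = set(query.lower().split())
    scored = []
    for p in proposals:
        title_words = set(p["title"].lower().split())
        overlap = len(query_words & title_words)
        if overlap >= min_overlap:
            scored.append((overlap, p))
    # Highest word overlap first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored]
```

One caveat: common words like "in" and "the" inflate the overlap, so in practice a stopword filter would probably be needed to reduce false positives.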
Option 2: Similarity search
We create embeddings for the tool input and for all proposals. We then do a normal similarity search over the vectors and only return the top n results.
This method is more complex and slower, but could greatly improve how reliably the tool finds the right proposals.
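Option 2 could then rank proposals by cosine similarity between an embedding of the tool input and precomputed proposal embeddings. The sketch below assumes the vectors were already created elsewhere (e.g. via OpenAI's embeddings endpoint); the ranking step itself needs no API call, and all names are made up for illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_n_proposals(query_vector, embedded_proposals, n=3):
    """embedded_proposals: list of (proposal, vector) pairs with precomputed embeddings."""
    ranked = sorted(
        embedded_proposals,
        key=lambda pair: cosine_similarity(query_vector, pair[1]),
        reverse=True,
    )
    return [proposal for proposal, _ in ranked[:n]]
```

Returning only the top n keeps the tool output small regardless of how many proposals exist in total, which addresses the context-window concern directly.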
I suggest we try Option 1 first and see how it goes. If it works fine, we can leave it at that. If not, we'll explore Option 2.
If you want to explore embeddings anyway, you can of course go straight to Option 2.
Let me know if you need help, or if I should take over from here in case you don't have time.
PS: I created a draft PR #63 which points to `vNext`. You'll need to merge `vNext` into your branch.
Thank you for the detailed description.
I agree with you. Let's try Option 1 first. But this will only be possible in the future with a bigger context window.
Let's save all the extra features for later and finish the release with this for now.
Why is option 1 only possible with a bigger context window? Only the matching proposals will be returned to the LLM. This should only be a few.
Yes, let's try to finish this one so the new version is completed.
Close the issue when PR is merged.
> Why is option 1 only possible with a bigger context window? Only the matching proposals will be returned to the LLM. This should only be a few.
Sorry, I misunderstood you. You are right.
> Close the issue when PR is merged.
Sorry for closing the issue early! :(
> Sorry, I misunderstood you. You are right.
Sure np!
> Sorry for closing the issue early! :(
Np all good 😄
Add more tools which provide on-chain data via the Ocean API.
Implementing this using the DefichainPython library by @eric-volz.
Maybe something to consider: https://github.com/hwchase17/langchain/pull/5050