ubiquity / ubiquibot

Putting the 'A' in 'DAO'
https://github.com/marketplace/ubiquibot
MIT License
17 stars 61 forks source link

Pull Request Agent #690

Open 0x4007 opened 1 year ago

0x4007 commented 1 year ago

https://github.com/Codium-ai/pr-agent

Here's a copy in case the repo goes down for some reason: pr-agent-main.zip

kamaalsultan commented 1 year ago

Perhaps we can use this code to get ChatGPT to look at the issue specification, then look at the pull request, and ask if it is likely to solve it the specification.

I am not sure if ChatGPT will solve this successfully.

I'm unsure if we can incorporate but at least we can study it, and possibly wrap some of its capabilities by including it as a submodule.

I think we can add some features like describing or reviewing pr from https://github.com/Codium-ai/pr-agent I think we can also make the bot estimates timeline automatically.

0x4007 commented 1 year ago

I think we can also make the bot estimates timeline automatically.

I put some thought into this but I think we will require a vector database and embeddings of every issue in specific repositories and do some machine learning to improve the bot's prediction capabilities. Fortunately OpenAI has models that can create embeddings, like Ada.

kamaalsultan commented 1 year ago

I put some thought into this but I think we will require a vector database and embeddings of every issue in specific repositories and do some machine learning to improve the bot's prediction capabilities. Fortunately OpenAI has models that can create embeddings, like Ada.

Since we didn't use embedding for measuring similarity, I don't feel the necessity of embedding for est time line. But you are planning to use embedding and vector store, we can make embeddings of every issues and increase accuracy in measuring similarity and find it useful for other purposes. If you are concerning machine learning for time estimating, I guess we can gather labelled training data from GitHub open source repositories .

kamaalsultan commented 1 year ago

I think it is not proper to apply machine learning to raw labelled data (original issue text and the time took to solve the issue) since I can't find direct relationship between the timeline and issue text itself. I think it would be better to extract features from issues (if the issue is bug or new feature and estimated steps to solve the issue and so on...) and apply machine learning to calculate est time from the features.

Keyrxng commented 1 year ago

If this is up for grabs I'm interested in jumping on it.

My intention is to use SuperAGI and use the pr_agent as a toolkit that our agent will use but we can easily introduce other constraints, goals and actions. At least for prototyping this would be the quickest way I expect.

SuperAGI already has a bunch of toolkit integrations, webscraping, email, github etc and creating custom ones is relatively easy, I should be able to extract what we'd need from the pr_agent and it can easily be built atop or extended.

What's the consensus on SuperAGI?

0x4007 commented 1 year ago

If this is up for grabs I'm interested in jumping on it.

This isn't priced because it is a draft bounty. I think that

Perhaps we can use this code to get ChatGPT to look at the issue specification, then look at the pull request, and ask if it is likely to solve it the specification.

is an interesting deliverable but probably needs to be broken down more which I can't do at the moment.

I'm not keen on adding new dependencies but I'll need to research your recommendation at another time and let you know.


Update: SuperAGI seems interesting. I don't have the chance to look through their code but it could make sense as a foundation for our AI powered features.