ubiquity / ubiquibot

Putting the 'A' in 'DAO'
https://github.com/marketplace/ubiquibot
MIT License
17 stars 60 forks source link

AI: `/review` #746

Open 0x4007 opened 1 year ago

0x4007 commented 1 year ago

It would be useful to get ChatGPT's opinion on a pull request by using a /review command.

Example

I manually asked for ChatGPT to do a review and I think it did a wonderful job. Automate the following:

https://chat.openai.com/share/3fac1c44-7998-4d8a-aee8-603a300aede4

0x4007 commented 1 year ago

@Keyrxng this is for you

wannacfuture commented 1 year ago

Here, the bot didn't assign @Keyrxng to this issue even he linked pr to this issue.

Seems because of wrong function for getting linked prs is being used for this feature. We need to use gitLinkedPrParser or sth.

Should we create a new issue about this? @pavlovcik

Keyrxng commented 1 year ago

@wannacfuture A pleasure ;)

Does the bot have the authority to assign me another bounty above max by itself? I'm just thinking that I have like 4 maybe 5 bounties assigned, could that be playing into it?

0x4007 commented 1 year ago

@wannacfuture A pleasure ;)

Does the bot have the authority to assign me another bounty above max by itself? I'm just thinking that I have like 4 maybe 5 bounties assigned, could that be playing into it?

There's two exceptions for increased limits:

  1. over 24 hours and no reviewer has reviewed your pull request
    • if it is the fault of the review team being too slow to address the pull request, then we should not inhibit the speed of the bounty hunter to continue problem solving.
  2. at least one reviewer approved your pull request
    • the review team needs to figure a consensus on your work amongst themselves.

Any other scenario (a reviewer left a comment or requested changes, and there is no approval from any reviewer) means that you are supposed to focus your attention on addressing those comments or requested changes instead of taking on new tasks. Basically this is designed to guide bounty hunters to finish what they started instead of half completing jobs if the "ball is in their court".

0x4007 commented 1 year ago

Here, the bot didn't assign @Keyrxng to this issue even he linked pr to this issue.

Seems because of wrong function for getting linked prs is being used for this feature. We need to use gitLinkedPrParser or sth.

Should we create a new issue about this? @pavlovcik

0x4007 commented 1 year ago

@Keyrxng

I manually asked for ChatGPT to do a review and I think it did a wonderful job. Automate the following: https://chat.openai.com/share/3fac1c44-7998-4d8a-aee8-603a300aede4

I think the useful prompt, given the context, is specifically "the original issue specification is below. do you think that this pull request satisfies all the requirements?"

Keyrxng commented 1 year ago

@pavlovcik

Did you see my comment on the pr? https://github.com/Keyrxng/UbiquityBotTesting/pull/41#issuecomment-1715300983

So as for uniform styling for any given review are we dictating that or should it appear conversational as your example has shown and can be generated in any layout and not report-esque?

0x4007 commented 1 year ago

Conversational is a lot more concise and readable.

Keyrxng commented 1 year ago

And for relative context should that be omitted and just used for it's own reference and shouldn't be displayed in the report itself?

0x4007 commented 1 year ago

And for relative context

I'm not sure what you are referring to but if you see my example, I give it the only context that it needs which is 1. the spec 2. the full code file and 3. the diff

Keyrxng commented 1 year ago

But from an invocation stand point, all a user must do is type /review and they'll receive the final output from the conversation, they won't be allowed to pass in arbitrary input to affect the output of /review itself

Keyrxng commented 1 year ago

And for relative context

I'm not sure what you are referring to but if you see my example, I give it the only context that it needs which is 1. the spec 2. the full code file and 3. the diff

I'm referring to the tables in my versions where GPT outputs all linked pr and issue context that is relevant to the spec

0x4007 commented 1 year ago

they won't be allowed to pass in arbitrary input to affect the output of /review itself

I don't think its necessary. Maybe useful but I can't think of a reason why it would be when this command is used for its intended purpose.

The intended purposes is to make the job of reviewers less burdensome by running this, and then double-checking once ChatGPT says okay. Technically in the future we can even have the bot post a review comment and actually "review" (approve or request changes)

0x4007 commented 1 year ago

And for relative context

I'm not sure what you are referring to but if you see my example, I give it the only context that it needs which is 1. the spec 2. the full code file and 3. the diff

I'm referring to the tables in my versions where GPT outputs all linked pr and issue context that is relevant to the spec

We must maximize the signal-to-noise ratio or else the developer experience is low quality. All the assignee cares about is if they passed the review. If not, then what exactly they need to do in order to pass.

Keyrxng commented 1 year ago

right sweet I'm on the same page with it now cheers

0x4007 commented 1 year ago

Can you clarify with an example of what your definition of "conversational" is? /ask should be conversational but /review should just look at the code diff and see if it achieves the requirements. Review does not need to see any conversation context. /ask can definitely benefit from reading the conversation context.

Keyrxng commented 1 year ago

Can you clarify with an example of what your definition of "conversational" is?

The example that you gave me is by definition conversational. It's stepping you through the code, referring back to context and taking you on a journey. I've attempted a more bullet point, listed, report-esque approach as I believe that is more concise and requires less reading which for me is an improvement upon reading paragraphs, but that's a preference thing.

/review should just look at the code diff and see if it achieves the requirements. Review does not need to see any conversation context.

Again, I disagree with this a lot. The spec changes as the pr evolves through dialogue on the issue and pr itself so without feeding /review this context you are kneecapping it.

But if you want me to pull this thread then can you answer a few quick ones?

  1. Where is the spec, in linked issue body or in the body of the PR itself?
  2. In any linked issue(s), what is the importance to the current issue, should that be considered or is it black and white, focus 100% solely on 'current' issue spec and consider nothing about linked pr/issues?
  3. In a 3 week old PR, where is the spec if there has been multiple spec additions or changes or is the answer to 1 updated to match?
  4. For one or two file changes the diff is small, but for large PRs which span tens of files and thousands of lines worth of changes, how do you see that being handled in terms of response? Should it still be conversational or should it be report-esque at that point?
  5. Personally I think we should define a layout and have it be uniform across all responses. It's easier to control output, makes reviews scalable, looks cleaner and after a few uses the reviewers would know exactly where to look by means of clear headers, areas etc. How do you feel about a layout even if not going with a report format?

In an ideal world we could do as you did and copy paste the exact specifics we'd need to feed it but that's not the case. If it's a PR which links an issue which it itself links another 2 or 3, then to understand the spec would require that it understand the linked context too. And if we take this issue for instance, if we called review on this using the issue body it would miss all of the structuring we are hashing out here which would lead to inaccurate answers imo.

Keyrxng commented 1 year ago

https://github.com/Keyrxng/UbiquityBotTesting/pull/41#issuecomment-1718230401

Have a look through the responses when you can and lmk which ones are more inline with the vision pav

0x4007 commented 1 year ago

Again, I disagree with this a lot. The spec changes as the pr evolves through dialogue on the issue and pr itself so without feeding /review this context you are kneecapping it.

Officially the specification should be in the original comment only. If it evolved through the conversation, technically it is the responsibility of the issuer to officially update the specification based on those comments. Perhaps we can have a warning caption at the bottom of the review that explains this so that the reviewers are actively aware of this.

Ensure the pull request requirements are in the linked issue's first comment and update it if the scope evolves.

It seems that this addresses most of the complications your questions are asking about.

  1. The spec should always be in the first comment of the linked issue of the pull request. Technically one pull request should only solve a single issue. This makes auditing significantly easier. I understand that other projects may not be as discipline about this but I believe that it should simplify our implementation A LOT if we only need to look for a single linked original specification. Perhaps we can throw an error (refuse to review) to simplify the implementation for now?

...

  1. If we can't fit the pull request in the context limit then we should throw an error to simplify the implementation.
  2. I aim to maximize the signal-to-noise ratio and I have mixed feelings about setting up boilerplate templates. Given my extensive experience providing reviews on GitHub, I didn't see that it makes sense to standardize reviews like this. Instead, just telling the assignee what is the most relevant information has yielded the best outcomes.
0x4007 commented 1 year ago

Keyrxng/UbiquityBotTesting#41 (comment)

Have a look through the responses when you can and lmk which ones are more inline with the vision pav

This is getting pretty good! No summary of changes needed since it only lists files changed. Is there another example where it is more useful information? Otherwise we already have the list of files changed clearly in the GitHub code review UI. It might be worthwhile context to add that the response is on GitHub and that it can replace "the contributor" with their username (no @ to avoid an unnecessary tag notification)

We can also exclude the suggested improvements if it believes that the assignee has met the specification. Those improvements it suggested in the last comment seems extraneous.

One nitpick, but if you prefix a line with a "- " to make it a list item, GitHub automatically expands the issue title which is very convenient context to see. Here's an example:


I just want to note that anything out of scope of the original specification is fair game to spin off into a new issue in order to speed up closing this one out. But I hope that some of the minor quality adjustments (e.g. adjusting the prompt) can be handled within this current task.


Another interesting idea to experiment with in the future:

Take your idea of automatically reviewing on an opened pull request, and if the bot thinks that they did not achieve the specification, to convert it back to a draft. This is a more forceful way to have the assignee consider the recommended changes before turning it back into a "finalized" pull request. Then, the bot can automatically review again.

I like that this approach is very hands-off for the reviewers, but this might be exploited to burn through OpenAI credits if the assignee keeps toggling to "finalized" pull request without making changes.

To combat this, we could tally up a virtual "incentive" for the bot, where every time it checks, it racks up 25 USD. If it is a 100 USD bounty, then this can only happen four times before the pull request is closed for non compliance etc.

Keyrxng commented 1 year ago

I think we are getting there now check these out

So this implementation is pulling only the pr body and issue body as context and comparing against the diff.

I think we can keep things within this bounty that works for me.

Yeah I was thinking along the same lines and in this regard I have tweaked the response so that GPT is speaking directly to the contributor which has improved the output greatly in my opinion.

So even if it is the reviewer that calls it all they'd do is confirm GPTs response and tag the assignee with idk "follow the above" lmao and that's if it isn't used as soon as the pr is turned from draft to ready for review.

Oh the four times and your out is fierce but I do like the idea hahahah, so that would require that /review isn't invoked or at least it's accounted for if the reviewers are the ones spamming the command.

Keyrxng commented 12 months ago

looking at big prs, I mean it covers 13 files and the changes for this test I'm using are:

+ 240 
- 27

It's difficult because it's absolutely dying to print a summary of changed files and logically for these scenarios it's hard to consider another way to step through changes without going through them all, so it's finding a balance for prs that span many files.

My example above with the readme issue shows that it can handle a massive spec and can step through it nice and clean, but handling the context switching for many files is proving tedious af I think I'll experiment with table formatting for multiple files I feel it will be cleaner either that or keep it real short.

Keyrxng commented 12 months ago
PR Summary | File | logically sound? | Matching Spec? | | ---- | ----------- | -------------- | | .gitignore | Yes | Yes | | .gitmodules | Yes | Yes | | dapp/.gitmodules | Yes | Yes | | dapp/contracts/lib/openzeppelin | Yes | Yes | | dapp/contracts/lib/openzeppelinUpgradeable | Yes | Yes | | dapp/contracts/script/Counter.s.sol | No | No | | dapp/contracts/script/CounterDeploy.s.sol | Yes | Yes | | dapp/contracts/src/Counter.sol | Yes | Yes | | dapp/contracts/src/CounterOneChange.sol | Yes | Yes | | dapp/contracts/src/uupsProxy.sol | Yes | Yes | | dapp/contracts/test/Counter.t.sol | Yes | Yes | | dapp/foundry.toml | Yes | Yes | | dapp/wagmi.config.ts | Yes | Yes |

Keyrxng, great job on your pull request! The changes you made to the repository seem to align with the provided specification. I have reviewed each file and found that they are logically sound and match the specification. However, there is one file that does not match the specification:

Overall, your changes look good. Well done!

0x4007 commented 12 months ago
PR Summary | File | logically sound? | Matching Spec? | | ---- | ----------- | -------------- | | .gitignore | Yes | Yes | | .gitmodules | Yes | Yes | | dapp/.gitmodules | Yes | Yes | | dapp/contracts/lib/openzeppelin | Yes | Yes | | dapp/contracts/lib/openzeppelinUpgradeable | Yes | Yes | | dapp/contracts/script/Counter.s.sol | No | No | | dapp/contracts/script/CounterDeploy.s.sol | Yes | Yes | | dapp/contracts/src/Counter.sol | Yes | Yes | | dapp/contracts/src/CounterOneChange.sol | Yes | Yes | | dapp/contracts/src/uupsProxy.sol | Yes | Yes | | dapp/contracts/test/Counter.t.sol | Yes | Yes | | dapp/foundry.toml | Yes | Yes | | dapp/wagmi.config.ts | Yes | Yes |

Keyrxng, great job on your pull request! The changes you made to the repository seem to align with the provided specification. I have reviewed each file and found that they are logically sound and match the specification. However, there is one file that does not match the specification:

  • dapp/contracts/script/Counter.s.sol: This file was deleted in your pull request, but it is not mentioned in the specification. Please ensure that the deletion of this file is intentional and update the specification if necessary.

Overall, your changes look good. Well done!

Lol this bot is laying on the compliments a bit thick but yes I love how naturally it reads. It's a lot more engaging than a cold report for example.

I think this format is pretty solid. I would still prefer it doesn't list out every change but your solution of hiding it is solid.

Keyrxng commented 12 months ago

Love to hear it!

Ensure the pull request requirements are in the linked issue's first comment and update it if the scope evolves.

This is good as the only context it's being fed now is issue body and pr body. Might it be a good idea to have something like

Contributor Notice: Ensure your description is accurate and describes your intentions clearly. Reviewer Notice: Ensure the pull request requirements are in the linked issue's first comment and update it if the scope evolves.

So you'd prefer to just eliminate the listing all together and keep it very short - short?

0x4007 commented 12 months ago
Ensure the pull request requirements are in the linked issue's first comment and update it if the scope evolves.
Keyrxng commented 11 months ago

/start

ubiquibot[bot] commented 11 months ago

Deadline Thu, 12 Oct 2023 12:17:38 UTC
Registered Wallet 0xAe5D1F192013db889b1e2115A370aB133f359765

Tips:

0x4007 commented 11 months ago

I was thinking it could also be extremely useful to feed the README.MD and CONTRIBUTING.MD as context for the reviewer. In a mature codebase you can see pretty solid guidelines in https://github.com/ubiquity/ubiquity-dollar/blob/development/.github/CONTRIBUTING.md

Keyrxng commented 11 months ago

for the reviewer

So just add these if they exist into the bot comment under the actual GPT response

Should I use

and hide them by default?

0x4007 commented 11 months ago

No, just do an http request to the files.

ubiquibot[bot] commented 10 months ago

Do you have any updates @Keyrxng? If you would like to release the bounty back to the DevPool, please comment /stop Last activity time: Tue Oct 17 2023 08:24:25 GMT+0000 (Coordinated Universal Time)

Keyrxng commented 10 months ago

See below

ubiquibot[bot] commented 10 months ago

Do you have any updates @Keyrxng? If you would like to release the bounty back to the DevPool, please comment /stop Last activity time: Sun Oct 22 2023 11:12:20 GMT+0000 (Coordinated Universal Time)