Open 0x4007 opened 1 week ago
/start
! Please set your wallet address with the /wallet command first and try again.
! No wallet address found
/wallet 0x3Ea855E4D6440D937117c776501e7653a770b759
+ Successfully registered wallet address
/start
Deadline | Wed, Oct 2, 9:41 PM UTC |
Beneficiary | 0x3Ea855E4D6440D937117c776501e7653a770b759 |
<ul>
<li>Use <code>/wallet 0x0000...0000</code> if you want to update your registered payment wallet address.</li>
<li>Be sure to open a draft pull request as soon as possible to communicate updates on your progress.</li>
<li>Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.</li>
<ul>
@Keyrxng should I implement this feature in a new file? and also what should happen if the pull is not initially opened as a draft?
@0x4007 I had gpt re-rewrite for brevity but it removed what I'd call context but this still makes sense.
Discussion:
Spec consideration > Embedding effectiveness > Bot intent & current model
https://chatgpt.com/share/66ec086f-7d70-8000-a914-991b77a819b0
Conclusion:
The current model (embedding-based) struggles to provide detailed, code-specific feedback. While embeddings are useful for high-level semantic comparisons, they lack the precision needed for effective PR reviews. You'll need additional tools to meet your goal.
Task Understanding:
yarn.lock
)This is very similar to /review
, and separately, /gpt
currently has the ability to process full PR diffs, spec, and conversation per query across multiple issues/repos. However, embeddings aren't ideal for this. Raw text analysis (task spec + diff) is more effective since we need precise feedback, not high-level similarity or semantic meaning.
The plugin's effectiveness must be high to avoid becoming annoying or useless. Therefore, it makes sense to re-spec or close this idea since embeddings aren’t suitable.
To be clear, the intention of this spec was to automate the initial review process and is feasible but better done without embeddings in my opinion. If the /review
approach is being replaced by this then I think the following workflow works best:
Suggested Workflow:
pull_request.created
(can run a review here or not and on whether it's opened as a draft or ready_for_review)pull_request.ready_for_review
.commented
status otherwise.Additionally the task is way overpriced, https://github.com/ubiquity/ubiquibot/issues/746 is the /review
command I authored and was priced at $150 and the /gpt
command is priced at $200 lmaoo, and I spent days/weeks on these. $400 max, we are simply feeding two bodies of text to GPT with a custom prompt to have it perform a review with a couple of conditionals afterwards to add a comment and update a review status.
should I implement this feature in a new file? and also what should happen if the pull is not initially opened as a draft?
@ubadineke I'm unsure personally at this point as I did not write this task specification. I have left comments and once @0x4007 replies I'll have a much better idea of how this task should be implemented.
pull_request.created
which will allow it through or kick it back into draft just like 0x4007 said. Future reviews would be tied to pull_request.ready_for_review
distanced by at least 24hrs between each using the timeline or review comments.@Keyrxng @0x4007 What if the PR is created and the related issue is not immediately mentioned in the body. We just move it back to draft right and tell the user to edit the PR and specify issue?
The existing event payload style doesn't have a field for PR Number, maybe it was just designed for comments. Can modifications be made? If the issue must be specified from the payload then the first paragraph in my comment is redundant.
I guess that's a good idea as we like to strongly enforce the linked issue although it's not a guarantee each PR has or needs this such as quick fixes for example.
I'd say don't do anything, our pull request template enforces this format and legit cases exist where it is not used.
The existing event payload style doesn't have a field for PR Number, maybe it was just designed for comments. Can modifications be made? If the issue must be specified from the payload then the first paragraph in my comment is redundant.
I'm not sure I understand but issues and PRs are both considered as an issue
by the GitHub api, there is no PR number
only issue.number
. You can get linked issues via the GraphQL API most reliably.
If a PR does not have an associated task specification then without having codebase embeddings to compare the diff against we don't have a lot of context other than what they have changed via the diff. Which can still be reviewed by GPT on it's own for logic errors and optimizations it'll just be more general review without the guard rail of the spec.
I think it would be more appropriate for you to extend your /gpt command logic. You can handle it in a separate pull. Over on your old pull I said to merge and let's test in production. From there you can extend the logic as part of this task.
This task makes it more seamlessly integrated (but looks like your previous work set a lot of relevant ground work)
This requires more deeper integration into GitHub review workflow such as by setting review states and switching between draft and non draft etc.
I have extra time because the prompt engineering alone I'm sure will take a couple days to get the results we want.
Yes let's keep this plaintext no embeddings needed.
@Keyrxng the deadline is at Fri, Oct 4, 2:47 AM UTC
/start
! This issue is already assigned. Please choose another unassigned task.
Hey @0x4007,
I noticed Keyrxng might be busy, so I went ahead and took on the task. I’ve put together an initial version of the implementation, which you can check out here.
Right now, it only triggers on pull_request.created, and I still need to add the logic for pull_request.ready_for_review. I also extended the GPT logic like you mentioned before.
I’d really like to see this through to the end! If you could create a repo for this plugin, I can make a PR and keep working on it and then we can make adjustments to the prompting.
Thanks!
@ubadineke, @Keyrxng the deadline is at Sat, Sep 28, 8:00 AM UTC
@Keyrxng you should be able to make the repo I'm not on computer
I think it would be more appropriate for you to extend your /gpt command logic
If that's the case then no new repo is required.
I noticed Keyrxng might be busy, so I went ahead and took on the task.
Not typically how things go around here, if you are unassigned from a task you shouldn't technically be allowed to work it again.
I’ve put together an initial version of the implementation, which you can check out here.
Your implementation seems to be more or less a copy/paste of my super early /review
command which if we are extending /gpt
which seems like a better idea as it has much more reach with context although it will require guidance with context in terms of review etc.
I think it's best that I resolve this task solo today by refactoring/upgrading /gpt
, apologizes @ubadineke I appreciate your enthusiasm tho.
@0x4007 you said it should reply to @ubiquityos
instead of using the /gpt
or any kind of slash command invocation?
Does that mean that you would invoke a review or review should be automate as discussed in this spec? My /gpt
QA showed me asking for it to perform a review but instead this new upgraded functionality should be as per this spec, reviews built-in and triggered on events.
Ideally we keep within /gpt
but have it now respond to a direct mention. We keep it within because of the context handling that it has, we can easily customize it to allow only x-depth of referenced context through instead of the unlimited depth that /gpt
was made to have. As opposed to rebuilding another custom context collection system etc.
I think unlimited depth is interesting to experiment with. I always side on providing more context to llms
Let's not use that slash command and instead replace it to respond to its @UbiquityOS tag
reviews built-in and triggered on events.
Sounds good
We have waves of contributors that open pulls sometimes keeping our review team busy.
We can save valuable man hours by having the bot preemptively check if the pull achieves the specification.
Pull Flow
Bonus but maybe can be handled in other tasks:
We should use o1 but if it's not available in the API, either 4o or 3.5 sonnet should do.