LLM-Powered Issue Suggestion

bbertucc commented 4 months ago

Overview

This bountied issue is for the development of our LLM feature.

We've been discussing an LLM integration for a while. I've even created two ChatGPT GPTs to test out ideas. Now, I'm hoping we can deliver our LLM feature to production.

Feature Ideas

The most useful integration with LLMs is to create GitHub tickets from Equalify content. I've been testing creating issues from a single node source and creating issues from a single message with lots of nodes.

Issue from a Single Equalify Node Data

I imagine we could create a new "Actions" column and a button that says "Suggest Issue" in that column of each node list. Example:

Screenshot of the nodes list with an actions column and button that says Suggest Issue

Here's an example of an issue I created from a single node in Equalify: https://github.com/WordPress/wporg-mu-plugins/issues/622

Here is the conversation I had to create that issue: https://chatgpt.com/share/174d5f0b-0feb-46da-a850-9dc5cb7eb97b

Here is the prompt behind the ChatGPT GPT I created:

The bot analyzes a provided URL, violation message, and code snippet to resolve accessibility issues by considering the entire HTML page. Output text that can be posted to a GitHub issue with the following sections: Description, Current Code, Proposed Fix, Updated Code, Steps to Reproduce, Acceptance Criteria, How critical is this fix?, Affected Pages. Also, suggest a title for the Issue that starts with "Equalify Reports" and add a section "Reviewed by" with the body content "This ticket was automatically generated by Equalify and reviewed by the human, [your-name]."

Issue from a Single Equalify Message with Multiple Nodes

I imagine we could create a button on the top of a Message detail page that generates issue text. Here is a screenshot example:

Screenshot showing a button that says Suggest Issue next to the button that says more info on the Equalify Message detail page

Here's an example of an issue I created from a single node in Equalify: https://github.com/WordPress/wporg-mu-plugins/issues/622

Here is the conversation I had to create that issue: https://chatgpt.com/share/405e54a9-a06d-4c2d-86a3-cf4344bbcb69

Here is the prompt behind the ChatGPT GPT I created:

You are a bot that outputs a GitHub ticket to resolve issues related to a web accessibility violation message the user prompts you with. The user will attach a CSV to their message with rows of nodes. Each row contains nodes' HTML, related message, and URLs that triggered the violation. 

Output text that can be posted to a GitHub issue with the following sections: 
- Description: Brief description of the issue.
- Current Code: A link to the CSV that is uploaded
- Example Fix: Offer an example of fixing one node.
- Steps to Reproduce: Include visiting the URL and testing using a browser's web inspector and the Lighthouse tool
- Acceptance Criteria
- How critical is this fix?: Include how this violation harms the user experience and the number of nodes fixing this issue would fix, which is the number of rows in the spreadsheet. 

Also, suggest a title for the Issue that starts with "Equalify Reports" and add a section "Reviewed by" with the body content "This ticket was automatically generated by Equalify and reviewed by the human, [your-name]."
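
The node count that the prompt asks for is simply the CSV's row count, so it could be computed before the model is called rather than left for the model to count. A minimal sketch, assuming the CSV has a header row; the column names (`html`, `message`, `url`) and the function name are hypothetical:

```python
import csv
import io


def summarize_nodes_csv(csv_text: str) -> dict:
    """Count node rows and collect the distinct URLs in an exported CSV.

    Assumes the first row is a header with the hypothetical columns
    "html", "message", and "url".
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    urls = sorted({row["url"] for row in rows})
    return {"node_count": len(rows), "urls": urls}


sample = (
    "html,message,url\n"
    '"<img src=""a.png"">",Images must have alternate text,https://example.com/\n'
    '"<img src=""b.png"">",Images must have alternate text,https://example.com/about\n'
)
summary = summarize_nodes_csv(sample)
print(summary["node_count"])  # 2
```

The resulting count and URL list could then be passed to the model alongside the CSV, so the "How critical is this fix?" section does not depend on the model's arithmetic.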

Bounty Details

Anyone interested in tackling this issue should submit details on how they would technically achieve these goals, in addition to a budget.
Submissions should have clear deliverables.
50% will be paid upfront and 50% on completion.
We'll discuss the budget and approve/deny this as a team during one of our Monday contributor meetings.
The project can have phased-out components.

bbertucc commented 4 months ago

@heythisischris expressed early interest in this bounty. Pinging him here. We definitely have to factor in our money in the bank and the fact that this is a speculative feature (a feature no one has directly said they'll pay for) when deciding on the bounty.

heythisischris commented 4 months ago

I'm definitely interested in tackling this one. Here's my proposed solution:

Overview

We'll leverage GPT-4o (possibly mini) to create GitHub tickets from individual Equalify issues. This will be exposed both on the frontend and via a developer-facing API endpoint. I'll start with the prompt Blake provided and refine as needed.

The resulting link will be a pre-populated GitHub link with a draft of the given issue (you can click to test it out!):

https://github.com/equalifyeverything/equalify/issues/new?title=test&assignee=&body=This%20is%20a%20new%20issue&labels[]=accessibility
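
A link like the one above can be assembled by URL-encoding the draft into the query string. The following is a minimal sketch, not the actual implementation; the function name is hypothetical. It assumes labels are sent as a single comma-separated `labels` value, which is how GitHub's documentation describes the parameter as far as I know (the example link uses `labels[]` instead):

```python
from urllib.parse import quote, urlencode


def build_new_issue_url(repo_url: str, title: str, body: str, labels: list[str]) -> str:
    """Return a GitHub "new issue" link with the draft pre-filled.

    Nothing is created on GitHub until the user submits the form.
    """
    query = {"title": title, "body": body}
    if labels:
        # Assumption: labels are passed as one comma-separated value.
        query["labels"] = ",".join(labels)
    # quote_via=quote encodes spaces as %20, matching the example link.
    return f"{repo_url.rstrip('/')}/issues/new?{urlencode(query, quote_via=quote)}"


url = build_new_issue_url(
    "https://github.com/EqualifyEverything/equalify",
    "Equalify Reports: textarea is missing a label",
    "## Description\nDraft body",
    ["accessibility"],
)
print(url)
```

Because the whole draft travels in the URL, very long generated bodies may need to be trimmed before the link is built.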

We could even allow developers to link their GitHub account so we can officially publish GitHub Issues on their behalf.

I'd like to take this a step further and incorporate a new "Copilot" tab which allows the user to chat with a specialized Equalify Copilot chat tool which can help developers debug their accessibility fixes, look for insights/trends in their data, set alerts for certain accessibility fixes, etc.

I'll create two new tables to store responses from both of these endpoints, issues and conversations.

I might add a new column, github_token, to the users table to store the GitHub token.
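
To make the storage plan concrete, here is a rough schema sketch. Only the table names (`issues`, `conversations`) and the `github_token` column come from the proposal; every other column is a guess, and SQLite is used only so the sketch runs anywhere:

```python
import sqlite3

# Hypothetical schema: only the table names and users.github_token come from
# the proposal; all other columns are illustrative assumptions.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    github_token TEXT              -- NULL until the user links GitHub
);
CREATE TABLE issues (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    request_json TEXT NOT NULL,    -- body sent to the create-ticket endpoint
    issue_url TEXT NOT NULL        -- pre-populated GitHub link returned
);
CREATE TABLE conversations (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    messages_json TEXT NOT NULL    -- full message list, including the reply
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)  # ['conversations', 'issues', 'users']
```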

I'll also apply for $25,000 of Azure credit via the Microsoft for Startups program which will effectively give us $25,000 of OpenAI credit using their Azure OpenAI service (1 year expiry).

Steps

[ ] Create POST /help/create-ticket endpoint

Request body:

{
    "codeSnippet": "<textarea name=\"input_4\" id=\"input_7_4\" class=\"textarea large\" aria-describedby=\"gfield_description_7_4\" aria-required=\"true\" aria-invalid=\"false\" rows=\"10\" cols=\"50\"></textarea>",
    "url": "https://equalify.app/",
    "repository": "https://github.com/EqualifyEverything/equalify",
    "author": "heythisischris",
    "labels": ["accessibility"],
    "template": "## Overview\nThis is the overview\n\n## Tasks\nThese are the tasks required to fix the issue\n\n## Timeline\nThis issue will be fixed in 1 week."
}

Response body:

{
    "url": "https://github.com/equalifyeverything/equalify/issues/new?title=test&assignee=&body=This%20is%20a%20new%20issue&labels[]=accessibility"
}

Responses will be stored in the issues table.
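
Before any model call, the endpoint would need to check the incoming body. A minimal validation sketch under stated assumptions: the field names mirror the example request, but which fields are mandatory is a guess, and the class and function names are hypothetical:

```python
import json
from dataclasses import dataclass, field

# Assumption: these three keys are mandatory; the rest are optional.
REQUIRED_FIELDS = ("codeSnippet", "url", "repository")


@dataclass
class CreateTicketRequest:
    codeSnippet: str
    url: str
    repository: str
    author: str = ""
    labels: list[str] = field(default_factory=list)
    template: str = ""


def parse_create_ticket_request(raw: str) -> CreateTicketRequest:
    """Parse a JSON body; raise ValueError if a required field is missing.

    Unknown keys raise TypeError from the dataclass constructor.
    """
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if not data.get(name)]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    return CreateTicketRequest(**data)


req = parse_create_ticket_request(
    json.dumps(
        {
            "codeSnippet": "<textarea></textarea>",
            "url": "https://equalify.app/",
            "repository": "https://github.com/EqualifyEverything/equalify",
            "labels": ["accessibility"],
        }
    )
)
print(req.labels)  # ['accessibility']
```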

[ ] Create POST /help/copilot endpoint

Request body:

[
    { "user": "What is the most common accessibility issue identified?" }
]

Response body:

[
    { "user": "What is the most common accessibility issue identified?" },
    { "system": "The most common issue I found was \"Element's background color could not be determined because it's partially obscured by another element\"" }
]

Responses will be stored in the conversations table.
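
The request/response shape above amounts to "send the history, get the history back with one reply appended". A small sketch of that contract, where `generate_reply` is a hypothetical stand-in for the real LLM call and the message keys follow the example:

```python
from typing import Callable

Message = dict[str, str]  # one key per message: "user" or "system", as in the example


def continue_conversation(
    messages: list[Message], generate_reply: Callable[[list[Message]], str]
) -> list[Message]:
    """Return the conversation with the model's reply appended.

    generate_reply is a hypothetical stand-in for the real LLM call.
    """
    if not messages or "user" not in messages[-1]:
        raise ValueError("the last message must come from the user")
    reply = generate_reply(messages)
    return [*messages, {"system": reply}]


def canned_reply(history: list[Message]) -> str:
    # Placeholder; a real implementation would call the LLM here.
    return f"Received {len(history)} message(s)."


result = continue_conversation(
    [{"user": "What is the most common accessibility issue identified?"}],
    canned_reply,
)
print(result[-1])  # {'system': 'Received 1 message(s).'}
```

Note that OpenAI's chat format uses the `assistant` role for model replies and reserves `system` for instructions, so a mapping step between the two shapes may be needed.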

Budget

The requested bounty is $2,500 for roughly 40 hours of development spread across 2 weeks.

This comes out to an effective rate of $62.50/hour.

Timeline

I intend to have this completed and ready by August 5th, 2024.

bbertucc commented 4 months ago

Thanks @heythisischris! We'll probably discuss this on our July 29 contributor call. I want to make sure we focus on getting Version 1 up before we start adding features.

kevinandrews1 commented 4 months ago

@heythisischris I read this on my phone, which is probably why the code looks gnarly listening to it with VoiceOver. Lol. Where does accessibility fall into this? I would love to discuss with you before too long.

bbertucc commented 3 months ago

@heythisischris what is the date of delivery for this? I'll create a milestone and add reminders for myself to check in on it. We can organize a mini release around this feature.

BaronWolfenstein commented 3 months ago

I actually have an NVIDIA developer account, so let me talk with Kaju and see if we're interested in putting a bid in on this

heythisischris commented 3 months ago

> @heythisischris what is the date of delivery for this? I'll create a milestone and add reminders for myself to check in on it. We can organize a mini release around this feature.

@bbertucc Normally I would say the delivery date is pushed back a week, but I think I can stretch and get this done by Friday, August 9th, 2024 (if not sooner). The goal would be to create a simple and pragmatic solution using conventional LLM services and a basic web UI to accompany it. And @kevinandrews1, we could even carve out some space for building an accessible AI chatbot interface.

But I am curious about @BaronWolfenstein's potential proposal. I'm assuming it would involve training/using a custom model.

BaronWolfenstein commented 3 months ago

I'm gonna go to Scale tomorrow and work on putting a proposal up. I'm happy to talk with you more about it @heythisischris, I did send Blake some details on what constitutes the NVIDIA program.

bbertucc commented 3 months ago

@heythisischris that's an amazingly ambitious goal. Love it. I also want this to be a feature we announce on its own. So could we roll it out next week instead?

This week we're already announcing the launch of the app. Next week, we can announce the LLM. Say, next Wednesday at 11:11 CST?

I'll formally bless this feature and push the cash deposit as soon as these are done:

#387
#383
#375
#348

...which is basically to say #335 is done.

Very curious to hear from @BaronWolfenstein on his proposal too. There's always room to expand on LLM features. @BaronWolfenstein please share any documents here. I'm a man of Open Source and quickly forget anything DMed to me.

bbertucc commented 3 months ago

@heythisischris I went ahead and assigned this to you, adding a milestone for delivery next week.

bbertucc commented 3 months ago

@BaronWolfenstein if you don't mind, add your proposal to a new issue here: #391

bbertucc commented 3 months ago

This is approved for delivery September 2.

bbertucc commented 2 months ago

@heythisischris I initiated the initial 50% payment on this one.

bbertucc commented 1 month ago

@heythisischris said on our call today that he'll have this by October 4.

kevinandrews1 commented 1 month ago

@heythisischris Can we discuss this week? Would love to hear how a11y factors in on the front end and how I can support the feature's rollout. Feel free to grab a time https://calendly.com/kevinjandrews

heythisischris commented 1 week ago

@kevinandrews1 I went ahead and booked some time for tomorrow to showcase and discuss the Equalify LLM implementation so far. Definitely want your feedback and to wrap this up as soon as possible!

heythisischris commented 6 days ago

@kevinandrews1 Here are some quick instructions for testing out the current LLM feature:

Visit https://equalify.dev.
Log into your account.
Go to Reports.
Click on any Report.
Click on any Message in the Messages table.
Click on the Suggest Issue button in the last column of any row.
Confirm you've seen the modal and wait 1 minute for a result to generate.
Confirm that you can read the result.

kevinandrews1 commented 4 days ago

@bbertucc If I go to the Equalify report under your account I get a 404. Out of scope for this but I noticed it and wanted to flag.

@heythisischris, please see the video I created for context:

After pressing Suggest Issue the experience is really verbose because JAWS reads out the page title. The page title is what appears to be the entire Axe rule. That is crazy.
The contents of the modal get read out. I recommend setting programmatic focus on the h1 and the title of the modal should reflect this. Right now the page title is the Axe rule.
- The h2 you can style however you like but I don't particularly see the purpose of making that a heading.
I would move the close button to be immediately after the h2 and inject the AI response after the close button. Currently the user has to go backward (up arrow) through the reading order which is unintuitive.
Currently, nothing tells screen reader users that the response has finished generating. I recommend giving the "eta 1 minute..." text the status role; when the response is generated, the text could change to say so, also with the status role.
Dismissing the modal should return focus to the trigger, in this case the Suggest Issue button.
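
The reading order and status announcements recommended above could look roughly like the markup this sketch produces. It is an illustration only: the function name, element ids, class names, and status wording are all hypothetical, and focus handling (moving focus to the h1 on open, returning it to the Suggest Issue button on close) would still need client-side script:

```python
from html import escape
from typing import Optional


def render_suggest_issue_modal(response: Optional[str] = None) -> str:
    """Render modal markup in the recommended reading order.

    Order: heading (focus target), close button, status message, AI response.
    """
    done = response is not None
    status = "Response generated." if done else "Generating response, ETA 1 minute..."
    body = f'<div class="response">{escape(response)}</div>' if done else ""
    return (
        '<div role="dialog" aria-modal="true" aria-labelledby="suggest-issue-title">'
        # tabindex="-1" lets a script move focus to the heading when the modal opens.
        '<h1 id="suggest-issue-title" tabindex="-1">Suggest Issue</h1>'
        '<button type="button">Close</button>'
        # role="status" is a polite live region, so screen readers announce updates.
        f'<p role="status">{status}</p>'
        f"{body}"
        "</div>"
    )


print(render_suggest_issue_modal())
print(render_suggest_issue_modal("## Description\nMissing label on <textarea>"))
```

In the browser, the status element should stay mounted with only its text changing; live regions generally announce changes to existing content, not elements that are swapped in fresh.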
