1712n / challenge

Challenge Program

Fact-checking GitHub Action #96

Closed by evgenydmitriev 1 year ago

evgenydmitriev commented 1 year ago

This challenge is about building a fact-checking solution for a wiki that covers historical attacks on distributed networks, along with summary articles on general blockchain security concepts. To participate, submit a pull request with a GitHub Action that fact-checks new additions to the content directory and comments on the corresponding pull requests from content contributors, listing the lines that appear to be made up. Feel free to throw anything you want at it: ChatGPT, Bard, custom LLMs, etc. Keep in mind, however, that submissions will be evaluated on a combination of cost efficiency and EER.
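As a rough illustration of the first step such an Action needs, the sketch below pulls the newly added lines under the content directory out of a pull request diff, together with their line numbers. It assumes the diff is available as unified `git diff` text; the function name `added_lines` and the `content/` prefix are illustrative assumptions, not part of the challenge specification.

```python
import re
from typing import List, Tuple

# Matches a unified-diff hunk header and captures the first new-file line number.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(diff_text: str, prefix: str = "content/") -> List[Tuple[str, int, str]]:
    """Return (path, new_line_number, text) for non-blank lines added under `prefix`.

    Assumes `git diff` style input, where each file starts with a "diff --git" line.
    """
    results: List[Tuple[str, int, str]] = []
    path = None          # path of the file currently being read, if it still exists
    new_lineno = 0       # line number in the new version of the file
    in_hunk = False
    for raw in diff_text.splitlines():
        hunk = HUNK_RE.match(raw)
        if raw.startswith("diff --git "):
            path, in_hunk = None, False
        elif hunk:
            new_lineno, in_hunk = int(hunk.group(1)), True
        elif in_hunk:
            if raw.startswith("+"):
                text = raw[1:]
                if path is not None and path.startswith(prefix) and text.strip():
                    results.append((path, new_lineno, text))
                new_lineno += 1
            elif not raw.startswith(("-", "\\")):
                new_lineno += 1  # context line: present in both versions
        elif raw.startswith("+++ "):
            target = raw[4:].strip()
            # "+++ /dev/null" marks a deleted file, which has no added lines.
            path = target[2:] if target.startswith("b/") else None
    return results
```

Each returned tuple carries enough information to fact-check the text and to point the contributor at the exact file and line in the pull request comment.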

When you are ready to submit your pull request, request a review from this issue's assignee. Expanding the pull request description with your methodology can help us better understand your reasoning and evaluate your submission faster. To make sure your submission doesn't get lost, you can also email your pull request link, along with your resume and a link to this challenge, to challenge-submission@blockshop.org. Don't hesitate to ask us questions by commenting on this issue.

orzhan commented 1 year ago

@Lavriz Please review my solution. The main script is fact_check.py. It uses OpenAI and DuckDuckGo to fact-check the claims in the diff of every pull request. The GitHub Action setup requires two repository secrets: a GitHub token to post the comment and an OpenAI token.
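The overall flow described here (search for evidence, ask a model for a verdict, post a comment) can be sketched roughly as follows. This is not the actual fact_check.py: `search` and `judge` are hypothetical callables standing in for the DuckDuckGo lookup and the OpenAI call, and all function names are assumptions. The only real interface used is the GitHub REST endpoint for creating a comment on an issue or pull request; the request is built but not sent.

```python
import json
import urllib.request
from typing import Callable, Iterable, List, Tuple

Claim = Tuple[str, int, str]  # (file path, line number, claim text)


def find_unsupported(
    claims: Iterable[Claim],
    search: Callable[[str], List[str]],       # hypothetical: claim -> evidence snippets
    judge: Callable[[str, List[str]], bool],  # hypothetical: True if evidence supports claim
) -> List[Claim]:
    """Return the claims that the judge does not consider supported."""
    flagged = []
    for path, lineno, text in claims:
        evidence = search(text)
        if not judge(text, evidence):
            flagged.append((path, lineno, text))
    return flagged


def format_comment(flagged: List[Claim]) -> str:
    """Render the flagged claims as a Markdown comment body."""
    if not flagged:
        return "No unsupported claims found."
    lines = ["The following lines could not be verified:"]
    lines += [f"- `{path}` line {lineno}: {text}" for path, lineno, text in flagged]
    return "\n".join(lines)


def build_comment_request(repo: str, pr_number: int, token: str, body: str) -> urllib.request.Request:
    """Build (but do not send) the request that posts a comment on a pull request.

    `repo` is "owner/name"; `token` would come from a repository secret.
    """
    url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"
    payload = json.dumps({"body": body}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")
```

Passing the search and model calls in as parameters keeps the two secrets out of the core logic and lets the flow be tested with stubs; sending the request would be a single `urllib.request.urlopen(request)` call inside the Action.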

For example, see the following pull request: https://github.com/orzhan/test-actions/pull/19 where two false claims are identified.

I couldn't create a pull request directly in the dni-website repository because it doesn't allow creating branches.

Lavriz commented 1 year ago

@orzhan congratulations on solving the challenge!🎉 The result is great! :)

marina-chibizova commented 1 year ago

@orzhan would you recommend using GPT-4 in prod? Or is it overkill for a simple fact-check, given GPT-4's stricter rate limits?

orzhan commented 1 year ago

@marina-chibizova GPT-4 may seem like overkill for this task, and it is indeed slower and more expensive than gpt-3.5-turbo. However, if we later find that gpt-3.5-turbo is making mistakes, it would make sense to consider GPT-4 as an alternative.
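One possible way to check whether the cheaper model is making mistakes is to run it on a small hand-labeled set of claims and count its errors. The sketch below is an assumption about how that could be done, not something described in this thread; the function name `error_rates` is hypothetical, and `True` is taken to mean "this claim is made up".

```python
from typing import List, Tuple


def error_rates(predicted: List[bool], actual: List[bool]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for a set of verdicts.

    True means "claim is made up". A false positive flags a genuine claim;
    a false negative misses a made-up one.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    false_pos = sum(p and not a for p, a in zip(predicted, actual))
    false_neg = sum(a and not p for p, a in zip(predicted, actual))
    genuine = sum(not a for a in actual)
    made_up = sum(actual)
    fpr = false_pos / genuine if genuine else 0.0
    fnr = false_neg / made_up if made_up else 0.0
    return fpr, fnr
```

Running the same labeled set through both models gives a like-for-like comparison to weigh against the difference in cost and speed.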