Open markovial opened 11 months ago
Each article has a confidence column, the idea being that editors can set it to a value between 0 and 1. The chatbot could then either use only items above some confidence threshold (e.g. > 0.8), or low-confidence items could simply be removed from the dataset. Of course, that requires a decent interface for manipulating the data...
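A rough sketch of the threshold idea, in case it helps the discussion. It assumes each article is a dict with an optional `confidence` key; the function name, the dict layout, and the choice to keep unrated articles by default are all illustrative assumptions, not how the dataset is actually stored.

```python
from typing import Any, Dict, List

Article = Dict[str, Any]  # hypothetical record shape, e.g. {"title": ..., "confidence": 0.9}


def filter_by_confidence(
    articles: List[Article],
    threshold: float = 0.8,
    default_confidence: float = 1.0,
) -> List[Article]:
    """Keep articles whose confidence is strictly greater than `threshold`.

    Articles with no confidence set fall back to `default_confidence`.
    Defaulting to 1.0 (keep unrated items) is an assumption; pass 0.0
    to drop anything an editor has not rated yet.
    """
    kept = []
    for article in articles:
        confidence = article.get("confidence")
        if confidence is None:
            confidence = default_confidence
        if confidence > threshold:
            kept.append(article)
    return kept
```

The same function covers both options: call it at query time so the chatbot only sees high-confidence items, or run it once and write the result back to prune the dataset.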
While going through the GPT-3.5 vs. GPT-4 comparison, the editors found that certain sources should simply not be in the dataset. This would be somewhat mitigated by filtering out everything that does not have an AI or AI safety tag.
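The tag filter could look something like the following. This is only a sketch: the `tags` key, the exact tag strings in `ALLOWED_TAGS`, and the case-insensitive matching are assumptions made for illustration.

```python
from typing import Any, Dict, FrozenSet, List

Article = Dict[str, Any]  # hypothetical record shape, e.g. {"title": ..., "tags": ["AI"]}

# Assumed tag names; the real dataset may spell these differently.
ALLOWED_TAGS: FrozenSet[str] = frozenset({"ai", "ai safety"})


def has_allowed_tag(article: Article, allowed: FrozenSet[str] = ALLOWED_TAGS) -> bool:
    """Return True if the article carries at least one allowed tag."""
    tags = {str(tag).strip().lower() for tag in article.get("tags") or []}
    return bool(tags & allowed)


def filter_by_tags(
    articles: List[Article], allowed: FrozenSet[str] = ALLOWED_TAGS
) -> List[Article]:
    """Drop every article that has none of the allowed tags."""
    return [article for article in articles if has_allowed_tag(article, allowed)]
```

Note that untagged articles are dropped here, which may or may not be what the editors want.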
It is still worth thinking through a long-term process that gives certain trusted people an automated way to either blacklist sources or add new ones.