cofacts / rumors-api

GraphQL API server for clients like rumors-site and rumors-line-bot
https://api.cofacts.tw
MIT License

feat(graphql): remove apparent hallucination #323

Closed MrOrz closed 1 year ago

MrOrz commented 1 year ago

In real speech, even if a speaker repeats the same phrase several times, the speech recognition probability (confidence) should not be identical for each repetition.

For hallucinated text, however, both the text and the probability sometimes repeat exactly. Example results: https://docs.google.com/spreadsheets/d/10xfkOZpGJ-9vIvoYziEkD1lZETWMbBLDT-NABdQ8H_g/edit#gid=0&range=32:34

By removing segments that share the same text and probability, we can reduce hallucination by around 50%.
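The deduplication idea can be sketched as follows. This is a minimal Python illustration, not the code in this PR: the `Segment` shape and the `dedupe_segments` helper are hypothetical, it assumes the "probability" is the per-segment `avg_logprob` value from Whisper's verbose output, and it assumes the first occurrence of each repeated segment is kept.

```python
from typing import List, Set, Tuple, TypedDict


class Segment(TypedDict):
    # Hypothetical shape: only the two fields the heuristic needs.
    text: str
    avg_logprob: float  # assumed to be the "probability" being compared


def dedupe_segments(segments: List[Segment]) -> List[Segment]:
    """Keep the first segment for each (text, probability) pair, drop later copies."""
    seen: Set[Tuple[str, float]] = set()
    kept: List[Segment] = []
    for seg in segments:
        # Exact float equality is intentional: real speech is unlikely to
        # yield bit-identical confidence twice, while hallucinated loops do.
        key = (seg["text"], seg["avg_logprob"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(seg)
    return kept
```

With this sketch, three segments that all read "謝謝收看" with log probabilities -0.25, -0.25 and -0.31 reduce to two: the exact duplicate is dropped, while the third is kept because its probability differs, as it would for a genuinely repeated phrase.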

We also correct the Whisper prompt. Unlike ChatGPT, Whisper is not instruction-tuned, so putting commands inside its prompt is meaningless. Therefore, we include only the bare minimum text needed to steer the transcript toward Taiwanese Mandarin (e.g. 網際網路 instead of 互联网, 影片 instead of 视频) and full-width punctuation.
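To illustrate the prompt change, here is a small Python sketch that builds request parameters with a prompt written as ordinary preceding text rather than as instructions. The prompt string and the `build_transcription_params` helper are hypothetical and are not the prompt used in this PR; the keys `model`, `prompt`, `language` and `response_format` are assumed to match the parameters of OpenAI's audio transcription endpoint.

```python
from typing import Dict

# Hypothetical prompt: plain Taiwanese Mandarin text with full-width
# punctuation, containing no commands for the model to "follow".
STYLE_PROMPT = "大家好，歡迎收看這支影片。今天要聊的是網際網路上流傳的訊息。"


def build_transcription_params(prompt: str = STYLE_PROMPT) -> Dict[str, str]:
    """Return keyword arguments for a transcription request (nothing is sent)."""
    return {
        "model": "whisper-1",
        "language": "zh",
        "prompt": prompt,
        # Assumed: verbose output is needed to get per-segment probabilities.
        "response_format": "verbose_json",
    }
```

The design choice is that the prompt only demonstrates vocabulary and punctuation style; Whisper treats it as text that came before the audio, so the transcript tends to continue in the same style.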

Examples

Before

image

After

Some hallucinated text is deduplicated, but some still remains.

image

Before

image

After

Most hallucinated text is deduplicated, but the sentence still repeats once.

image

Known issue

This does not help for videos without any voice, because their hallucinated text does not repeat itself.

image