harshadatta123 / Multilanguage-Document-Extraction-By-Gemini


[Bug] Hallucinations #2

Open JustTheCoolest opened 2 months ago

JustTheCoolest commented 2 months ago

The prompt specifically asks Gemini to read receipts. So when the image we give it is not a receipt, it's ~improvising~ hallucinating.

Possible solution (via multi-step prompt engineering, perhaps using a chat session instead of a single prompt):

1) Ask Gemini to classify the image as either a receipt or not a receipt.

Implement this using function calling.

If the image is not a receipt, use Streamlit to show a GUI-based response instead of a summary.

2) Show a transcript of the text detected in the image. This increases the output length and demonstrates Gemini's capabilities.

3) Give the user the summary, as before.
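A rough sketch of the three steps above, to make the control flow concrete. Everything here is an assumption for illustration: `process_image`, `ask_model`, and the prompt strings are hypothetical names, not the project's actual code. `ask_model` stands in for whatever wrapper ends up calling Gemini, and the classification step uses a plain-text reply rather than function calling to keep the sketch library-free.

```python
from typing import Callable, Dict

# Hypothetical prompts; the wording is an assumption, not the project's real prompt.
CLASSIFY_PROMPT = "Is this image a receipt? Answer with exactly RECEIPT or NOT_RECEIPT."
TRANSCRIBE_PROMPT = "Transcribe all text visible in this image."
SUMMARY_PROMPT = "Summarize this receipt."

# Hypothetical wrapper type: takes (prompt, image bytes) and returns the model's text reply.
AskModel = Callable[[str, bytes], str]


def process_image(image: bytes, ask_model: AskModel) -> Dict[str, str]:
    """Run classify -> transcribe -> summarize, stopping early for non-receipts."""
    # Step 1: classify first, so the receipt prompt never sees a non-receipt image.
    verdict = ask_model(CLASSIFY_PROMPT, image).strip().upper()
    if verdict != "RECEIPT":
        # Anything other than an exact RECEIPT is treated as "not a receipt".
        return {
            "status": "rejected",
            "message": "This image does not look like a receipt.",
        }

    # Step 2: transcript of the detected text.
    transcript = ask_model(TRANSCRIBE_PROMPT, image)

    # Step 3: the summary, as before.
    summary = ask_model(SUMMARY_PROMPT, image)
    return {"status": "ok", "transcript": transcript, "summary": summary}
```

In the Streamlit app, the `"rejected"` branch could be surfaced with something like `st.warning(result["message"])`, and the `"ok"` branch could render the transcript and summary as two separate sections. Treating every unexpected classifier reply as a rejection is a deliberate choice here: it is better to ask for another image than to summarize something that is not a receipt.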

JustTheCoolest commented 1 month ago

[Attached image: WhatsApp Image 2024-07-17 at 22 24 38_80a5a393]

(in collaboration with @harshadatta123 and @ShivaniVinod)