Open JustTheCoolest opened 2 months ago
The prompt specifically asks the model to read receipts, so when we give it an image that isn't a receipt, it's ~improvising~ hallucinating.
Possible solution (via multi-step prompt engineering, perhaps using a chat session instead of a single prompt):
1) Ask Gemini to classify the image as a receipt or not a receipt.
Implement this using function calling.
If it's not a receipt, use Streamlit to show a GUI-based response instead of a summary.
2) Show a transcript of the text detected in the image, both to increase the output length and to demonstrate Gemini's capabilities.
3) Give the user the summary as before.
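
A rough sketch of how the three steps above could be chained, in case it helps the discussion. Everything here is hypothetical: `process_image`, `ReceiptResult`, and the three callables (`classify`, `transcribe`, `summarize`) are placeholder names, not part of the current code or of the Gemini SDK. The callables stand in for whatever Gemini calls we end up using (for example, a function-calling request for the classification step), so the control flow can be tried out without an API key.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReceiptResult:
    """Outcome of the multi-step flow (hypothetical structure)."""
    is_receipt: bool
    transcript: Optional[str] = None
    summary: Optional[str] = None
    message: Optional[str] = None


def process_image(
    image: bytes,
    classify: Callable[[bytes], bool],
    transcribe: Callable[[bytes], str],
    summarize: Callable[[bytes, str], str],
) -> ReceiptResult:
    """Run classify -> transcribe -> summarize, stopping early for non-receipts.

    The three callables are assumptions: each would wrap one model request.
    """
    # Step 1: gate on the classification so the model is never asked
    # to summarize something that is not a receipt.
    if not classify(image):
        return ReceiptResult(
            is_receipt=False,
            message="This image does not look like a receipt. Please upload a receipt.",
        )

    # Step 2: transcript of the text detected in the image.
    transcript = transcribe(image)

    # Step 3: the same summary as before, now only for confirmed receipts.
    summary = summarize(image, transcript)
    return ReceiptResult(is_receipt=True, transcript=transcript, summary=summary)
```

On the Streamlit side, the `message` field could be passed to something like `st.warning`, and the transcript and summary shown with `st.text` or `st.write`. Keeping the model calls behind plain callables also means the early-exit branch can be unit-tested with stubs.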
(in collab with @harshadatta123 and @ShivaniVinod)