progaem / a4iFfki

Achievements Telegram bot
MIT License

Confirmation step logic for the stickers matching algorithm #6

Open 6b656b opened 3 months ago

6b656b commented 3 months ago

The current logic for matching stickers by prompt is very simple: the prompts must be identical.
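For reference, exact matching boils down to something like the sketch below. This is not the bot's actual code: `sticker_cache`, `find_sticker`, and the sticker IDs are hypothetical names used only for illustration.

```python
from typing import Dict, Optional

# Hypothetical in-memory cache: prompt text -> sticker id.
sticker_cache: Dict[str, str] = {}


def find_sticker(prompt: str) -> Optional[str]:
    """Return a cached sticker only if the prompt matches character for character."""
    return sticker_cache.get(prompt)


sticker_cache["Finished first marathon"] = "sticker_001"

same = find_sticker("Finished first marathon")        # exact match is found
reworded = find_sticker("Ran a marathon for the first time")  # same meaning, no match
```

The second lookup is the case this issue is about: the meaning is the same, but the strings differ, so nothing is reused.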

We discussed that it should be more sophisticated and support cases where achievements are very similar and have the same underlying meaning but are described in different words in the prompts. We can implement such logic with NLP models.

But I also think that any model (even a very sophisticated one like GPT-4) will sometimes make matching mistakes because human interactions are complex and multi-layered (memes, local jokes, etc.).

This means that the scenario for reusing existing stickers must have a confirmation step. In this issue I want to discuss the logic of that confirmation step.

How important do you think this issue really is? To what extent can a confirmation step complicate the flow of issuing a sticker? Is it worth it? Do you have any design ideas for this solution?
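As a starting point for the discussion, here is a minimal sketch of where the confirmation could sit in the flow. Everything in it is an assumption: `Match`, `issue_sticker`, and the `confirm` and `generate` callbacks are hypothetical, and in the real bot `confirm` would presumably be a question sent to the user rather than a plain function call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Match:
    """A previously issued sticker that the matcher considers similar."""
    sticker_id: str
    prompt: str


def issue_sticker(
    new_prompt: str,
    match: Optional[Match],
    confirm: Callable[[Match], bool],
    generate: Callable[[str], str],
) -> str:
    """Reuse the matched sticker only if the user confirms it; otherwise generate a new one."""
    if match is not None and confirm(match):
        return match.sticker_id
    return generate(new_prompt)


# Usage with stand-in callbacks (hypothetical).
existing = Match(sticker_id="sticker_001", prompt="Finished first marathon")
accepted = issue_sticker("Ran my first marathon", existing, lambda m: True, lambda p: "new_sticker")
rejected = issue_sticker("Ran my first marathon", existing, lambda m: False, lambda p: "new_sticker")
no_match = issue_sticker("Ran my first marathon", None, lambda m: True, lambda p: "new_sticker")
```

With this shape the extra step only appears when the matcher actually proposes an existing sticker, so the flow for brand-new achievements stays unchanged.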

lll-phill-lll commented 3 months ago

Thank you for raising this issue. The current logic for matching stickers relies on exact prompt matching, which, while straightforward, doesn't accommodate variations in language that convey similar meanings. I agree that we need a more nuanced approach and have considered integrating NLP models to handle these variations.

Leveraging NLP models could indeed enhance the system's ability to match similar achievements described differently. However, the inherent complexity of human language poses a significant challenge. Even sophisticated models like GPT-4 can occasionally misinterpret or overlook subtle nuances due to the multifaceted nature of human communication, including cultural references, idioms, memes, and local jokes.

Given this complexity, the introduction of a confirmation step seems necessary. This would allow users to confirm whether the system's suggested sticker truly matches their intended prompt, potentially reducing errors and increasing user satisfaction.

However, we must carefully evaluate the implications of this confirmation step. It could introduce additional steps in the flow, potentially complicating the user experience. Therefore, we need to consider the trade-offs between enhancing matching flexibility and maintaining a streamlined, user-friendly process.

I believe further discussion is warranted on how critical this issue is, the potential impact of a confirmation step on the user flow, and any design ideas for implementing such a solution effectively. Your insights and feedback are invaluable as we aim to refine this feature.

lll-phill-lll commented 3 months ago

Joking aside, I agree that we need to make sticker caching smarter.

Since our service isn't heavily loaded and the sentences aren't very long, we can safely use some kind of preprocessing for the sentences.

We can turn input sentences into embeddings and use them to evaluate the similarity between two sentences. But in this case, I'm not quite sure how to store these sentences in the database. I don't think it's a "cool production way" to compare a new embedding with every old one.
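To make the "compare with every old one" approach concrete, here is a rough sketch. The `embed` function is a toy stand-in (plain word counts) for whatever embedding model we would actually pick, and all names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def embed(sentence: str) -> Counter:
    """Toy stand-in for a real sentence-embedding model: lowercase word counts."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_match(prompt: str, stored: List[str]) -> Tuple[Optional[str], float]:
    """Linear scan: score the new prompt against every stored prompt."""
    query = embed(prompt)
    best, best_score = None, 0.0
    for old in stored:
        score = cosine(query, embed(old))
        if score > best_score:
            best, best_score = old, score
    return best, best_score


stored_prompts = ["finished first marathon", "ate ten pizzas in a row"]
match, score = best_match("First marathon finished", stored_prompts)
```

The scan costs one comparison per stored prompt, which is probably acceptable at our load but is exactly the part that does not feel like a production solution.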

We need to be able to get the same representation for all the "close" sentences.
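One family of techniques that aims at this is locality-sensitive hashing. Below is a minimal sketch of the random-hyperplane variant, assuming we already have dense embedding vectors: each vector is reduced to a short bit signature, and vectors pointing in similar directions are likely, though not guaranteed, to get the same signature, which can then serve as a database key. The names, the 3-dimensional vectors, and the 8-bit signature size are all made up for illustration.

```python
import random
from typing import Dict, List, Sequence, Tuple

Signature = Tuple[int, ...]


def make_planes(dim: int, bits: int, seed: int = 0) -> List[List[float]]:
    """Generate `bits` random hyperplanes; a fixed seed keeps signatures stable across runs."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def signature(vec: Sequence[float], planes: List[List[float]]) -> Signature:
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(v * p for v, p in zip(vec, plane)) >= 0 else 0
        for plane in planes
    )


planes = make_planes(dim=3, bits=8)
buckets: Dict[Signature, List[str]] = {}  # signature -> sticker ids


def store(sticker_id: str, vec: Sequence[float]) -> None:
    buckets.setdefault(signature(vec, planes), []).append(sticker_id)


def candidates(vec: Sequence[float]) -> List[str]:
    """Look up stickers whose embeddings fell into the same bucket."""
    return buckets.get(signature(vec, planes), [])


store("sticker_001", [1.0, 2.0, 3.0])
same_direction = candidates([2.0, 4.0, 6.0])     # same direction, same bucket
opposite = candidates([-1.0, -2.0, -3.0])        # opposite direction, different bucket
```

Because nearby vectors can still land in different buckets (and distant ones can collide), this would only produce candidates, not final matches.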

Do you know if there are any existing methods for solving this kind of problem?

lll-phill-lll commented 3 months ago

OK, since I apparently can't read properly, I'm moving my answer to a different issue.