progaem / a4iFfki

Achievements Telegram bot
MIT License

"Caching" stickers #7

Open lll-phill-lll opened 2 months ago

lll-phill-lll commented 2 months ago

Since our service isn't heavily loaded and the sentences aren't very long, we can safely use some kind of preprocessing for the sentences.

We can turn input sentences into embeddings and use them to evaluate the similarity between two sentences. But in that case, I'm not quite sure how to store them in the database. Comparing each new embedding with every stored one doesn't seem like a "cool production way" to do it.
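To make the concern concrete, here is a minimal sketch of the brute-force approach described above: every lookup compares the new sentence with every stored one. The `embed` function is a toy stand-in (character trigram counts) for a real embedding model, and `find_cached`, the `cache` layout, and the `0.8` threshold are all hypothetical names and values chosen for illustration, not anything from the bot's code.

```python
import math
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy stand-in for a real embedding model: character trigram counts."""
    text = " ".join(sentence.lower().split())  # normalize case and whitespace
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_cached(query: str, cache: dict, threshold: float = 0.8):
    """Linear scan: compare the query with every stored sentence.

    `cache` maps a stored sentence to its sticker id (assumed layout).
    Returns the sticker id of the best match above `threshold`, else None.
    """
    q = embed(query)
    best_key, best_score = None, 0.0
    for sentence in cache:
        score = cosine(q, embed(sentence))
        if score > best_score:
            best_key, best_score = sentence, score
    if best_key is not None and best_score >= threshold:
        return cache[best_key]
    return None
```

The cost of `find_cached` grows linearly with the number of stored sentences, which is exactly the part that does not map well onto an ordinary database index.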

We need to be able to get the same representation for all the "close" sentences.

Do you know if there are any existing methods for solving this kind of problem?
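
For reference, one existing family of methods aimed at this kind of problem is locality-sensitive hashing (LSH). Below is a rough sketch of the random-hyperplane variant (often called SimHash), under several assumptions: `embed` is again a toy hashed-trigram vector rather than a real embedding model, and `DIM`, `BITS`, the seed, and the function names are hypothetical. Similar vectors are likely, but not guaranteed, to receive the same signature.

```python
import random
import zlib

DIM = 256   # size of the toy feature vector (assumed value)
BITS = 16   # number of hyperplanes, i.e. signature length in bits (assumed value)


def embed(sentence: str) -> list:
    """Toy stand-in for a real embedding: hashed character trigram counts."""
    text = " ".join(sentence.lower().split())  # normalize case and whitespace
    vec = [0.0] * DIM
    for i in range(len(text) - 2):
        # crc32 is stable across runs, unlike Python's built-in hash() for str
        vec[zlib.crc32(text[i:i + 3].encode("utf-8")) % DIM] += 1.0
    return vec


# Fixed seed so the same hyperplanes are used on every run; signatures
# stored in a database are only comparable if the hyperplanes never change.
_rng = random.Random(42)
HYPERPLANES = [[_rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(BITS)]


def signature(sentence: str) -> int:
    """Pack one bit per hyperplane: which side of the plane the vector falls on."""
    vec = embed(sentence)
    bits = 0
    for plane in HYPERPLANES:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


# The signature is a plain integer, so it can serve as an exact-match key.
cache = {}
cache[signature("Won the chess tournament")] = "sticker_1"
hit = cache.get(signature("won the  chess tournament"))
```

With this design the lookup becomes an equality check on an integer, which an ordinary indexed column can handle. The trade-off is that two close sentences whose vectors fall on opposite sides of some hyperplane get different signatures; using fewer bits, or several independent signature tables, is a common way to reduce such misses at the cost of more false matches.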