Open grinnellian opened 4 years ago
I could see this being implemented as a simple threshold check, e.g. “90% of tokens are the same,” but anything more sophisticated falls into the NLP realm, where detecting duplicated information is still an area of ongoing research and probably outside the scope of this bot.
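To make the "90% of tokens" idea concrete, here is a rough sketch of what I have in mind. It is not tied to the bot's actual code: the function names, the tokenizing regex, and the choice of Jaccard similarity as the overlap measure are all my own assumptions, and the 0.9 threshold is just the number from above.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of word-like tokens.

    Hashtags and mentions keep their leading '#' or '@' (an assumption,
    so that '#defendPDX' and 'defendPDX' count as different tokens).
    """
    return set(re.findall(r"[#@]?\w+", text.lower()))


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity of the two token sets, from 0.0 to 1.0."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_near_duplicate(candidate: str, recent: list, threshold: float = 0.9) -> bool:
    """True if the candidate overlaps any recently repeated text by >= threshold."""
    return any(token_overlap(candidate, seen) >= threshold for seen in recent)
```

The bot would keep a short list of recently repeated texts and skip anything for which `is_near_duplicate` returns true. This only catches near-verbatim copies; two people describing the same event in different words would still both get through.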
I often see cases where multiple people attempt to RT w/ #defendPDX. (Heck, I've done it myself, before I learned to get better at checking other retweets first.) However, slowing down to check for other RTs reduces the utility of the repeater somewhat.
How would you like to handle duplicates and deduplication? Some ideas/cases to think about: