Open tassoman opened 2 years ago
A few years ago, you decided on case-insensitive hashtags (#3761).
You can see the i is different between the two hashtags.
Yes, sorry, I've only just noticed. 👀 If you don't mind transliterating, maybe we can ignore this issue... 🤔 Searches are already transliterated.
Now we have a popular hashtag containing the word "Mercoledì" (Wednesday) that comes up every week, so it's quite a mess because some people use the plain i and others use ì. It would be really good to have ì transliterated to i. Twitter returns tweets with both #mercoledi and #mercoledì:
https://twitter.com/hashtag/mercoledi
There are other vowels that have to be transliterated:
è é => e
ù => u
à => a
ò => o
and those are only for the Italian language.
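As a rough illustration of the mapping above, here is a minimal Python sketch using a fixed translation table. The names `ITALIAN_VOWELS` and `transliterate_tag` are made up for this example and are not part of Mastodon; the table covers only the vowels mentioned in this thread and assumes precomposed (NFC) input.

```python
# Hypothetical table: only the accented vowels mentioned in this thread.
# Assumes the input uses precomposed characters (NFC), e.g. a single "ì".
ITALIAN_VOWELS = str.maketrans({
    "à": "a",
    "è": "e",
    "é": "e",
    "ì": "i",
    "ò": "o",
    "ù": "u",
})


def transliterate_tag(name: str) -> str:
    """Replace the accented vowels in the table with their plain form."""
    return name.translate(ITALIAN_VOWELS)


print(transliterate_tag("mercoledì"))  # mercoledi
```

A fixed table like this leaves every character it does not list untouched, which is why other languages would need a longer table or a more general approach.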
Sorry to be a pest, but I've found a JavaScript approach on Stack Overflow: https://stackoverflow.com/a/2128054 Maybe the Django urlify JavaScript solution could work?
I found the data provider for trending tags by reading the relevant JavaScript action.
So I bet the /api/v1/trends/tags REST resource controller should transliterate its output data.
{
"name":"crushDelMercoledì",
"url":"https://mastodon.uno/tags/crushDelMercoled%C3%AC",
"history":[{"day":"1649894400","accounts":"1","uses":"1"}]
}
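To make the idea concrete, here is a hedged Python sketch of how entries shaped like the JSON above could be merged under a folded name. `fold` and `merge_trends` are hypothetical helpers, not Mastodon code, and the second tag in the sample data is invented for illustration. Note that simply summing `accounts` may double count an account that used both spellings.

```python
import unicodedata
from collections import defaultdict


def fold(name: str) -> str:
    """Lowercase and strip combining marks ('Mercoledì' -> 'mercoledi')."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def merge_trends(tags):
    """Sum per-day 'accounts' and 'uses' for tags sharing a folded name."""
    merged = defaultdict(lambda: defaultdict(lambda: {"accounts": 0, "uses": 0}))
    for tag in tags:
        days = merged[fold(tag["name"])]
        for entry in tag["history"]:
            # The API returns the counters as strings, so convert them first.
            days[entry["day"]]["accounts"] += int(entry["accounts"])
            days[entry["day"]]["uses"] += int(entry["uses"])
    return {
        name: {day: dict(totals) for day, totals in days.items()}
        for name, days in merged.items()
    }


tags = [
    {"name": "crushDelMercoledì",
     "history": [{"day": "1649894400", "accounts": "1", "uses": "1"}]},
    # Hypothetical second entry, not taken from the API output above.
    {"name": "crushdelmercoledi",
     "history": [{"day": "1649894400", "accounts": "2", "uses": "3"}]},
]
print(merge_trends(tags))
# {'crushdelmercoledi': {'1649894400': {'accounts': 3, 'uses': 4}}}
```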
If you search for "crush", you only see the transliterated result (which itself shows wrong account totals, 6 and 8)... ❓
Then, in the tags pages, we have both entries, with different toots:
The same problem happens with toots about the Icelandic singer Björk, because "ö" is missing from our Italian desktop keyboard, so on Mastodon we have different timelines about the same artist:
The search index in Elasticsearch uses ASCII normalization, but the database doesn’t. It’s not trivial to update the database schema in this case, which is why it hasn’t been done or prioritized, but it would be very nice to do.
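One plausible complication, which is an assumption on my part and not something stated in this thread, is that tags already stored under different spellings would collide once normalized and would need merging. A small Python sketch of detecting such collisions follows; `fold` and `find_collisions` are hypothetical helpers, and the folding only approximates ASCII normalization by stripping combining marks.

```python
import unicodedata
from collections import defaultdict


def fold(name: str) -> str:
    """Approximate ASCII folding: strip combining marks and lowercase."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def find_collisions(names):
    """Group tag names that become identical after folding."""
    groups = defaultdict(list)
    for name in names:
        groups[fold(name)].append(name)
    # Keep only the folded names shared by more than one stored spelling.
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


print(find_collisions(["Björk", "bjork", "mercoledì", "mercoledi", "mastodon"]))
# {'bjork': ['Björk', 'bjork'], 'mercoledi': ['mercoledì', 'mercoledi']}
```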
I had the same intuition; besides, I can't do anything in Ruby.
I think this issue can be kept for future use, if a major refactoring is going to be planned.
What about using Elasticsearch for hashtags as well?
Steps to reproduce the problem
Expected behaviour
trending stats should be aggregated and lowercase
Actual behaviour
trending stats for the same hashtag are split across its camel-case variants
Specifications
Browser: Firefox
OS: Windows