ksnovak / Multigame_Browser

An alternative Twitch directory, with a more customizable search. Try it out at:
https://multigame-browser.herokuapp.com/
MIT License

Sort out lang-country searching issue (e.g. "en" vs "en-gb") #93

Open ksnovak opened 6 years ago

ksnovak commented 6 years ago

This is a weird edge case. It seems like very few people specify "en-gb". But I encountered it naturally when trying to find RTGameCrowd.

When streamers set their language, they can specify it as a language-country code -- an ISO 639-1 language code plus a country code, such as "en-us" or "de-at". Viewers on Twitch don't see this information at all; they just see the main language ("English" or "German").

When querying the API, this does make a difference, however. If you search for "en" streams, you will not get any "en-gb" results. The two have entirely separate datasets.

Test this out by manually changing the language in the querystring. Searching for en-gb:

http://localhost:3001/?language=en-gb&game=League%20of%20Legends&game=Fortnite&game=World%20of%20Warcraft&game=Counter-Strike%3A%20Global%20Offensive&game=Assassin%27s%20Creed%20Odyssey&game=PLAYERUNKNOWN%27S%20BATTLEGROUNDS&game=Rocket%20League&game=FIFA%2019&game=Dota%202&game=Just%20Chatting&game=Grand%20Theft%20Auto%20V

Look at the top streamers, then try querying just en and see if they are visible at all. They won't be there.

Also note that Twitch's own Tag filtering is messed up. If you search for the English tag, it treats it just like searching for "en". It isn't a catch-all either.

ksnovak commented 6 years ago

Not every one of these codes is used. You can test this out quickly with our own API by changing the language parameter: http://localhost:3001/api/streams/top?streams_count=20&language=en

Example tests: en-gb is used, but bz, ca, ie, jm, nz, tt, us, and za are not. es-mx is used, but none of the others are. None of the German ones are used. Nor the French ones. zh-cn and zh-tw are used, but not sg or hk.
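For anyone repeating these checks, here is a rough sketch that generates one probe URL per code instead of editing the querystring by hand. The endpoint and the `streams_count` and `language` parameters are the ones from the URL above; the `probe_urls` helper is hypothetical and not part of this repo.

```python
from urllib.parse import urlencode

# Endpoint taken from the comment above; the helper below is hypothetical.
TOP_STREAMS_URL = "http://localhost:3001/api/streams/top"


def probe_urls(language, regions, streams_count=20):
    """Build one top-streams URL per language-country code, e.g. "en-gb"."""
    urls = {}
    for region in regions:
        code = f"{language}-{region}"
        query = urlencode({"streams_count": streams_count, "language": code})
        urls[code] = f"{TOP_STREAMS_URL}?{query}"
    return urls


# The English country codes tried in the example tests above.
english = probe_urls("en", ["gb", "bz", "ca", "ie", "jm", "nz", "tt", "us", "za"])
for code, url in english.items():
    print(code, url)
```

Opening each printed URL against a locally running server shows whether that code returns any streams.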

ksnovak commented 6 years ago

In fact, if we just make a gigantic query with all hyphenated languages here and start removing the ones we see, we find that the used languages are:

en-gb, pt-br, zh-tw, zh-cn, es-mx. If you remove those from the query (here), there are zero search results. At least at this time.
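The elimination process above could be scripted roughly like this. It is only a sketch: `search` is a hypothetical callable standing in for whatever actually runs the query and reports which language codes show up in the results, and `fake_search` exists purely so the example runs without a server.

```python
def find_used_languages(candidates, search):
    """Query with all remaining codes, dropping the ones that show up.

    `search` is a hypothetical callable: given a list of language codes, it
    returns the set of language codes seen among the returned streams.
    """
    remaining = list(candidates)
    used = []
    while remaining:
        seen = set(search(remaining)) & set(remaining)
        if not seen:
            break  # zero results for what is left, so none of those are in use
        used.extend(code for code in remaining if code in seen)
        remaining = [code for code in remaining if code not in seen]
    return used, remaining


# Stand-in for a real query, for illustration only. It returns a single
# language per call to mimic a result page dominated by one language.
LIVE_CODES = {"en-gb", "pt-br"}


def fake_search(languages):
    hits = [code for code in languages if code in LIVE_CODES]
    return set(hits[:1])


used, unused = find_used_languages(["en-gb", "en-us", "pt-br", "de-at"], fake_search)
print("used:", used)
print("unused:", unused)
```

The loop stops as soon as a query with the remaining codes comes back empty, which matches the "zero search results" check described above.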

ksnovak commented 6 years ago

Put in a fix that will query Twitch with all appropriate languages. This isn't perfect - especially for something like Chinese Chinese vs. Taiwanese Chinese. But right now English is the only language allowed anyway, so we can re-address later.

This also only covers those languages that I had positive results for when testing. At a different time of day, we may find that there are other languages in the same scenario.
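For reference, the idea behind the fix can be sketched as below. This is not the actual code from the fix, just a minimal illustration: the `LANGUAGE_VARIANTS` table holds only the codes found in use earlier in this thread, the function names are hypothetical, and sending the codes as a repeated `language` parameter is an assumption modeled on how `game` is repeated in the querystring above.

```python
from urllib.parse import urlencode

# Only the variants that turned up in testing in this thread; any other
# entries would be guesses. This table and the helpers are hypothetical.
LANGUAGE_VARIANTS = {
    "en": ["en-gb"],
    "es": ["es-mx"],
    "pt": ["pt-br"],
    "zh": ["zh-cn", "zh-tw"],
}


def expand_language(language):
    """Return the base language plus every known in-use variant of it."""
    base = language.lower().split("-")[0]
    return [base] + LANGUAGE_VARIANTS.get(base, [])


def language_query(language):
    """Encode the expanded codes as repeated `language` parameters (assumed format)."""
    return urlencode({"language": expand_language(language)}, doseq=True)


print(language_query("en"))
print(expand_language("zh-tw"))
```

Keeping the variants in one table means that codes discovered later, for example at a different time of day, only need a new entry there.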