dawoudt / JustWatchAPI

Python 3 JustWatch.com API - https://justwatch.com
MIT License

country 'LV', Latvia, returns no results on any shows #28

Closed: juliangaal closed this issue 5 years ago

juliangaal commented 5 years ago
just_watch = JustWatch(country='LV')
results = just_watch.search_for_item(query=<ANY_MOVIE>)

This returns no results, which is strange since Locale.parse('und_{}'.format('LV')).language returns 'lv', which is the correct language code, as far as I know. What else could be going wrong? Thanks a lot!

Does this mirror #26? Why would it work with, e.g. de_DE for Germany, but not lv_LV for Latvia? I tried to apply the fix in #26, but no change!

Edit: Same issue for 'SE', Sweden

draogn commented 5 years ago

I've had a dig through, and the JavaScript for the webpage has a mapping table rather than getting the details from a separate call. There are translations for de, it, pt, es, fr, ru, ko, ja, fi - everything else is in en. So the code as it stands doesn't work for any country whose native language is neither English nor on that list.

I've tried doing a call for LV overwriting the language to en (so it calls 'en_LV') and it definitely returns results.

Does anyone have a good suggestion for the neatest way to fix this? I don't like the idea of pulling in the JavaScript and processing it, and baking the table in would break if they add further language support.
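
For illustration only, the "baked-in table" option could look roughly like the sketch below. The set of languages is copied from the list observed in the page's JavaScript above; the names TRANSLATED_LANGUAGES and build_locale are hypothetical and not part of the library, and the table would go stale as soon as JustWatch adds a language.

```python
# Languages seen in the site's mapping table (per the comment above); may be out of date.
TRANSLATED_LANGUAGES = {"de", "it", "pt", "es", "fr", "ru", "ko", "ja", "fi"}


def build_locale(country, derived_language):
    """Hypothetical helper: keep the derived language only if the site translates it, else use 'en'."""
    language = derived_language if derived_language in TRANSLATED_LANGUAGES else "en"
    return "{}_{}".format(language, country.upper())


print(build_locale("LV", "lv"))  # en_LV
print(build_locale("DE", "de"))  # de_DE
```

This matches the manual test above, where forcing 'en' for LV produced results.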

juliangaal commented 5 years ago

Ah, I see my mistake. I simply did:

from justwatch import JustWatch
import json

just_watch = JustWatch(country='LV')
just_watch.language = 'en'
results = just_watch.search_for_item(query='Friends')
print(json.dumps(results, indent=2))

and got no results. But the language is already applied in the constructor, so setting it afterwards has no effect! Stupid me!
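
A toy sketch of why the assignment had no effect, assuming (as described above) that the constructor combines language and country into a locale once and later requests use that stored value. ToyClient and its locale attribute are stand-ins for illustration, not the library's actual code.

```python
class ToyClient:
    """Stand-in for a client that derives its locale once, in the constructor."""

    def __init__(self, country, language):
        self.country = country
        self.language = language
        # Computed once here; later changes to self.language are not picked up.
        self.locale = "{}_{}".format(language, country)


client = ToyClient(country="LV", language="lv")
client.language = "en"   # too late: the locale was already built
print(client.locale)     # lv_LV

client.locale = "en_LV"  # overriding the stored locale directly does take effect
print(client.locale)     # en_LV
```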

Just out of curiosity: How did you get to the API calls? I couldn't find anything on their website about an API. Or do you work for them? ;)

draogn commented 5 years ago

You can see what a website is doing by right-clicking on a page and selecting Inspect Element in Firefox (it's called something similar in Chrome). Then click the Network button and the XHR button, and either click something in the webpage or reload the page. You'll see all the interactions listed: whether each was a GET, POST, etc., and what the parameters and response were. Pasting the data into one of the online JSON beautifiers can help.

If that doesn't get you what you want and you know it's definitely there, then you may have to dig through the HTML and/or JavaScript to find what you're looking for.

draogn commented 5 years ago

I've stumbled upon where to get the correct locale - apis.justwatch.com/content/locales/state

import requests
r = requests.Session()
HEADER = {'User-Agent':'JustWatch Python client (github.com/dawoudt/JustWatchAPI)'}
api_url = 'https://apis.justwatch.com/content/locales/state'
rr = r.get(api_url, headers=HEADER)
results = rr.json()
print(results[0].keys())
print([(i['exposed_url_part'], i['full_locale']) for i in results])

gives

dict_keys(['exposed_url_part', 'is_standard_locale', 'full_locale', 'i18n_state', 'iso_3166_2', 'country', 'currency', 'currency_name', 'country_names'])

[('us', 'en_US'), ('de', 'de_DE'), ('br', 'pt_BR'), ('au', 'en_AU'), ('nz', 'en_NZ'), ('ca', 'en_CA'), ('uk', 'en_GB'), ('za', 'en_ZA'), ('ie', 'en_IE'), ('nl', 'en_NL'), ('lt', 'en_LT'), ('se', 'en_SE'), ('th', 'en_TH'), ('pt', 'pt_PT'), ('hu', 'hu_HU'), ('bg', 'bg_BG'), ('no', 'en_NO'), ('ru', 'ru_RU'), ('ee', 'en_EE'), ('lv', 'en_LV'), ('in', 'en_IN'), ('ch', 'de_CH'), ('at', 'de_AT'), ('my', 'en_MY'), ('id', 'en_ID'), ('ph', 'en_PH'), ('ve', 'es_VE'), ('hk', 'en_HK'), ('tw', 'en_TW'), ('sg', 'en_SG'), ('vn', 'vi_VN'), ('pl', 'pl_PL'), ('fi', 'fi_FI'), ('dk', 'en_DK'), ('ro', 'ro_RO'), ('co', 'es_CO'), ('es', 'es_ES'), ('tw-zh', 'zh_TW'), ('fr', 'fr_FR'), ('kr', 'ko_KR'), ('it', 'it_IT'), ('mx', 'es_MX'), ('jp', 'ja_JP')]

So an initial search against this for lv will give back 'en_LV' from 'full_locale'

Probably just need to update the init to try to get this data and fall back on the current derivation if not found.
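
A minimal sketch of that lookup-with-fallback, kept separate from the network call so it can be tried offline. The sample records are a few entries copied from the output above with the other keys omitted; find_full_locale is a hypothetical name. It matches on exposed_url_part because those values are visible in the output; note that this is not always the ISO country code (the list has 'uk' for en_GB and a separate 'tw-zh' entry).

```python
def find_full_locale(records, country, fallback_language):
    """Return 'full_locale' for the country, or a derived '<lang>_<COUNTRY>' if it is not listed."""
    wanted = country.lower()
    for record in records:
        if record.get("exposed_url_part", "").lower() == wanted:
            return record["full_locale"]
    # Not in the list: fall back on the current derivation.
    return "{}_{}".format(fallback_language, country.upper())


# A few entries copied from the output above (other keys omitted).
sample = [
    {"exposed_url_part": "de", "full_locale": "de_DE"},
    {"exposed_url_part": "lv", "full_locale": "en_LV"},
    {"exposed_url_part": "se", "full_locale": "en_SE"},
]

print(find_full_locale(sample, "LV", "lv"))  # en_LV
print(find_full_locale(sample, "XX", "en"))  # en_XX (not listed, so derived)
```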

BongenGitHub commented 5 years ago

Hello all, I updated my init based on your fix and it now works. Here is the adjustment:

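import requests
from babel import Locale  # assumed source of Locale, as used earlier in this thread

HEADER = {'User-Agent': 'JustWatch Python client (github.com/dawoudt/JustWatchAPI)'}

class JustWatch:
    def __init__(self, country='AU', use_sessions=True, **kwargs):
        self.kwargs = kwargs
        self.country = country
        # Derived language, now only used as a fallback in get_locale().
        self.language = Locale.parse('und_{}'.format(self.country)).language
        self.kwargs_cinema = []
        if use_sessions:
            self.requests = requests.Session()
        else:
            self.requests = requests
        self.locale = self.get_locale(self.country)

    def get_locale(self, country):
        """Return the full locale JustWatch lists for the country, e.g. 'en_LV'."""
        api_url = 'https://apis.justwatch.com/content/locales/state'
        r = self.requests.get(api_url, headers=HEADER)
        results = r.json()
        for i in results:
            if i['iso_3166_2'] == country:
                return i['full_locale']
        # Country not listed: fall back on the current derivation.
        return self.language + '_' + self.country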
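
One possible hardening, sketched under assumptions: the lookup above falls back only when the country is missing from the list, so a failed request or an unparseable response would still raise. The version below also falls back in those cases. It uses the standard library's urllib instead of requests to stay dependency-free, assumes the endpoint from this thread still responds with the same fields, and assumes iso_3166_2 holds upper-case codes such as 'LV'. fetch_locales and resolve_locale are hypothetical names.

```python
import json
import urllib.request

LOCALES_URL = "https://apis.justwatch.com/content/locales/state"
HEADER = {"User-Agent": "JustWatch Python client (github.com/dawoudt/JustWatchAPI)"}


def fetch_locales(timeout=5):
    """Download and parse the locale list (standard library instead of requests)."""
    request = urllib.request.Request(LOCALES_URL, headers=HEADER)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def resolve_locale(country, derived_language, fetch=fetch_locales):
    """Prefer the server's full_locale; use the derived one if the fetch or parse fails."""
    try:
        records = fetch()
    except (OSError, ValueError):
        # URLError and timeouts are OSError subclasses; bad JSON raises ValueError.
        records = []
    for record in records:
        if record.get("iso_3166_2") == country:
            return record["full_locale"]
    return "{}_{}".format(derived_language, country)
```

Calling resolve_locale('LV', 'lv') performs the real request; passing a different fetch callable makes the fallback easy to exercise without network access.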

juliangaal commented 5 years ago

Looks great, thanks a lot. I'll test it asap