CastagnaIT / plugin.video.netflix

InputStream based Netflix plugin for Kodi
MIT License

The Spanish audio/subtitles are not automatically selected when the available track is for European Spanish (es-ES) #11

Closed Paco8 closed 5 years ago

Paco8 commented 5 years ago

I reported the problem for the 0.13.x version of the addon here: https://github.com/asciidisco/plugin.video.netflix/issues/605 (although that bug report was about the subtitles, the issue affects the audio as well).

I'm afraid this version has the same problem.

It seems the bug is actually in Kodi, but since it's a very important problem for people from Spain, it would be great if a workaround for the problem could be implemented in the addon.

Multiconecta commented 5 years ago

I have the same problem with Brazilian Portuguese, which Netflix classifies as pt-BR, just as European Spanish is es-ES. The problem is that Kodi treats every code of the form "2-letter language code, hyphen, 2-letter region code" as invalid (Unknown). That's really odd, as we can set "Portuguese (Brazil)" and "Spanish (Spain)" in the audio or subtitle player settings, but there is no 2-letter or 3-letter ISO standard code for those language variations (I've heard Simplified Chinese and Luxembourg Dutch have the same problem).

Just to start the discussion, we could change the code that gets the stream manifest from Netflix, with something like this (and similarly for subtitles):

MSLv2.py: lang = audio_track['language'][:2] if len(audio_track['language']) == 5 and audio_track['language'][2] == '-' else audio_track['language']

That would change all audio tracks that are 'pt-BR' to 'pt', all 'es-ES' to 'es', and so on. It does not really solve the problem, but the tracks will be shown as 'Portuguese' or 'Spanish'. If you set your language to 'Spanish' instead of 'European Spanish', the track would be selected. Of course, there is a good chance you will hear Latin American Spanish even when both variants are available, so it is not perfect yet.
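A minimal sketch of that idea as a standalone helper rather than an inline expression. The name `strip_region` is made up for this example and is not part of the addon:

```python
def strip_region(language_tag):
    """Return the 2-letter language part of an 'll-CC' tag, else the tag unchanged."""
    # Only tags shaped exactly like 'pt-BR' (2 letters, hyphen, 2 letters) are shortened.
    if len(language_tag) == 5 and language_tag[2] == '-':
        return language_tag[:2]
    return language_tag


for tag in ('pt-BR', 'es-ES', 'es', 'es-419'):
    print(tag, '->', strip_region(tag))
```

Longer tags such as 'es-419' are deliberately left alone, since the length check only matches the 5-character form.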

The best solution would be for Kodi core to "understand" these "other languages", using the 'pt-BR' code or 'pob' or 'ptb'. Those are not really ISO language codes, but they are used in many places. If we can select those languages in Kodi, it should have some code for understanding them.

Multiconecta commented 5 years ago

Maybe we could find a solution using the languagecodes section in advancedsettings.xml. The language find and compare functions in Kodi core use these user-defined language codes, so maybe we can use them for this issue. That code also has region codes (that would be the "-BR", "-ES", "-LA"), but I did not find whether they are used anywhere.

CastagnaIT commented 5 years ago

I have read the thread, but at the moment I am working on something else. I'll consider some sort of change later.

The best solution would be for Kodi to be converted to use standard IETF BCP 47 language tags, which support every language and every region code.

helviojr commented 5 years ago

That's ok, @CastagnaIT . I will try to find something to work on.

I've done some testing, like using any 2-letter or 3-letter code in advancedsettings.xml, as in

  <languagecodes>
    <code>
      <short>pb</short>
      <long>Portuguese (Brazil)</long>
    </code>
    <code>
      <short>esp</short>
      <long>Spanish (Spain)</long>
    </code>
  </languagecodes>

And renaming the Netflix codes in MSLv2.py:

lang='pb' if audio_track['language']  == 'pt-BR' else 'esp' if audio_track['language'] == 'es-ES' else audio_track['language'],

for audio tracks and:

lang='pb' if text_track.get('language') == 'pt-BR' else 'esp' if text_track.get('language') == 'es-ES' else text_track.get('language'),

for subtitles tracks.

The result is that the track languages are correctly identified when you open the list of audio/subtitle languages. The problem is that this is not enough for Kodi to choose that track based on the default audio or subtitle language settings. Kodi core should start using the language-tag/region-subtag model when choosing which tracks to show or which subtitles to download, for example (I'm not sure whether the subtitle download scripts already do that). Another benefit would be automatically choosing a generic track if the specific one is not present (choose 'es' if 'es-ES' does not exist).
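The two chained conditionals could also be written table-driven, which makes it easier to add more pairs later. This is only a sketch: `CUSTOM_CODES` and `to_custom_code` are names invented here, and the 'pb' and 'esp' codes only mean something to Kodi if they are defined in advancedsettings.xml as shown earlier:

```python
# Netflix locale code -> custom code defined by the user in advancedsettings.xml
CUSTOM_CODES = {
    'pt-BR': 'pb',
    'es-ES': 'esp',
}


def to_custom_code(language):
    """Map a Netflix language code to the custom code, or return it unchanged."""
    return CUSTOM_CODES.get(language, language)


audio_track = {'language': 'pt-BR'}
text_track = {'language': 'en'}
print(to_custom_code(audio_track['language']))   # custom code for Brazilian Portuguese
print(to_custom_code(text_track.get('language')))  # no mapping, returned as is
```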

(By the way, I am the same Multiconecta as above, but this is my personal profile, which I will use for this project from now on.)

helviojr commented 5 years ago

@CastagnaIT, could you help me with just one thing? When playback starts, does the addon just send the complete manifest (all tracks) to Kodi, and Kodi chooses which language to play and which subtitle to show (or not to show subtitles at all)? I mean, the addon does not choose, it just presents all the info?

CastagnaIT commented 5 years ago

It is not that direct. When starting a video, the addon builds an MPD file (using the manifest info) to send to InputStream. This file contains the audio, subtitle, and video streams, but not all of them; some are filtered out. After this, InputStream communicates with Kodi.

You can instruct InputStream by tagging which stream to set as default, but currently setting the default stream for subtitles is problematic: there are several cases to study, and there is a problem with InputStream, which cannot set a "forced" subtitle stream as the default.

After the default stream is set, Kodi applies its own selection logic, which is based on what you have chosen in Kodi's subtitle (or audio) settings.

Using advancedsettings.xml can be a workaround, but it is not practical: not everyone is experienced in editing files, and on devices such as Android or Linux an inexperienced user will have quite a few difficulties.

helviojr commented 5 years ago

Using advancedsettings.xml can be a workaround, but it is not practical: not everyone is experienced in editing files, and on devices such as Android or Linux an inexperienced user will have quite a few difficulties.

I get it. Ideally, this already standard ll-CC format would be recognized by Kodi. It doesn't seem to be too big a problem. Of course it would be impossible to list every language in every country, but the list can be narrowed to existing languages (and extended in each version or via advancedsettings.xml).

There is one thing that intrigues me: at least in my Kodi, I can choose "Portuguese (Brazil)" without adding it manually, so this "language" exists somewhere in Kodi. As there is no ISO code for it, tracks in this language cannot be found. Anyway, it is a Kodi core issue; all we can do here is a workaround.

helviojr commented 5 years ago

And renaming the Netflix codes in MSLv2.py:

I just realized I was using an old version. I have now changed to the master branch. The correct file, instead of MSLv2.py, would be resources\lib\services\msl\converter.py

helviojr commented 5 years ago

Some findings (not a solution, but a workaround for some cases); a long text with some details:

  1. Some Kodi functions use ll-CC to determine language and region.
  2. The function CLangCodeExpander::Lookup in xbmc/utils/LangCodeExpander.cpp does look for a hyphen and looks up the region code, so the code confirms that the standard IETF BCP 47 form is supported (2-letter ISO 639 language code, hyphen, 2-letter ISO 3166 country code).
  3. The problem is that the function looks up the region in the ISO 639 table. That is quite wrong, as it is a table of language codes, and in these codes what follows the hyphen is a country/region code (ISO 3166-1). For example, 'pt-BR' is expanded to "Portuguese - Breton".
  4. The interesting thing is that I can create a custom language 'BR' = 'Brazil' with advancedsettings.xml; that way, 'pt-BR' tracks are recognized as "Portuguese - Brazil". This would not work in every case, though: if we created a custom language 'ES' = 'Spain', all Spanish tracks would be renamed ('es-ES' = "Spain - Spain" and 'es' = "Spain").
  5. That leads to another problem: Kodi stores language/country pairs as English text in the form "Language (Country)", for example "Portuguese (Brazil)", but CLangCodeExpander::Lookup expands 'pt-BR' to "Portuguese - Brazil". Clearly, there should be a standard for that. As it stands, the track will not be chosen.
  6. The solution in my case was to create a second custom language 'pb' = "Portuguese - Brazil" and to choose "Portuguese - Brazil" instead of "Portuguese (Brazil)" in Kodi's subtitle language setting.
  7. Finally, the last problem: Kodi is choosing the "forced" subtitle track instead of the normal one. I couldn't figure out why.

Once again, the solution is to fix Kodi core's CLangCodeExpander::Lookup function so that it looks up the region in the correct ISO 3166-1 table. And, as Kodi chooses track languages using the expanded English text, it will be necessary to define a standard format for language + region within it.
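To illustrate what such a corrected lookup could look like, here is a small sketch in Python (not Kodi's actual C++ code). The tables are tiny hand-written subsets, the function name `expand` is invented, and the "Language (Country)" output format is simply the one Kodi's settings already show; falling back to the generic language name for unknown regions is an assumption of this sketch:

```python
# Illustrative subsets only; real tables would cover ISO 639-1 and ISO 3166-1.
LANGUAGES = {'pt': 'Portuguese', 'es': 'Spanish', 'fr': 'French', 'br': 'Breton'}
REGIONS = {'BR': 'Brazil', 'ES': 'Spain', 'CA': 'Canada'}


def expand(tag):
    """Expand 'll' or 'll-CC' to English text, looking up each part in its own table."""
    lang, _, region = tag.partition('-')
    lang_name = LANGUAGES.get(lang.lower())
    if lang_name is None:
        return None  # unknown language
    if not region:
        return lang_name
    region_name = REGIONS.get(region.upper())
    if region_name is None:
        return lang_name  # unknown region: fall back to the generic language
    return '{} ({})'.format(lang_name, region_name)


print(expand('pt-BR'))  # region is looked up in REGIONS, so 'BR' is Brazil, not Breton
print(expand('br'))     # as a language code, 'br' is still Breton
```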

MediaBrasil commented 5 years ago

@helviojr Read this issue in the Kodi master repository: https://github.com/xbmc/xbmc/issues/15308 It's a hack, but it works.

Regards, Wanilton

CastagnaIT commented 5 years ago

@helviojr what do you think of this conversion table used in the Amazon plugin? Can I just implement it the way it is? See line 34, _AdjustLocale(): https://github.com/Varstahl/xbmc/blob/master/plugin.video.amazon-test/resources/lib/proxy.py

Taking into account that I might add it as a configurable option in the settings, I have two questions, because I have not fully understood how the Amazon code works:

  1. For example, 'es-es' will be converted to 'es'. If there is already an 'es' language, then I should eliminate it, right?
  2. From the code, 'pt-BR' will be converted to 'pt-Brazil'. Does Kodi understand the 'pt-Brasil' code?!?

Because of my geographical position I don't see the other languages, so I can't test them, not even by changing the language in the Netflix profile... oh well.

Paco8 commented 5 years ago

I think the Amazon addon checks whether there's only one track for the language, and in that case it removes the country code. For example, if a movie only has a "pt-BR" track, then it's renamed to "pt". But if there are both "pt" and "pt-BR", then "pt-BR" is renamed to "pt-Brasil" (so Kodi will use the "pt" track by default).

I don't think Kodi understands "pt-Brasil"; I think it's renamed because otherwise Kodi would display pt-BR as "Portuguese - Breton" (Kodi bug).

For the netflix addon, I'm using this patch: https://github.com/Paco8/plugin.video.netflix/commit/d258277f005ef0aa0e8063889a287ee55d8ab0a7

It renames "es-ES" to "es", but only if an "es" track doesn't already exist.
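The rule in that last sentence can be sketched as follows. This is not the code from the linked commit, only an illustration of the described behavior; `rename_if_no_generic` and the simplified track dictionaries are made up for the example:

```python
def rename_if_no_generic(tracks, locale_code='es-ES'):
    """Return a copy of tracks with locale_code renamed to its bare language
    code, but only when no track already uses that bare code."""
    generic = locale_code.split('-')[0]
    if any(track['language'] == generic for track in tracks):
        # A generic track (e.g. 'es') already exists: leave everything untouched.
        return [dict(track) for track in tracks]
    return [
        dict(track, language=generic) if track['language'] == locale_code else dict(track)
        for track in tracks
    ]


only_european = [{'language': 'es-ES'}, {'language': 'en'}]
both_variants = [{'language': 'es'}, {'language': 'es-ES'}]
print(rename_if_no_generic(only_european))
print(rename_if_no_generic(both_variants))
```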

CastagnaIT commented 5 years ago

I have implemented the changes used in the Amazon video addon. If you can all give me feedback on whether the behavior is ok, I can be done with this.

P.S. There are no options; the language code change is active immediately.

CastagnaIT commented 5 years ago

Sorry, I made a mistake. I'll put the test file up later.

CastagnaIT commented 5 years ago

plugin.video.netflix_LangCodeTest01.zip

Paco8 commented 5 years ago

At least for (European) Spanish it's not working correctly. It works well if there's only one Spanish track, but if there are two (for example normal subtitles and forced subtitles in Spanish) the code conversion is not done.

There are some differences compared to Amazon: in Amazon, "es-419" is the code for Latin American Spanish, but in Netflix the code seems to be just "es". In Amazon the country code is in lower case (es-es, en-us, and so on), but in Netflix the country is in upper case: es-ES. I think this is what makes the conversion fail when there is more than one Spanish subtitle track.

Changing line 304 to this seems to fix it:

    new_lang = p1.lower() + ('' if p1 == p2.lower() else separator + p2.upper())

I couldn't test yet what happens if a movie has both European and Latin American Spanish. I knew of a movie which had subtitles in both variants, but it looks like they removed the European one.

Edit: if there are both European and Latin American, there's a problem. The Latin American code is 'es', but the AdjustLocale function would rename 'es-ES' to 'es' as well, so in the track selection menu there's no way to tell them apart. Maybe that could be fixed with this:

    ...
    if 'es-ES' == langCode: return 'es-Spain'
    new_lang = p1.lower() + ('' if p1 == p2.lower() else separator + p2.upper())
    ...

But forced subtitles mess things up, because if there are normal and forced subtitles, there are two Spanish subtitles and both are then renamed to 'es-Spain', which means that Kodi won't select either by default.

Paco8 commented 5 years ago

I think the Amazon function is not completely applicable here, because there are some differences. For example, it's not unusual for movies on Prime Video in Spain to have both European and Latin American audio and subtitles. This is unlikely on Netflix.

I've modified the code, which is now simpler. It basically checks whether the list already contains a language code without the country (es, pt...). In that case it does not rename es-ES to es or pt-BR to pt; instead it changes them to es-Spain and pt-Brazil, to prevent Kodi from showing something like "Portuguese - Breton".

Otherwise it removes the country code and leaves only the language code.

def _fix_locale_languages(data_list):
    """Strip or rename the locale codes of the tracks so that Kodi can select a default track"""
    # Count the number of duplicates with the same language codes
    langCount = {}
    for item in data_list:
        if item.get('isNoneTrack',False):
            continue
        lang_code = item['language']
        if lang_code not in langCount:
            langCount[lang_code] = 0
        langCount[lang_code] += 1
    common.debug('langCount: ' + str(langCount))
    common.debug('data_list: ' + str(data_list))

    # Conversion table to prevent Kodi to display pt-BR as Portuguese - Breton and so on
    locale_conversion_table = {
        'es-ES': 'es-Spain',
        'pt-BR': 'pt-Brazil',
        'fr-CA': 'fr-Canada'
    }

    # Replace the locale codes of the tracks with the new ones
    for item in data_list:
        if item.get('isNoneTrack', False):
            continue
        locale_code = item['language']
        lang_code = locale_code[0:2]
        if lang_code in langCount and langCount[lang_code] > 0:
            if locale_code in locale_conversion_table:
                item['language'] = locale_conversion_table[locale_code]
        else:
            item['language'] = lang_code

CastagnaIT commented 5 years ago

I understand the logic described, but there are things wrong with the code you posted. For example, langCount is always > 0, so the else part will never be executed. I'll now look at what to do.

CastagnaIT commented 5 years ago

I fixed the code, but the main problem is that the langCount counting fails because we have more complex situations. For example, the audio tracks are divided by channels and audio descriptions:

    es-ES ch5.1 [audio-description]
    es-ES ch5.1
    es-ES ch2.0 [audio-description]
    es-ES ch2.0

In this case (for now) the count comes out as 4, and the conversion fails in some cases. The subtitle tracks have the same problem: there can be multiple tracks with the same locale code because of the forced subtitles, which (depending on the case) double the count.
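A small illustration of why counting is misleading for the four es-ES audio tracks in this example, while a plain presence check is not. The track dictionaries are simplified and the `channels` and `audio_description` field names are invented for the sketch:

```python
from collections import Counter

# Hypothetical, simplified versions of the four audio tracks from the example.
tracks = [
    {'language': 'es-ES', 'channels': '5.1', 'audio_description': True},
    {'language': 'es-ES', 'channels': '5.1', 'audio_description': False},
    {'language': 'es-ES', 'channels': '2.0', 'audio_description': True},
    {'language': 'es-ES', 'channels': '2.0', 'audio_description': False},
]

# Counting per language code says "4", although there is only one language variant.
counts = Counter(track['language'] for track in tracks)

# A presence check ignores how many tracks share a code.
languages = {track['language'] for track in tracks}
generic_exists = 'es' in languages

print(counts['es-ES'], generic_exists)
```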

I think I've found an acceptable enough solution that works this way:

    If langCode contains only the language part, return it unchanged [end]
    If langCode contains language+country, go to the conversion table:
        - if there is a corresponding value, convert the locale value [end]
        - if there isn't a corresponding value:
            - and a locale without country does not exist (e.g. 'es'), return only the language part [end]
            - and a locale without country exists (e.g. 'es'), return language+country again [end]
              (in this case the user should report that the mapping is missing and needs to be added)

This way, I just need to know whether a locale without country exists or not, without counting, which fails because of the problem mentioned above.

I'll finish the code today, then I'll post the zip file here so you can give your opinion.
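The decision steps described above can be sketched like this. It is only an illustration, not the code in the test zip: `adjust_locale`, `convert_tracks` and the simplified track dictionaries are invented names, and the table values are the ones proposed earlier in this thread:

```python
CONVERSION_TABLE = {
    'es-ES': 'es-Spain',
    'pt-BR': 'pt-Brazil',
    'fr-CA': 'fr-Canada',
}


def adjust_locale(locale_code, generic_exists):
    """Apply the decision steps to a single language code."""
    if '-' not in locale_code:
        return locale_code                    # language only: unchanged
    if locale_code in CONVERSION_TABLE:
        return CONVERSION_TABLE[locale_code]  # known mapping
    if not generic_exists:
        return locale_code.split('-')[0]      # no generic track: strip the country
    return locale_code                        # mapping missing, should be reported


def convert_tracks(tracks):
    """Return copies of the tracks with adjusted codes, using a presence check only."""
    languages = {track['language'] for track in tracks}
    return [
        dict(track, language=adjust_locale(
            track['language'], track['language'].split('-')[0] in languages))
        for track in tracks
    ]


print(convert_tracks([{'language': 'it-IT'}, {'language': 'es-ES'}, {'language': 'en'}]))
```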

CastagnaIT commented 5 years ago

I did a couple of tests, it seems to me to be okay, can you confirm?

plugin.video.netflix_LangCodeTest02.zip

mansig88 commented 5 years ago

I have tried several episodes from Spain and it seems to work, many thanks!

Paco8 commented 5 years ago

This version has the same problem as the first version. The function _AdjustLocale receives the country code in upper case (es-ES), so the comparison p1 == p2 fails, and es-ES is converted to es-Spain. However, Kodi only selects a language by default if it doesn't contain the country code, so the function has to avoid returning the country code whenever possible.

I would simplify the function like this:

def _AdjustLocale(locale_code, lang_code_without_country_exists):
    locale_conversion_table = {
        'es-ES': 'es-Spain',
        'pt-BR': 'pt-Brazil',
        'fr-CA': 'fr-Canada',
        'ar-EG': 'ar-Egypt',
        'nl-BE': 'nl-Belgium'
    }

    language_code = locale_code[0:2]

    if not lang_code_without_country_exists:
        return language_code
    else:
        if locale_code in locale_conversion_table:
            return locale_conversion_table[locale_code]
        else:
            return locale_code

So, for example, if there are one or several "es-ES" tracks (but no "es"), all of them are converted to "es". If there's an "es" track but also one or more "es-ES", the "es" is left as is, and the "es-ES" tracks are converted to "es-Spain".

mansig88 commented 5 years ago

Yes. In some TV shows, like Suburra, the audio tracks are listed as either "Spanish" or "Spanish - Spain". "Spanish" is the Latin American track, and Kodi automatically chooses it instead of the "Spanish - Spain" tracks.

mansig88 commented 5 years ago

This version has the same problem as the first version. The function _AdjustLocale receives the country code in upper case (es-ES), so the comparison p1 == p2 fails, and es-ES is converted to es-Spain. However, Kodi only selects a language by default if it doesn't contain the country code, so the function has to avoid returning the country code whenever possible.

I would simplify the function like this:

def _AdjustLocale(locale_code, lang_code_without_country_exists):
    locale_conversion_table = {
        'es-ES': 'es-Spain',
        'pt-BR': 'pt-Brazil',
        'fr-CA': 'fr-Canada',
        'ar-EG': 'ar-Egypt',
        'nl-BE': 'nl-Belgium'
    }
    try:
        p1, p2 = locale_code.split('-')
    except ValueError:
        # The code does not split into exactly two parts: use it whole
        p1 = locale_code
        p2 = locale_code

    if not lang_code_without_country_exists:
        return p1
    else:
        if locale_code in locale_conversion_table:
            return locale_conversion_table[locale_code]
        else:
            return locale_code

So, for example, if there are one or several "es-ES" tracks (but no "es"), all of them are converted to "es". If there's an "es" track but also one or more "es-ES", the "es" is left as is, and the "es-ES" tracks are converted to "es-Spain".

Paco, which file should I modify to make this function work? Thanks!

Paco8 commented 5 years ago

The file is resources/lib/services/msl/converter.py. I attach the version with my modification: converter.zip

If there are tracks with both "es" (Latin American) and "es-ES" (Spanish from Spain), the function gives preference to the Latin American track (otherwise I guess there would be complaints from Latin American users). In my case the show Suburra only has tracks in Spanish from Spain. What country are you in?

mansig88 commented 5 years ago

I'm in Spain, but I get both Spanish variants in Suburra..... How is that possible?

Paco8 commented 5 years ago

I don't know. I'm also in Spain, I tried with the first episode of season 1 and the first episode of season 2 and I only get audio tracks in Spanish from Spain.

mansig88 commented 5 years ago

Does Spanish (Spain) work with your converter.zip?

mansig88 commented 5 years ago

It doesn't work for me; you can see in my attachment that there are a lot of tracks in Suburra 2x01: Tracks

thanks!

Paco8 commented 5 years ago

It doesn't work for me; you can see in my attachment that there are a lot of tracks in Suburra 2x01

In my case that episode doesn't have a Latin American track, only European Spanish, which is renamed by the function to Spanish, and works as expected.

suburra_audio

Are you using a VPN or something?

The current function gives preference to Latin American Spanish (considering that in Spain mostly European Spanish tracks are available). To give preference to European Spanish when both tracks exist, I guess it will be necessary to add an option in the addon settings to choose which version the user wants to use.

CastagnaIT commented 5 years ago

Oh wait, I read the wrong response..

CastagnaIT commented 5 years ago

I approved your simplified code. I had not taken into account the problem that Kodi does not select tracks with a country code. I do not like this technique very much; I hope this will be one of the first fixes made in the next version of Kodi...

mansig88 commented 5 years ago

Hi Paco, no, I don't use a VPN.... :'( Do you know what I can do to give preference to Spanish (Spain) instead of Latin American Spanish? Thanks!

Paco8 commented 5 years ago

Hi Paco, no I don't use VPN.... :'(

Weird. Is it this show? https://www.netflix.com/title/80081537 For some reason I don't get the Latin American tracks.

Do you know what I can do to give preference to Spanish (Spain) instead of Latin American Spanish? Thanks!

Try the converter.py that I'm attaching.

It adds these lines to swap the Spanish and Latin American codes:

        if item['language'] == 'es': item['language'] = 'es-Latinoamerica'
        elif item['language'] == 'es-ES': item['language'] = 'es'

converter.zip
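As a rough sketch of how the swap could be tied to a user preference, as suggested earlier in the thread: the `prefer_european` parameter stands in for an addon setting that does not exist yet, and `swap_spanish_codes` is a name made up for this example. Only the two renames themselves come from the lines above:

```python
def swap_spanish_codes(tracks, prefer_european=True):
    """Return copies of the tracks, optionally swapping the Spanish codes so that
    European Spanish becomes the plain 'es' track that Kodi selects by default."""
    result = []
    for item in tracks:
        item = dict(item)
        if prefer_european:
            if item['language'] == 'es':
                item['language'] = 'es-Latinoamerica'
            elif item['language'] == 'es-ES':
                item['language'] = 'es'
        result.append(item)
    return result


tracks = [{'language': 'es'}, {'language': 'es-ES'}, {'language': 'en'}]
print(swap_spanish_codes(tracks))
print(swap_spanish_codes(tracks, prefer_european=False))
```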

mansig88 commented 5 years ago

I have tested other shows, and Latin American Spanish is the default.... With this converter.py, Spanish (Spain) is the default!!!! Perfect!! Thank you very much, Paco ;)