crunchy-labs / crunchy-cli

👇 Command-line downloader for Crunchyroll
MIT License
607 stars 63 forks

Language country tag incompatibility #330

Closed: exus85 closed this issue 8 months ago

exus85 commented 9 months ago

Audio tracks and subtitle languages are tagged in the en-US, ja-JP, it-IT format, but Emby (and maybe Jellyfin) can only read the two-letter IETF tag without the subregion (en, ja, it, ...). Is it possible to change the way tracks are tagged? I asked the Emby devs and they added the ja-JP tag, but as soon as I asked about adding other languages they said that the format is uncommon, and I don't know if they'll ever add them. I'm currently using a script to edit all the tags, but maybe a change is possible? Thank you!

bytedream commented 9 months ago

Hey, thanks for the suggestion. Crunchyroll provides the languages in the form en-US, ja-JP, and so on. The issue I see with supporting alternative representations like the IETF one you suggest is overlapping languages. There are multiple languages with "dialects", e.g. zh-CN, zh-HK and zh-TW are all Chinese (full list here), and I don't know how much they differ or whether it's tolerable to group them under one IETF language tag (as I don't speak any language which has multiple tags). Plus, the language tag may be duplicated: if an episode has e.g. zh-CN and zh-HK, that results in two videos with the same IETF tag.
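To make the overlap concrete, here is a minimal Python sketch. It assumes the short tag is simply the part of the locale before the first hyphen; the function names `primary_subtag` and `find_collisions` are hypothetical and are not part of crunchy-cli.

```python
from collections import defaultdict


def primary_subtag(locale: str) -> str:
    """Return the language part of a tag such as 'zh-CN' (gives 'zh')."""
    return locale.split("-", 1)[0].lower()


def find_collisions(locales: list[str]) -> dict[str, list[str]]:
    """Group locales by short tag; keep only short tags shared by several locales."""
    groups: dict[str, list[str]] = defaultdict(list)
    for locale in locales:
        groups[primary_subtag(locale)].append(locale)
    return {short: full for short, full in groups.items() if len(full) > 1}


# Example track list for one episode (illustrative data, not from Crunchyroll).
tracks = ["ja-JP", "en-US", "zh-CN", "zh-HK", "zh-TW"]
print(find_collisions(tracks))  # {'zh': ['zh-CN', 'zh-HK', 'zh-TW']}
```

With this naive stripping, all three Chinese locales end up under the single tag zh, which is exactly the duplication concern raised above.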

exus85 commented 9 months ago

I understand (I don't speak any of the languages you mentioned, and I don't know if there would be a real difference). Maybe you could leave the decision to the user and accept both formats. If I set crunchy archive -a ja-JP -a it-IT -s it-IT, it would save the tracks as ja-JP and it-IT. If I instead set crunchy archive -a ja -a it -s it, it would pick a generic subregion from Crunchyroll (specified inside the code) and then set the tag in the two-letter format (which in most cases would, I guess, be the same, except maybe for the few dialects you mentioned). Thank you!

bytedream commented 8 months ago

I added the --language-tagging flag. With it, you can force the output file to use only IETF tagging (--language-tagging ietf). I also added the behavior you suggested: for example, if you choose ja as the audio locale, it will internally resolve to ja-JP, but the tag in the output file will be ja.
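The behavior described in this comment can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the project's actual code: the `resolve` function, its `force_ietf` parameter (standing in for --language-tagging ietf) and the `DEFAULT_LOCALE` table are hypothetical, and only the ja to ja-JP mapping is confirmed in the thread.

```python
# Hypothetical default regions; only ja -> ja-JP is confirmed in the thread.
DEFAULT_LOCALE = {"ja": "ja-JP", "it": "it-IT", "en": "en-US"}


def resolve(requested: str, force_ietf: bool = False) -> tuple[str, str]:
    """Return (locale used internally, tag written to the output file)."""
    short = requested.split("-", 1)[0].lower()
    if "-" in requested:
        # Full locale requested: keep it unless short tagging is forced.
        internal = requested
        tag = short if force_ietf else requested
    else:
        # Short tag requested: resolve to a default region, but tag with the short form.
        internal = DEFAULT_LOCALE[short]  # raises KeyError for unknown short tags
        tag = short
    return internal, tag


print(resolve("ja"))                      # ('ja-JP', 'ja')
print(resolve("it-IT"))                   # ('it-IT', 'it-IT')
print(resolve("it-IT", force_ietf=True))  # ('it-IT', 'it')
```

The sketch keeps the internal locale and the output tag as separate values, so the download can still target a specific Crunchyroll locale while the file carries the tag the media server understands.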