emericg / OpenSubtitlesDownload

Automatically find and download the right subtitles for your favorite videos!
https://emeric.io/OpenSubtitlesDownload
GNU General Public License v3.0
588 stars 64 forks source link

More languages at once #3

Closed felagund closed 12 years ago

felagund commented 12 years ago

This allows to download more languages at once. It already incorporates the two branches I requested to be pulled before (it did not occur to me before to download two languages at once). Obviously, I am scratching my itch and solving my use case, but this might be useful to someone else. The use case is the ability to understand two languages and wanting to have a choice between them when using subtitles.

It changes SubLanguageID from a string to a list and then iterates that list to download subtitles for each language in the list. By default, the list has only English, so everything works like before, but can be appended to download more files at once. If this is merged, https://github.com/emericg/opensubtitles-download/wiki [Language selection] should be updated as well.

emericg commented 12 years ago

That's a very cool feature to have indeed, but I wonder what the default behaviour should be : 1/ Download the subtitle in every language 2/ Donwload the subtitle in just the first language found (like if I want a french subtitle, but if there is none I'd like to have an english fallback) Any though on that ?

Some other things that could be useful :

felagund commented 12 years ago

For me, 1/ is definitely preferred. Video players I know can change subtitles on the fly so it is easier to decide which subtitles to use in the player itself. Even though I have a preference for English, I sometimes give my videos to some people who prefer other languages, so it is neat to be able to give them these other languages.

I am looking into the other two things, they do not seem to be hard to implement.

emericg commented 12 years ago

I notice something while digging into this, we can actually already search for a subtitle in multiple langage, we just have to set : SubLanguageID = 'eng,fr'. No other modification required.

So why not make that an option ? Let the user decide if he want 1/ of 2/ just by changing the SubLanguageID value ?

And by the way, would you want your name and email to be added to the script header ?

felagund commented 12 years ago

Hm, interesting point. I think I addressed your points from the above. Only one error message is displayed and the selection boc is improved.The language is printed, as well as whether the titles are for the hearing impaired (checkbox appears when they are, otherwise this stays blank) and whether they are more CDs subtitles (when they are, they are not likely to be of interest to the user). I also changed the title of the selection box to be the movie filename - it is useful since the user then knows whtether some of the subtitles matches the movie title exactly (maybe we could download such a subtitle automatically without asking, if available).

If the user chooses to search for multiple languages in one go (ie. 'eng,fre'), everything should work. She can even mix it ('eng,fre','ger'). So this is up to the user how she wants to use the script.

If you merge this, include my name and e-mail please.

felagund commented 12 years ago

I tried to follow your changes to the code, I hope it went well.

emericg commented 12 years ago

What is your name and e-mail by the way ? You should set your email in github so people can send you messages.

felagund commented 12 years ago

I have set it now. I thought that setting it up in git would be enough but it was not. What do you think about the patches anyway? Are the new features worth the increased complexity?

emericg commented 12 years ago

I'm ok with the idea of multiple languages at once, I think this feature will be very useful for a lot of people. I just want to make sure the script is still as simple to use as before.

I'd like to update the integrated documentation with these informations (I'll also update the wiki) :

.# You can also search for subtitles in several languages ​​at once: .# - SubLanguageID = ['eng,fre'] to download the first language available only .# - SubLanguageID = ['eng','fre']' to download all selected languages

So basically, if you choose 1/

If you choose 2/

What I'd like to clarify is the error message. Your code to unify the "not found" messages across different languages is working great, but after some "real life testing", I must say I am pretty annoyed by the fact that if I set the script to english and french, I always (like 90% of the time) have an error message that says : "English subtitle downloaded, french subtitle not found". It kind of kill the "--auto-close" commit. So I propose that we print error message only if no languages have been found.

That being said, can you take a look at this implementation, based on your commits : https://github.com/emericg/opensubtitles-download/commit/004acc3e03b3a3d80611817651c6e7a2e51a492e

Regarding other features :

I don't like the list of hardcoded ISO 639-1 language codes. I think it is best to stay generic in the implementation, because it will be impossible to keep tracks of all the languages supported by opensubtitles.org and there assosiated 2 or 3 letters codes. I see that you did that too correctly print the found / not found language. You can get back the language full name from the XML-RPC call results like that: subtitlesList['data'][subIndex]['LanguageName']

I don't really like the "hearing impaired" and "number of CDs" features, mostly because I don't use them and they cost precious space in the zenity selection windows. Besides they are empty most of the time. There is a lot of useful stuf in subtitle description, like SubFormat, SubBad, SubRating, SubDownloadsCnt, SubFeatured, SubHearingImpaired and SubActualCD. Maybe in the future we can find a way to print more informations about each subtitle file whitout crowding the subtitle selection windows ?

felagund commented 12 years ago

I have looked at your commit and it seems the functionality I added is all there but without my bloat.

Options 1/ and 2/ too seems reasonable for me.

Regarding the error message, that is pretty strange. I actually use 'eng','cze' and it finds both subtitles 90% of the time. Maybe the French use something else besides opensubtitles.org:-). But I see your point, it would get on my nerve if I saw it all the time.

I used the hardcoded OSP 639-1 codes for printing Language for subtitles that did not get downloaded (subtitlesList['data'][subIndex]['LanguageName'] does not help there since it is empty). As this is not needed now, you can get rid of that.

As for other stuff added to zenity: number of CDs matters since if the subtitles are not one CD, one is not likely to want them (I have not seen a video consisting of more than one file for ages). The same goes for Hearing Impaired (id you do hear, you are likely to not want them). I think SubFormat and SubDownloadedCnt are potentialy useful. After all, with no info, user has no idea which subtitles to download. But one can do without this information, that is true as well.

By the way, would not file like something.webm break the script? Line 60 would accept it but line 221 (in your commit) is nor prepared for it. I am too tired now to dig into this deeper though.

Altogether, as far as I am concerned, you can merge this. I would think about adding some more info to zenity when picking a subtitles file but that depends on taste I guess.

Oh, and also: it is as far as I know (I am not a native speaker), it is gramatically incorrect to speak about a "subtitle" - that is only one piece of text that appears for several second on the screen. The file and what we are downloading is called "subtitles" (that is also what Opensubtitles use in their API). It does not matter, but sounds strange.

On a last note, I have heavily modified the script to suit my particular needs, I pushed it to branch (mainly for backup) https://github.com/felagund/opensubtitles-download/tree/make_mkv in case you are interested, but I doubt it. What it does is that it bundles the subtitles with the video creating a matroska file (with peculiar naming, encoding conversion and several more silly things - it also depends on imdb-py and some other packages). But I think it is buggy, although I have laready downloaded subtitles for about 15 movies with the script.

emericg commented 12 years ago

Allright I started merging the patches. Thanks for your work on this issue!

What would be cool is to identify a video using its hash, and if no "perfect" subtitles are found, retry the search with the correct and clean video name. That way a lot of infos (number of CD, number of download, subtitles rating) would be essential to pick a good subtitles, and minimize chances to have text/audio synchronisation problems.

felagund commented 12 years ago

For one video, I got three subtitles, two were 2CD subtitles (i.e. 1/2 and 2/2), but it was only one out of ten or 15 videos I tried the script with, so it is negligible. 3CD subtitles will be even rarer I guess.

Searching with name would be cool, but so far I found subtitles for every videou I tried this script with:-).

Yeah, it looks nicer. However, just by looking at the code and not running it, will not this break on filename.i.like.dots.webm? "".join(moviePath.rsplit('.', 1)[:-1]) should work better I think. But I am no python guru and this is just by looking at the code.

Yeah, take a look at it, but I am not sure whether it is more generally applicable. And I have no idea whether it will work on other's people machines (it has lots of dependencies, I you want I can list them). I finally made it today so that I am happy about it, it simplifies my workflow a lot. Well, I guess I could just let the videos and subtitles where they get downloaded but for some reason I like them ordered and this makes everything a lot quicker. I guess after 500th or 1000th movie, I will have even saved some time:-).

emericg commented 12 years ago

I'm not sure how the script will react when encountering a multiple CD subtitles, I don't think I ever see that, but I'll keep my eyes open if I see something weird happening.

It will work on filename.i.like.dots.webm, because it search for only one dot starting at the right of the filename, that's the trick. However I too am not a python guru ^^

Speaking of feature, I came accross that today : https://github.com/RedRise/lm I think it may interest you.

felagund commented 12 years ago

Ahh, I see rsplit is a nice trick.

Thanks, I have seen it already but now I am more then happy about your script with my modifications, but I was surprised how many pieces of software there are interacting with opensubtitles.org and imdb.com - but Nautilus script is the most useful form for me, so I am happy about that.

I actually found out about ln through this article: http://www.webupd8.org/2012/03/lm-lists-imdb-information-for-your.html They run one more article on software fetching from opensubtitles.org http://www.webupd8.org/2012/02/vlsub-vlc-extension-to-search-and.html - I think they would probably run an article on this nautilus script as well if you wanted more users (if you do not feel like writing a draft of the article, I would be willing to do it).

emericg commented 12 years ago

Yes I read the same article on webudp8 today ^^ Yes it could be great to have some publicity, let's just wait until we release the next version with your changes! I already send the current git version to some of my friends in order to have some feedbacks.

I also updated the wiki, I'll give it another round of cleanup in the next few days.