Closed Jorman closed 4 years ago
What do you mean by language file? How do radarr/sonarr know which language it is if it is not stated in the file? I use https://github.com/mdhiggins/sickbeard_mp4_automator to clean up my files, and it has an audio-default-language option; maybe that could be a solution? There is also https://github.com/HaveAGitGat/Tdarr, but it's fairly new and I haven't tested it yet. Be warned, however, that changing/encoding/remuxing video files will break finding subtitles by hash. I got around it by generating a hash file before the conversion, but it requires modifying bazarr and the converter script.
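For anyone wondering what "generating a hash file before the conversion" could look like, here is a minimal sketch. It assumes the OpenSubtitles-style hash (file size plus the 64-bit sums of the first and last 64 KiB); whether your providers use exactly this hash is an assumption, and the `.hash` sidecar name and both function names are made up for illustration.

```python
import os
import struct

CHUNK = 65536  # 64 KiB read from the start and from the end of the file


def opensubtitles_hash(path):
    """Return the OpenSubtitles-style hash of a file as 16 hex characters."""
    size = os.path.getsize(path)
    if size < 2 * CHUNK:
        raise ValueError("file too small for this hash")
    total = size
    with open(path, "rb") as f:
        for offset in (0, size - CHUNK):
            f.seek(offset)
            data = f.read(CHUNK)
            # Sum the chunk as little-endian unsigned 64-bit integers.
            total += sum(struct.unpack("<%dQ" % (CHUNK // 8), data))
    # Keep only the low 64 bits, as the hash is defined modulo 2**64.
    return "%016x" % (total & 0xFFFFFFFFFFFFFFFF)


def save_hash_sidecar(video_path):
    """Write the hash next to the video (hypothetical '.hash' sidecar file)."""
    sidecar = video_path + ".hash"
    with open(sidecar, "w") as f:
        f.write(opensubtitles_hash(video_path))
    return sidecar
```

Run `save_hash_sidecar()` before the remux; the part that makes bazarr read the sidecar instead of rehashing the converted file is not shown here.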
Hi
We have to analyze the problem from the beginning:
The media files that sonarr/radarr download come in various containers; to keep it simple, let's say avi and mkv. An mkv can have more than 2 audio tracks, and they can be labeled, like "english audio" and so on. Avi, if I remember well, can have only 2 audio tracks and they cannot be labeled.
So it happened to me that sonarr downloaded a show whose audio was not tagged/labeled correctly, so when I run (or when the script runs) this:
$ /opt/subsync/bin/subsync --cli --verbose 3 --window-size 300 --max-point-dist 1 sync --sub "/data/Multimedia/Serie Tv/America's Got Talent- The Champions/America's Got Talent - The Champions - 02x05 - The Champions Semi Finals.en.srt" --ref "/data/Multimedia/Serie Tv/America's Got Talent- The Champions/America's Got Talent - The Champions - 02x05 - The Champions Semi Finals.mkv" --out "/data/Multimedia/Serie Tv/America's Got Talent- The Champions/prova.srt" --effort 1
The error is:
[*] starting synchronization /data/Multimedia/Serie Tv/America's Got Talent- The Champions/America's Got Talent - The Champions - 02x05 - The Champions Semi Finals.en.srt
[+] sub: /data/Multimedia/Serie Tv/America's Got Talent- The Champions/America's Got Talent - The Champions - 02x05 - The Champions Semi Finals.en.srt:0/1, type=subtitle/text, lang=eng
[+] ref: /data/Multimedia/Serie Tv/America's Got Talent- The Champions/America's Got Talent - The Champions - 02x05 - The Champions Semi Finals.mkv:1/2, type=audio, fps=23.976023976023978
[+] out: /data/Multimedia/Serie Tv/America's Got Talent- The Champions/prova.srt
[!] select reference language
This is because subsync can't read the audio label/tag, so it doesn't know what language the audio is in, and I need to modify the command like this:
$ /opt/subsync/bin/subsync --cli --verbose 3 --window-size 300 --max-point-dist 1 sync --sub "/data/Multimedia/Serie Tv/America's Got Talent- The Champions/America's Got Talent - The Champions - 02x05 - The Champions Semi Finals.en.srt" --ref "/data/Multimedia/Serie Tv/America's Got Talent- The Champions/America's Got Talent - The Champions - 02x05 - The Champions Semi Finals.mkv" --ref-lang eng --out "/data/Multimedia/Serie Tv/America's Got Talent- The Champions/prova.srt" --effort 1
This way I tell subsync that the audio is in English.
Maybe a better way exists, but I have this idea: when sonarr/radarr grab and import a file, they know what language it is. If possible, the best would be to take that info and pass it to subsync, obviously only when both languages, srt and audio, are the same. It is possible to search for subs in a different language, but it makes no sense to sync them.
I hope I have explained this particular problem better.
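A minimal sketch of how a wrapper script could add `--ref-lang` only when a language is known. The function name `build_subsync_command` and its parameters are hypothetical; the flags and fixed values are just the ones from the commands above.

```python
def build_subsync_command(sub_path, ref_path, out_path, ref_lang=None,
                          subsync_bin="/opt/subsync/bin/subsync"):
    """Return the subsync argument list; add --ref-lang only when a language is given."""
    cmd = [
        subsync_bin, "--cli", "--verbose", "3",
        "--window-size", "300", "--max-point-dist", "1",
        "sync", "--sub", sub_path, "--ref", ref_path,
    ]
    if ref_lang:
        # Only needed when the audio track carries no language tag.
        cmd += ["--ref-lang", ref_lang]
    cmd += ["--out", out_path, "--effort", "1"]
    return cmd
```

The list can be handed to `subprocess.run(cmd)` as is, so paths with spaces or apostrophes need no extra quoting.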
I get what you mean; however, I don't think sonarr/radarr pass language info through the API, so it would be tough to get. Also, we would have to know the sonarr/radarr id of the file to get it, and currently you can't pass it from bazarr.
What about a setting in the ini file to set a default language when no tag is found by subsync? It would be much simpler to implement and would cover most of the cases.
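The ini fallback could be sketched roughly like this. The section name `subsync` and the key `default_ref_lang` are hypothetical, not existing settings of the script, and `resolve_ref_lang` is an illustrative name.

```python
import configparser


def resolve_ref_lang(detected_lang, ini_text):
    """Return detected_lang if set, otherwise the default from the ini text (or None)."""
    if detected_lang:
        return detected_lang
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # "subsync" and "default_ref_lang" are made-up names for this sketch.
    default = parser.get("subsync", "default_ref_lang", fallback="").strip()
    return default or None
```

A `None` result would mean "do not pass `--ref-lang` at all" and let subsync rely on the tag in the file.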
> obviously only when both languages, srt and audio, are the same
That is not necessary; subsync translates the audio if needed:
SubSync is listening to the audio track of your movie, using speech recognition engine CMUSphinx to generate text transcription. It is then translated word-by-word using dictionary (for subtitles of different language). Next words are linked with similar words in your subtitles, creating synchronization points. This points are used to fix subtitles time codes.
I don't know if it is possible to get it directly, but here https://github.com/Sonarr/Sonarr/wiki/Release there's the language specification; maybe by crossing that information with the show name, id and so on it is possible to make it work, but I don't know exactly.
A default language can be a workaround only when the end user always searches for one kind of subtitle.
The best would be to get it from sonarr/radarr. Maybe I can try to run some tests; I have to figure out where to start, but maybe it is possible to write a one-line command to ask sonarr/radarr for the information, because for the Release API command you need to pass episodeId (int)
so you first need to go through the https://github.com/Sonarr/Sonarr/wiki/EpisodeFile API, and there you need seriesId (int)
in order to get a list of all episodes in that series; then you have to find the episodeId by searching for the filename, and so on
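The "search the filename" step could look like the sketch below. It works on an already decoded JSON list and makes no API call; the field names `path` and `id` are assumptions about the EpisodeFile response, and the sample records are made up.

```python
import os


def find_episode_file(episode_files, filename):
    """Return the first record whose path ends with the given file name, else None."""
    wanted = os.path.basename(filename)
    for record in episode_files:
        # "path" is an assumed field name; adjust it to the real response.
        if os.path.basename(record.get("path", "")) == wanted:
            return record
    return None


# Made-up sample data standing in for the decoded API response.
sample = [
    {"id": 11, "seriesId": 3, "path": "/tv/Show/Show - 02x04.mkv"},
    {"id": 12, "seriesId": 3, "path": "/tv/Show/Show - 02x05.mkv"},
]
match = find_episode_file(sample, "/data/other/mount/Show - 02x05.mkv")
```

Matching on the base name only is a deliberate choice here, since bazarr and sonarr may see the same file under different mount points.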
What do you think?
I found a solution, but I don't know how to make it work :)
Basically, for Radarr/Sonarr V3, Bazarr already has this information: inside Movies or Series you can already see the language in the "Audio Language" column. I'm looking inside get_series.py and get_subtitle.py to see if I can find a way to add this to the post-processing phase.
Maybe you are more skilled than me
Ok, there's a solution, see here
https://github.com/morpheus65535/bazarr/pull/833
So with this integration it is possible to pass the audio language:
The post-processing command becomes python3 /opt/SubSyncStarter/SubSyncStarter.py "{{episode}}" "{{subtitles}}" "{{subtitles_language_code2}}" "{{subtitles_language_code3}}" "{{episode_language_code3}}" 0
and SubSyncStarter.py
becomes
audio_code3 = sys.argv[5]
[...]
command = '/usr/bin/nice -n 15 ' + location_subsync + ' --cli --verbose ' + loglevel_subsync + ' --logfile ' + '"' + logfile_subsync + '"' + ' --window-size ' + window_size + ' --max-point-dist ' + max_point_dist + ' sync --sub ' + '"' + sub_file + '"' + ' --ref ' + '"' + reference_file + '"' + ' --ref-lang ' + audio_code3 + ' --out ' + '"' + sub_file + '"' + ' --effort ' + effort + ' --overwrite'
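As a side note, the same idea can be written without hand-made quoting. This is only a sketch, not the actual SubSyncStarter.py: the class and function names are hypothetical, the mapping of the first four arguments is my assumption from the order of the post-processing command (the thread only confirms `audio_code3 = sys.argv[5]`), and the logging, window-size and effort options are left out for brevity.

```python
import shlex
from dataclasses import dataclass


@dataclass
class PostProcessingArgs:
    reference_file: str  # assumed: {{episode}}
    sub_file: str        # assumed: {{subtitles}}
    sub_code2: str       # assumed: {{subtitles_language_code2}}
    sub_code3: str       # assumed: {{subtitles_language_code3}}
    audio_code3: str     # {{episode_language_code3}}, i.e. sys.argv[5]


def parse_args(argv):
    """Map argv[1..5] to named fields; later arguments are ignored here."""
    if len(argv) < 6:
        raise SystemExit("expected at least 5 arguments")
    return PostProcessingArgs(*argv[1:6])


def build_command(args, location_subsync="/opt/subsync/bin/subsync"):
    """Return a shell-safe command string; --ref-lang is added only if known."""
    parts = ["/usr/bin/nice", "-n", "15", location_subsync, "--cli",
             "sync", "--sub", args.sub_file, "--ref", args.reference_file]
    if args.audio_code3:
        parts += ["--ref-lang", args.audio_code3]
    parts += ["--out", args.sub_file, "--overwrite"]
    # shlex.join quotes every part, so apostrophes in show names are safe.
    return shlex.join(parts)
```

`shlex.join` needs Python 3.8 or newer; on older versions the list can be passed straight to `subprocess.run` instead.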
That's great, I will update the script once it is merged into bazarr.
Yep! What about making a pull request with your modifications? Then, once it is merged, your script will work without any changes and will always keep working.
Added audio language; I also rewrote the script to use the bazarr API to blacklist subtitles. I created a pull request in bazarr to add the necessary variables to custom post-processing.
Sometimes the audio track of a file doesn't have any "definition"; I mean it is not tagged in any way as English or another language. When this happens, a new parameter has to be used:
subsync --cli --verbose 3 --window-size 300 --max-point-dist 1 sync --sub file.eng.srt --ref file.mkv **--ref-lang eng** --out aaa.srt --effort 1
So --ref-lang eng makes the difference.
Do you know if there's a way to take the language file from radarr/sonarr and pass this argument to subsync? Or maybe you know another way to do this?
J