Closed sdfg2 closed 2 years ago
allow-language-relax is probably what you want
Not really, no. If I've got a file with a few different dubs, I don't want all of them. I just want the default one, which should hopefully be the native one for the video.
Solved with 501fcf3
Actually, it's only partly complete, I spoke too soon.
If I have set languages = eng and the file contains audio jpn +default and eng -default, jpn will get deleted. I'd like to keep it because it's +default.
I know you're busy for the next few days. The more I think about this, the more I'm thinking it might just be easier to let sma do downmixing, not touching anything else, and then run a post process script to fetch subtitles, delete unwanted audio/subs etc, but then sma is so so close to being able to do it all. I haven't programmed in over a decade, I'm an algorithm/logic/design type person now, or I would try and help.
c9b6aca8ff6ccb30ee429b066720b5b23af5ca74
Put that together real quick, see if that works
Added a force-default option for audio and subtitle that will bypass the usual language/disposition/unique checks if a stream is marked as default
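For readers following along, here is a rough sketch of the idea behind such an option. It is not the code from the commit: the `Stream` class and `keep_stream` function are hypothetical names made up for illustration, and only the language check is modelled.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """Hypothetical stand-in for a probed audio or subtitle stream."""
    index: int
    language: str
    default: bool = False


def keep_stream(stream, languages, force_default=False):
    """Return True if the stream survives language filtering."""
    if force_default and stream.default:
        # A default-flagged stream bypasses the language whitelist.
        return True
    return stream.language in languages


streams = [Stream(1, "jpn", default=True), Stream(2, "eng")]
kept = [s.index for s in streams if keep_stream(s, {"eng"}, force_default=True)]
print(kept)
```

With `force_default=False` the same call would drop the jpn stream, which is the behaviour reported earlier in this thread.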
Sorry, ended up being busier than I thought the past few days myself!
sdfg@heracles ~/scratch $ sma -i Ghost\ in\ the\ Shell\ 2.0\ \(2008\)\ \[imdb-tt1260502\]\[Bluray-1080p\]\[DTS-ES\ 6.1\]\[x264\]-MOOVEE.mkv
Manual processor started.
Python 64-bit 3.10.6 (main, Aug 3 2022, 17:39:45) [GCC 12.1.1 20220730].
Guessit version: 3.4.3.
/usr/bin/python3
Loading config file /home/askesis/sickbeard_mp4_automator/config/autoProcess.ini.
Processing file Ghost in the Shell 2.0 (2008) [imdb-tt1260502][Bluray-1080p][DTS-ES 6.1][x264]-MOOVEE.mkv
Input Data
{
"format": "matroska,webm",
"format-fullname": "Matroska / WebM",
"video": {
"index": 0,
"codec": "h264",
"bitrate": 9756363,
"pix_fmt": "yuv420p",
"profile": "high",
"fps": 23.976023976023978,
"framedata": {
"pix_fmt": "yuv420p",
"side_data_list": [
{
"side_data_type": "H.26[45] User Data Unregistered SEI message"
}
]
},
"dimensions": "1920x1080",
"level": 4.1,
"field_order": "progressive"
},
"audio": [
{
"index": 1,
"codec": "dts",
"bitrate": 768000,
"channels": 7,
"samplerate": 48000,
"language": "jpn",
"disposition": "+default-dub-original-comment-lyrics-karaoke-forced-hearing_impaired-visual_impaired-captions"
},
{
"index": 2,
"codec": "dts",
"bitrate": 768000,
"channels": 7,
"samplerate": 48000,
"language": "eng",
"disposition": "-default-dub-original-comment-lyrics-karaoke-forced-hearing_impaired-visual_impaired-captions"
}
],
"subtitle": [
{
"index": 3,
"codec": "subrip",
"disposition": "+default-dub-original-comment-lyrics-karaoke-forced-hearing_impaired-visual_impaired-captions",
"language": "eng"
}
],
"attachment": []
}
Reading video stream.
Video codec detected: h264.
Pix Fmt: yuv420p.
Profile: high.
Video codec parameters None.
Creating hevc_nvenc video stream from source stream 0.
Reading audio streams.
The following stream indexes have been identified as being copies: [] [stream-codec-combinations].
Audio detected for stream 1 - dts jpn 7 channel.
Unable to generate options, unexpected exception occurred.
Traceback (most recent call last):
File "/home/askesis/sickbeard_mp4_automator/resources/mediaprocessor.py", line 123, in process
options, preopts, postopts, ripsubopts, downloaded_subs = self.generateOptions(inputfile, info=info, original=original, tagdata=tagdata)
File "/home/askesis/sickbeard_mp4_automator/resources/mediaprocessor.py", line 858, in generateOptions
self.log.debug("Audio stream %s is flagged as default, forcing inclusion [Audio.force-default]." % (s.index))
UnboundLocalError: local variable 's' referenced before assignment
There was an error processing file Ghost in the Shell 2.0 (2008) [imdb-tt1260502][Bluray-1080p][DTS-ES 6.1][x264]-MOOVEE.mkv, no output data received
e9f15183f89b90f083226f03c5cf953da52638d4
That should fix that
Ok I didn't forget about you
https://github.com/mdhiggins/sickbeard_mp4_automator/tree/ffsubsync
This branch is the one I'm working on, still testing the ffsubsync stuff but I also included a force option, manually filtering subliminal results for forced subtitles if the option is enabled
Oh I don't expect anything instantly, please don't think that! I very much appreciate any effort you put into this. I've also been busy, won't be able to look at anything until about Friday now.
I've been finding that a number of my media files have multiple different language tracks all labelled +default, which is infuriating. I'm actually playing with the idea of fetching the 'original language' data from the TMDB API instead of relying only on what is in the file itself. I also realise it's probably getting way out of scope for this project XD
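As a minimal sketch of that idea, assuming the TMDB v3 movie details endpoint and its `original_language` field (a two-letter code): the helper names, the tiny language table, and the sample payload below are all illustrative, not part of sma.

```python
import json
import urllib.parse
import urllib.request

# Tiny illustrative ISO 639-1 -> ISO 639-2 table; a real script needs a full one.
ISO_639_1_TO_2 = {"ja": "jpn", "en": "eng", "fr": "fre", "it": "ita", "no": "nor"}


def original_language(payload):
    """Return the 3-letter original language from a details payload, or None."""
    return ISO_639_1_TO_2.get(payload.get("original_language"))


def fetch_movie_details(tmdb_id, api_key):
    """Fetch movie details from TMDB. Network call, not exercised below."""
    query = urllib.parse.urlencode({"api_key": api_key})
    url = f"https://api.themoviedb.org/3/movie/{tmdb_id}?{query}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


# Hand-written sample payload, not real API output.
sample = {"id": 0, "original_language": "ja"}
print(original_language(sample))
```

The 3-letter result can then be compared against the language tags in the file, regardless of which tracks happen to be flagged +default.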
Hm that's actually probably easily doable, I'll look into things
Either way the above branch at least gives you the option for forced subtitles through subliminal so give that a try
Just checking in again to see if you got a chance to test these changes
302fa7ec388036b91a3cae1688a867e2e76aea36
Tweak the forced option so you can download both forced and standard
Sorry, I was on holiday! Just catching up now.
First attempt:
sdfg@heracles ~/scratch $ sma -i Ghost\ in\ the\ Shell\ 2.0\ \(2008\)\ \[imdb-tt1260502\]\[Bluray-1080p\]\[DTS-ES\ 6.1\]\[x264\]-MOOVEE.mkv
Manual processor started.
Python 64-bit 3.10.6 (main, Aug 3 2022, 17:39:45) [GCC 12.1.1 20220730].
Guessit version: 3.4.3.
/usr/bin/python3
Loading config file /home/sdfg/sickbeard_mp4_automator/config/autoProcess.ini.
Processing file Ghost in the Shell 2.0 (2008) [imdb-tt1260502][Bluray-1080p][DTS-ES 6.1][x264]-MOOVEE.mkv
Unable to generate options, unexpected exception occurred.
Traceback (most recent call last):
File "/home/sdfg/sickbeard_mp4_automator/resources/mediaprocessor.py", line 134, in process
options, preopts, postopts, ripsubopts, downloaded_subs = self.generateOptions(inputfile, info=info, original=original, tagdata=tagdata)
File "/home/sdfg/sickbeard_mp4_automator/resources/mediaprocessor.py", line 662, in generateOptions
awl, swl = self.safeLanguage(info, tagdata.tmdbid, tagdata.mediatype)
AttributeError: 'NoneType' object has no attribute 'tmdbid'
There was an error processing file Ghost in the Shell 2.0 (2008) [imdb-tt1260502][Bluray-1080p][DTS-ES 6.1][x264]-MOOVEE.mkv, no output data received
✓ 0 [843ms]
Figured out it was because tag = False, but might need a catch for that. Continuing tests...
EDIT: I realise why, looking at the code. Need a way to fetch original-language from TMDB without tagging the file with the rest of the metadata.
How do I enable debug mode? I see a lot of self.log.debug in the code but don't see how to enable it to check stuff.
I've spent a lot of time adding the new config options and trying to figure out the combinations to use, and I have a few suggestions.
Separate out the languages from the technical codec stuff. Streamline the language/disposition options.
[Audio.Tracks]
languages = original,jpn,fre,ita,default,nor
dispositions = default,dub,comment
maximum-audio-tracks-per-language = 1
maximum-audio-tracks-total = 3
at-least-one-audio-track = True
languages and dispositions are ordered. Then you can do a loop like this (it's been a very long time since I've done any pseudocode, please be kind!)
for each language in languages
    current-language-track-count = 0
    for each disposition in dispositions
        if exists language.disposition
            add language.disposition to track list
            current-language-track-count++
            current-total-track-count++
        if current-language-track-count == maximum-audio-tracks-per-language
            break
        if current-total-track-count == maximum-audio-tracks-total
            stop (leave both loops)
if current-total-track-count == 0 && at-least-one-audio-track
    add first language.disposition in original file
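The loop above could be sketched in Python roughly as follows. This is only an illustration of the proposal, not anything in sma: `Track` and `select_tracks` are hypothetical names, and the special `original` and `default` entries in `languages` are not handled here.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """Hypothetical audio track: stream index, language, set of disposition flags."""
    index: int
    language: str
    dispositions: set = field(default_factory=set)


def select_tracks(tracks, languages, dispositions,
                  per_language=1, total=3, at_least_one=True):
    """Pick tracks in language order, then disposition order, within both caps."""
    selected = []
    for language in languages:
        language_count = 0  # reset for every language
        for disposition in dispositions:
            for track in tracks:
                if len(selected) >= total:
                    return selected
                if language_count >= per_language:
                    break
                if (track.language == language
                        and disposition in track.dispositions
                        and track not in selected):
                    selected.append(track)
                    language_count += 1
    if not selected and at_least_one and tracks:
        # Nothing matched: fall back to the first track in the source file.
        selected.append(tracks[0])
    return selected


tracks = [
    Track(1, "jpn", {"default"}),
    Track(2, "eng", {"dub"}),
    Track(3, "eng", {"comment"}),
]
chosen = select_tracks(tracks, ["jpn", "eng"], ["default", "dub", "comment"])
print([t.index for t in chosen])
```

Because both lists are ordered, the priority of each language and disposition is visible directly in the config rather than implied by a combination of flags.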
By my reckoning you could then get rid of these:
[Audio]
include-original-language - covered by languages
first-stream-of-language - covered by maximum-audio-tracks-per-language
allow-language-relax - covered by languages and at-least-one-audio-track
relax-to-default - covered by languages
ignored-dispositions - covered by dispositions
force-default - covered by languages
unique-dispositions - covered
Similar kind of thing for subtitles - technical stuff (burning, embedding, codec etc) in the main bit, then:
[Subtitle.Tracks]
languages = original,jpn,fre,ita,default,nor
dispositions = forced,default,hearing_impaired,comment
maximum-subtitle-tracks-per-language = 2
maximum-subtitle-tracks-total = 2
ignore-embedded-subs = False
Use a similar loop as above (except including a search in subliminal for each combination as well)
for each language in languages
    current-language-track-count = 0
    for each disposition in dispositions
        if exists language.disposition && !ignore-embedded-subs
            add language.disposition to track list
            current-language-track-count++
            current-total-track-count++
        else
            subliminal search for language.disposition
            if downloaded language.disposition
                add language.disposition to track list
                current-language-track-count++
                current-total-track-count++
        if current-language-track-count == maximum-subtitle-tracks-per-language
            break
        if current-total-track-count == maximum-subtitle-tracks-total
            stop (leave both loops)
Then you can get rid of
[Subtitle]
default-language
include-original-language
first-stream-of-language
ignored-dispositions
force-default
unique-dispositions
[Subtitle.Subliminal]
download-forced-subs
include-hearing-impaired-subs
I just think there are a lot of true/false options that can be difficult to follow when in combination with each other (forced,default,relax etc). With these changes a user knows at a glance what the priorities and limits are without having to create a semantic tree to figure it out!
https://github.com/mdhiggins/sickbeard_mp4_automator/commit/8fb1022d5945c7f0ec26d083737d5669d0ed9f75 https://github.com/mdhiggins/sickbeard_mp4_automator/commit/54046aa3702271da3fed0268d48c506872dce6b1 https://github.com/mdhiggins/sickbeard_mp4_automator/commit/2046664da5c48371ba52e84f6d0d6f5ed77a62c2
Fixes the error from your last post
Reworking all the audio and subtitle options would be a big undertaking, challenging to maintain backwards compatibility, and I think personally a disposition whitelist approach is not a great one.
Disposition and language data is very often lacking and unreliable, inconsistently implemented across different containers (quick example is that mp4 containers don't even store a 'forced' flag), and not really consistent with what I've found most users over the years are looking to do with this automation step. Lots of media will have no positive disposition flags or will just inappropriately flag all dispositions as default. From my experience most users are looking to preserve what is there and only explicitly eliminate what they know they don't want when taking an approach to media automation. I do think I can probably eliminate some of the options added by your recent feature requests (relax to default being the first one to drop) to make things clearer but I probably would not look to rewrite the whole settings approach unless there was a compelling reason.
Debug logging is covered in the wiki
Apologies about the debugging, I could have sworn I checked the wiki. My bad!
Sure, if the compatibility and work involved is too much I completely get that. But garbage in, garbage out will happen no matter the processing method. If anything, that's at the crux of what I've been asking for (without realising it) - external validation of what was originally intended (tmdb language lookup, forced subs). I was just trying to offer a more general, agnostic approach that can handle what is and isn't there in any given source media.
There's one thing I can't figure out: whether it's possible to only fetch full subtitles if there is no audio stream in the matching language. I.e. I've filtered all the audio down to just the original audio as TMDB provides it; if it isn't English, I want full English subtitles. But if the original audio is English, then I only want to check for forced subtitles.
Edit: Bazarr has this option, called "Exclude Audio" (terrible name for it).
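To pin down the rule being asked for, here is a small sketch. The function names and the dict-based subtitle records are hypothetical and exist only to make the logic concrete; they are not sma or Bazarr APIs.

```python
def wanted_subtitle_kind(audio_languages, subtitle_language):
    """Return 'forced' if audio already exists in the subtitle language, else 'full'."""
    if subtitle_language in audio_languages:
        return "forced"
    return "full"


def filter_subtitles(subtitles, audio_languages, subtitle_language):
    """Keep only subtitles in the wanted language and of the wanted kind."""
    want_forced = wanted_subtitle_kind(audio_languages, subtitle_language) == "forced"
    return [
        s for s in subtitles
        if s["language"] == subtitle_language and s["forced"] == want_forced
    ]


subs = [
    {"language": "eng", "forced": True},
    {"language": "eng", "forced": False},
]
print(filter_subtitles(subs, ["eng"], "eng"))
print(filter_subtitles(subs, ["jpn"], "eng"))
```

The same decision applies whether the subtitle comes from the source file or from a download, which is why it sits before either step.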
https://github.com/mdhiggins/sickbeard_mp4_automator/commit/d3f8b7a01a4e28f84f4d09f732bccd21732b908e
Take a look at that
Removed some legacy options which I felt weren't needed
Added a new dynamic-download option which will set subtitle downloading preferences based on original language when compared against your set default language
Also included 'original-language' as a valid parameter for the sorting function
I think I mis-spoke when I said 'fetch'. I didn't specifically mean 'download' but 'obtain', whether or not that is from the existing file or from an external source. My test media (eng audio) already has eng-forced and eng subtitles embedded. Both get added to the resulting file when I only want the forced ones.
Killing me
I'm going to say this needs to move to the custom functions then; it's too niche of a request
I went ahead and included the tagdata object as a parameter in the skipStream and validation custom methods (though tagdata will not always be present on the validation call depending on what script is calling it) so that you can have easy access to the original language
Ah, I thought there was a flow: what subtitles do I need -> what subtitles do I have -> what subtitles do I need to download. I was just suggesting moving the dynamic-download logic from "what subtitles do I need to download" to "what subtitles do I need".
But sure, having the tags there is super useful. It should be trivial to add the logic necessary to an external script now. Did you just remove dynamic-download entirely, or am I misreading the diffs?
Yeah just nuked it entirely. It was half baked anyway. Similar functionality should be doable via custom functions
Sure, just more inefficiently. Now I'm going to have to ensure that full subtitles are always present, and then do a removal pass if they don't match. At least with dynamic-download sma wouldn't download extra subs it definitely knew I didn't want.
Before:
file (english audio) -> known 'native' audio track -> post process to remove full subs
Now
file (english audio) -> unknown audio track -> get full subs -> post process to remove full subs
You can change settings on the fly exactly the same way the proposed dynamic download feature was implemented in your custom function
The only thing it was doing was setting which type of subtitle to download
self.settings.downloadforcedsubs = (self.settings.adl == original_language)
self.settings.downloadsubs = (self.settings.adl != original_language)
which from any of the custom functions can be set after checking if tagdata is available
mp.settings.downloadforcedsubs = (mp.settings.adl == tagdata.original_language)
mp.settings.downloadsubs = (mp.settings.adl != tagdata.original_language)
Plus you can sweep the info object and see if you want to disable downloading entirely because it has embedded subs that fit your need
Yeah, I misunderstood how you were doing custom functions until I started going through the wiki and the examples - I thought you were just passing environment variables to external scripts. I was expecting my script to get the file name or other environmental data, then for me to ffprobe, parse that data, and then ffmpeg myself to remove unnecessary subtitles.
I've got very little experience with python, and that was ten years ago, so I haven't a clue how to write a custom function for this. My best option is to handle it in the external script I need to write, given I have to pass the resulting file to another program afterwards anyway.
Thanks for your help and patience. I've got it working just how I want with 6 lines of bash :-)
Should share it in case others have the same issue
I threw this together as a quick pass to get you started if you ever wanted a more integrated solution
def skipStream(mp, stream, info, path, tagdata):
    mp.log.info("Initiating custom stream skip check method.")
    if tagdata:
        # Foreign-language title: original language differs from the default audio language
        # and the file has a usable audio track in that original language.
        foreign_language = tagdata.original_language != mp.settings.adl and any(a for a in info.audio if a.metadata.get('language') == tagdata.original_language and mp.validDisposition(a, mp.settings.ignored_audio_dispositions))
        # Only download a subtitle type if no suitable embedded one already exists.
        mp.settings.downloadsubs = foreign_language and not any(s for s in info.subtitle if not s.disposition.get('forced') and not s.disposition.get('comment') and mp.validDisposition(s, mp.settings.ignored_subtitle_dispositions))
        mp.settings.downloadforcedsubs = not foreign_language and not any(s for s in info.subtitle if s.disposition.get('forced') and mp.validDisposition(s, mp.settings.ignored_subtitle_dispositions))
        # Returning a truthy value skips the stream.
        if foreign_language and stream.type == "subtitle":
            return stream.disposition.get("forced")
        elif not foreign_language and stream.type == "subtitle":
            return not stream.disposition.get("forced")
    return False
Should share it in case others have the same issue
Yeah, I just wanted to do more testing on it. There are probably edge cases where it won't work.
For anyone reading this: DO NOT USE IT. It's for guidance to write your own only, and has hard coded preferences for me. NEVER USE RANDOM SCRIPTS YOU FIND ON THE INTERNET UNLESS YOU UNDERSTAND THEM.
BASETMPDIR="/store/.transcode/video"
# Optional ID arguments for sma, kept as an array so each one is passed as its own word
ID=()
if [ "$radarr_eventtype" = "Test" ] || [ "$sonarr_eventtype" = "Test" ]; then
    exit 0
elif [ -n "$radarr_moviefile_path" ]; then
    INPUTFILE=$radarr_moviefile_path
    ID=(-tmdb "$radarr_movie_tmdbid")
elif [ -n "$sonarr_episodefile_path" ]; then
    INPUTFILE=$sonarr_episodefile_path
    ID=(-tvdb "$sonarr_series_tvdbid")
elif [ -n "$1" ]; then
    if [ "${1::1}" != "/" ]; then
        INPUTFILE="$(pwd)/$1"
    else
        INPUTFILE=$1
    fi
fi
mkdir -p "$BASETMPDIR"
fullfile=$(basename "${INPUTFILE}")
filename=$(basename "${INPUTFILE%.*}")
filetype=${INPUTFILE##*.}
tmpdir="$BASETMPDIR/$filename"
mkdir -p "$tmpdir"
cp "$INPUTFILE" "$tmpdir"
infile="$tmpdir/$fullfile"
smafile="$tmpdir/$filename.mkv"
subsfile="$tmpdir/$filename.mkv.subs"
normfile="$tmpdir/$filename.mkv.norm"
# sma
if ([ "$filetype" = "mkv" ] && [ ! -f "$smafile.original" ]) || ([ "$filetype" != "mkv" ] && [ ! -f "$smafile" ]); then
    /store/.bin/sma/manual.py "${ID[@]}" -a -i "$infile" || rm -f "$smafile" "$infile" "$smafile.original"
fi
# post-process sma
if [ ! -f "$subsfile" ]; then
    eval $(ffprobe -v 0 -show_entries stream=index:stream_tags=language,title -select_streams a -of flat=s=_ "$smafile")
    audio_lang=$streams_stream_0_tags_language
    eval $(ffprobe -v 0 -show_entries stream=index:stream_tags=language,title -select_streams s:0 -of flat=s=_ "$smafile")
    if [ "$audio_lang" = "eng" ]; then
        if [ "$streams_stream_0_tags_title" = "Forced" ]; then
            ffmpeg -v 0 -err_detect ignore_err -fflags +igndts -f matroska -i "$smafile" -c:v copy -c:a copy -c:s copy -map 0:v:0 -map 0:a:0 -map 0:s:0 -f matroska "$subsfile" || rm -f "$subsfile"
        else
            ffmpeg -v 0 -err_detect ignore_err -fflags +igndts -f matroska -i "$smafile" -c:v copy -c:a copy -c:s copy -map 0:v:0 -map 0:a:0 -f matroska "$subsfile" || rm -f "$subsfile"
        fi
    else
        ffmpeg -v 0 -err_detect ignore_err -fflags +igndts -f matroska -i "$smafile" -c:v copy -c:a copy -c:s copy -map 0:v:0 -map 0:a:0 -map 0:s:0 -f matroska "$subsfile" || rm -f "$subsfile"
    fi
fi
# ffmpeg-normalize
if [ ! -f "$normfile" ]; then
    ffmpeg-normalize "$subsfile" -c:a ac3 -pr -nt rms -t -23 -f -of "$tmpdir" -ofmt matroska -ext norm || rm -f "$normfile"
fi
mv -f "$normfile" "$INPUTFILE" || exit 1
rm -r "$tmpdir"
Tweak the forced option so you can download both forced and standard
Now I've got something (vaguely) production ready, I've been testing it more thoroughly on edge cases. I've noticed that the forced download doesn't seem to work. I'm not sure if you'd rather open a new issue for that to keep this clear.
I've attached a very cut down file that I've been using to test. (Just remove .csv, seems github doesn't like mkv). Here is one of a couple of 'foreign parts only' (forced) subtitles that are available. sma doesn't find any forced subtitles.
Is your feature request related to a problem? Please describe.
Foreign language films that don't have eng audio streams get stripped of all audio.
Describe the solution you'd like
An option to always keep the default disposition. The ideal way would be a keep-dispositions option analogous to the ignore-dispositions option.