Kha-kis opened 5 days ago
My first silly thought is to refactor the mi meta so that it's consistent with mediainfo.txt. For instance, I couldn't understand why the language handling here https://github.com/Audionut/Upload-Assistant/commit/ca5b3d773c53f59983d696cd504ff8745214087b seemingly started referring to the 2-character designation (fixed properly here https://github.com/Audionut/Upload-Assistant/commit/b7bfcf1e0c5ed2702c4998efc9d7223a96e84563), so I just referred to the txt instead of the mi meta as the fix. See /forums/topics/1349/posts/26436 at ATH
That would be no small change though.
The 1st solution to the 1st issue seems fine. If there's some other en variant, it can easily be added without worrying about false positives.
I need to spend some more time with the second issue. It seems logical that if the meta is marked original, then it's original, period, and I don't understand why it needs the additional language check when the English check seems to catch that already.
It seems like the first original check is probably what's triggering the issue #51 bug, and rather than adding a gazillion non-English variants, it's probably best to just use is not english_variants.
I have been doing further testing. Reviewing the language codes, I was unable to identify any chance of duplicates when using startswith. https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes
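As a quick illustration (throwaway code on my end, not from the repo), a prefix-collision check over a small sample of ISO 639 codes turns up nothing, which is what makes startswith safe here:

```python
# Illustrative sanity check (not repo code): confirm that no code in this
# sample of ISO 639 codes is a prefix of a *different* code in the sample,
# so startswith() cannot mismatch two unrelated languages.
codes = ["en", "fr", "de", "es", "pt", "ja", "ko", "zh", "cmn", "no", "nb"]
collisions = [(a, b) for a in codes for b in codes if a != b and b.startswith(a)]
print(collisions)  # []
```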
Using that logic, I was able to rewrite the Dubbed and Dual-Audio sections.
Please let me know if this fits your logic stream, and if so, can we commit this?
if meta.get('original_language', '') != 'en':
    eng, orig = False, False
    try:
        for t in mi.get('media', {}).get('track', []):
            if t.get('@type') != "Audio":
                continue
            audio_language = t.get('Language', '')
            # Check for English language track
            if audio_language.startswith("en") and "commentary" not in t.get('Title', '').lower():
                eng = True
            # Check for original language track (non-English) with region tag flexibility
            if not audio_language.startswith("en") and audio_language.startswith(meta['original_language']) and "commentary" not in t.get('Title', '').lower():
                orig = True
            # Catch Chinese / Norwegian variants with region tag flexibility
            variants = ['zh', 'cn', 'cmn', 'no', 'nb']
            if any(audio_language.startswith(var) for var in variants) and any(meta['original_language'].startswith(var) for var in variants):
                orig = True
            # Check for additional, potentially bloated tracks
            if audio_language != meta['original_language'] and not audio_language.startswith("en"):
                # If audio_language is empty, set it to 'und' (undefined)
                audio_language = "und" if audio_language == "" else audio_language
                console.print(f"[bold red]This release has a(n) {audio_language} audio track, and may be considered bloated")
                time.sleep(5)
        print(f"eng: {eng}, orig: {orig}")
        # Determine if the release is Dual-Audio or Dubbed
        if eng and orig:
            dual = "Dual-Audio"
        elif eng and not orig and meta['original_language'] not in ['zxx', 'xx', None] and not meta.get('no_dub', False):
            dual = "Dubbed"
    except Exception:
        console.print(traceback.format_exc())
        pass
That looks good at first glance, but I can't recall which file of mine was triggering the dual-audio bug. @backstab5983 do you have a filename handy that triggers this bug?
Did you forget the english_variants and the other variants you added, @Kha-kis, or are they no longer needed?
They are no longer needed, as I am using if audio_language.startswith("en")
and audio_language.startswith(meta['original_language']).
The only variants needed are Chinese and Norwegian, as they have multiple ISO codes.
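To illustrate why those two languages still need special handling (the values below are hypothetical, not pulled from a real file): TMDB may report 'zh' while the MediaInfo track says 'cmn', and plain startswith can't bridge two distinct ISO codes, so the variants list does it instead:

```python
# Hypothetical example: TMDB reports 'zh', the MediaInfo audio track reports 'cmn'
variants = ['zh', 'cn', 'cmn', 'no', 'nb']
audio_language, original_language = 'cmn', 'zh'

# A plain prefix check fails across distinct ISO codes for the same language:
print(audio_language.startswith(original_language))  # False

# The variants list bridges them:
print(any(audio_language.startswith(v) for v in variants)
      and any(original_language.startswith(v) for v in variants))  # True
```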
There is an edge case where dual audio cannot be identified if the original audio is incorrect on TMDB (https://www.themoviedb.org/tv/110382-pachinko, for example, should be Korean). There is currently no way to name such a release correctly.
To remediate this issue, the following changes can be made.
In args.py, add the additional --dual-audio argument:
parser.add_argument('--dual-audio', dest='dual_audio', action='store_true', required=False, help="Add Dual-Audio to the title")
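As a stand-alone sanity check of that argument definition (a minimal sketch, assuming args.py uses a plain argparse.ArgumentParser):

```python
import argparse

# Same definition as proposed for args.py
parser = argparse.ArgumentParser()
parser.add_argument('--dual-audio', dest='dual_audio', action='store_true', required=False,
                    help="Add Dual-Audio to the title")

# store_true defaults to False and flips to True when the flag is passed
print(vars(parser.parse_args(['--dual-audio'])))  # {'dual_audio': True}
print(vars(parser.parse_args([])))                # {'dual_audio': False}
```

Note that dest='dual_audio' is what makes the hyphenated flag land in meta under the key the prep.py check reads.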
In upload.py, update the overwrite_list:
overwrite_list = [
    'trackers', 'dupe', 'debug', 'anon', 'category', 'type', 'screens', 'nohash', 'manual_edition', 'imdb', 'tmdb_manual', 'mal', ...
    'hdb', 'ptp', 'blu', 'no_season', 'no_aka', 'no_year', 'no_dub', 'no_tag', 'no_seed', 'client', 'desclink', 'descfile', 'desc', ...
    'manual_source', 'webdv', 'hardcoded-subs', 'dual_audio'
]
Finally, in prep.py, update the dual logic:
if meta.get('dual_audio', False):  # If the dual_audio flag is set, skip the other checks
    dual = "Dual-Audio"
else:
    if meta.get('original_language', '') != 'en':
        eng, orig = False, False
        try:
            for t in mi.get('media', {}).get('track', []):
                if t.get('@type') != "Audio":
                    continue
                audio_language = t.get('Language', '')
                # Check for English language track
                if audio_language.startswith("en") and "commentary" not in t.get('Title', '').lower():
                    eng = True
                # Check for original language track
                if not audio_language.startswith("en") and audio_language.startswith(meta['original_language']) and "commentary" not in t.get('Title', '').lower():
                    orig = True
                # Catch Chinese / Norwegian variants
                variants = ['zh', 'cn', 'cmn', 'no', 'nb']
                if any(audio_language.startswith(var) for var in variants) and any(meta['original_language'].startswith(var) for var in variants):
                    orig = True
                # Check for additional, bloated tracks
                if audio_language != meta['original_language'] and not audio_language.startswith("en"):
                    # If audio_language is empty, set it to 'und' (undefined)
                    audio_language = "und" if audio_language == "" else audio_language
                    console.print(f"[bold red]This release has a(n) {audio_language} audio track, and may be considered bloated")
                    time.sleep(5)
            if eng and orig:
                dual = "Dual-Audio"
            elif eng and not orig and meta['original_language'] not in ['zxx', 'xx', None] and not meta.get('no_dub', False):
                dual = "Dubbed"
        except Exception:
            console.print(traceback.format_exc())
            pass
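A quick way to exercise that decision tree is a stand-alone harness; the sketch below is a simplified reimplementation of the logic above (the console/bloat warnings are omitted, and the mi/meta shapes are assumed from the snippet), not code from the repo:

```python
# Simplified stand-alone sketch of the Dual-Audio / Dubbed decision above
def detect_dual(mi, meta):
    dual = ""
    if meta.get('dual_audio', False):  # manual override from --dual-audio
        return "Dual-Audio"
    if meta.get('original_language', '') != 'en':
        eng, orig = False, False
        for t in mi.get('media', {}).get('track', []):
            if t.get('@type') != "Audio":
                continue
            lang = t.get('Language', '')
            title = t.get('Title', '').lower()
            if lang.startswith("en") and "commentary" not in title:
                eng = True
            if not lang.startswith("en") and lang.startswith(meta['original_language']) and "commentary" not in title:
                orig = True
        if eng and orig:
            dual = "Dual-Audio"
        elif eng and not orig and meta['original_language'] not in ['zxx', 'xx', None] and not meta.get('no_dub', False):
            dual = "Dubbed"
    return dual

# Fake MediaInfo structure: an en-US track plus a Korean track
mi = {'media': {'track': [
    {'@type': 'Audio', 'Language': 'en-US', 'Title': ''},
    {'@type': 'Audio', 'Language': 'ko', 'Title': ''},
]}}
print(detect_dual(mi, {'original_language': 'ko'}))                      # Dual-Audio
print(detect_dual(mi, {'original_language': 'ja'}))                      # Dubbed (orig track not found)
print(detect_dual(mi, {'original_language': 'ja', 'dual_audio': True}))  # Dual-Audio (flag overrides)
```

The third call shows the Pachinko-style case: TMDB reports the wrong original language, and the new flag forces the correct name anyway.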
Please perform any testing needed, and if all is well I will submit a PR.
Apologies for the delay; it all seems to be working fine here. Not sure what happened with your copy/paste.
if meta.get('dual_audio', False):  # If the dual_audio flag is set, skip the other checks
    dual = "Dual-Audio"
else:
    if meta.get('original_language', '') != 'en':
        eng, orig = False, False
        try:
            for t in mi.get('media', {}).get('track', []):
                if t.get('@type') != "Audio":
                    continue
                audio_language = t.get('Language', '')
                # Check for English language track
                if audio_language.startswith("en") and "commentary" not in t.get('Title', '').lower():
                    eng = True
                # Check for original language track
                if not audio_language.startswith("en") and audio_language.startswith(meta['original_language']) and "commentary" not in t.get('Title', '').lower():
                    orig = True
                # Catch Chinese / Norwegian variants
                variants = ['zh', 'cn', 'cmn', 'no', 'nb']
                if any(audio_language.startswith(var) for var in variants) and any(meta['original_language'].startswith(var) for var in variants):
                    orig = True
                # Check for additional, bloated tracks
                if audio_language != meta['original_language'] and not audio_language.startswith("en"):
                    # If audio_language is empty, set it to 'und' (undefined)
                    audio_language = "und" if audio_language == "" else audio_language
                    console.print(f"[bold red]This release has a(n) {audio_language} audio track, and may be considered bloated")
                    time.sleep(5)
            if eng and orig:
                dual = "Dual-Audio"
            elif eng and not orig and meta['original_language'] not in ['zxx', 'xx', None] and not meta.get('no_dub', False):
                dual = "Dubbed"
        except Exception:
            console.print(traceback.format_exc())
            pass
I have identified issues with Dual-Audio identification.
The 1st issue I have run into is related to English identification in prep.py.
The original code is:
In order to debug, I added the following:
print(f"eng: {eng}, orig: {orig}")
For a file with an en-US audio track, this was the output:
eng: False, orig: True
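That output is consistent with the original check comparing the full language tag against a bare "en" (my reading of the behaviour, since the original snippet didn't paste in above):

```python
audio_language = "en-US"  # a MediaInfo tag carrying a region subtag

# An exact comparison misses region-tagged English tracks:
print(audio_language == "en")           # False

# A prefix check catches them:
print(audio_language.startswith("en"))  # True
```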
I added the following code in order to remediate:
Another solution could be to use
However, I am not certain how many false positives, if any, could occur.
The 2nd issue is almost the same as the 1st; however, it is in relation to orig.
This occurs where the original audio has a correct region specified.
As a temporary workaround for the languages I have worked with, I have added them under the variants section:
However, there should be an easier identification method that I have not yet solved for.
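One possible generalisation (an assumption on my part, not code from the repo) is to compare only the primary subtag of each BCP 47 style tag, so any region suffix is ignored regardless of language:

```python
def base_lang(tag):
    # Keep only the primary language subtag: 'pt-BR' -> 'pt', 'en_US' -> 'en'
    return tag.replace('_', '-').split('-')[0].lower()

print(base_lang('pt-BR') == base_lang('pt'))  # True: region suffix ignored
print(base_lang('en-US') == base_lang('en'))  # True
print(base_lang('cmn') == base_lang('zh'))    # False: distinct macrolanguage codes
```

Note this still wouldn't unify zh/cmn or no/nb, since those are distinct codes rather than region variants, so the variants list would stay.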