MediaBrowser / Emby

Emby Server is a personal media server with apps on just about every device.
https://emby.media
GNU General Public License v2.0
4.2k stars, 811 forks

[Feature Request] Ability to select subtitle encoding #2255

Open osrl opened 8 years ago

osrl commented 8 years ago

Ability to select subtitle encoding would be nice. Maybe a default encoding setting in settings or edit subtitle screen.

LukePulverenti commented 7 years ago

Is this so that other applications can use the subtitle files? At this point there should no longer be any character encoding issues inside Emby. Thanks!

osrl commented 7 years ago

No. I still get encoding errors on 3.1.2, so I manually convert the encoding to UTF-8. You said (on the forum) that it's fixed in the next version. Maybe it hasn't been released yet?

LukePulverenti commented 6 years ago

Are you still having any encoding issues?

osrl commented 6 years ago

I checked last month, and even manually converting the encoding didn't work.

LukePulverenti commented 6 years ago

What do you mean by that?

osrl commented 6 years ago

I was using 'iconv -f ISO-8859-9 -t UTF-8 in.srt > out.srt' to change the encoding. That way I could watch with the correct encoding. But this doesn't work anymore; the characters still look wrong.
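
For readers who want to reproduce the iconv step without the command-line tool, here is a minimal Python sketch. It is not anything from Emby; the helper name `convert_to_utf8` and its parameters are assumptions made for illustration. It checks first whether the file already decodes as UTF-8, because converting an already-UTF-8 file from ISO-8859-9 a second time is one common way to end up with wrong-looking characters.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "iso8859_9") -> bool:
    """Write a UTF-8 copy of src to dst.

    Hypothetical helper for illustration. Returns True if the bytes were
    re-encoded from source_encoding, False if src was already valid UTF-8
    and was copied unchanged.
    """
    raw = src.read_bytes()
    try:
        # A strict decode succeeds only if the bytes are already valid UTF-8.
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not UTF-8: interpret the bytes as the legacy encoding and re-encode.
        text = raw.decode(source_encoding)
        dst.write_bytes(text.encode("utf-8"))
        return True
    # Already UTF-8: copy as-is rather than converting twice.
    dst.write_bytes(raw)
    return False
```

Calling `convert_to_utf8(Path("in.srt"), Path("out.srt"))` mirrors the iconv command above. The UTF-8 check is a heuristic: it cannot tell one single-byte legacy encoding from another, so `source_encoding` still has to be chosen by the user.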

LukePulverenti commented 5 years ago

Heads up: the newly released Emby Server 4.0 has much improved encoding detection for subtitle files. Please try it out. Thanks!

LukePulverenti commented 5 years ago

Has anyone tried their subtitles with 4.0? The encoding detection should be very much improved so that a setting isn't really needed anymore.

osrl commented 5 years ago

Hello, I did, but it was still the same :( Sorry. You can try with any of these subtitles: https://subscene.com/subtitles/the-walking-dead-eighth-season/turkish

theerror commented 5 years ago

I have to second this! In my experience with Czech subtitles, around 99% use the windows-1250 encoding (also known as CP-1250). I'm not sure about other languages, but I think the same would apply to them too. So selecting a default encoding would handle most of the problems.

Could selecting the possible input encodings somehow help with detection? If so, it would be fine to let the user enter the encodings himself, or to fall back to a default encoding chosen by language.
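
The fallback idea in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Emby detects encodings: the `DEFAULT_ENCODINGS` table, the `decode_subtitle` function, and its parameters are all hypothetical. The Czech entry follows the windows-1250 observation above, and the Turkish entry matches the ISO-8859-9 case mentioned earlier in the thread.

```python
from typing import Optional, Tuple

# Hypothetical language -> legacy encoding table (illustration only).
DEFAULT_ENCODINGS = {
    "cs": "cp1250",     # windows-1250, per the Czech observation above
    "tr": "iso8859_9",  # ISO-8859-9, per the Turkish case in this thread
}


def decode_subtitle(raw: bytes,
                    language: Optional[str] = None,
                    user_encoding: Optional[str] = None) -> Tuple[str, str]:
    """Return (text, encoding_used) for raw subtitle bytes."""
    # Strict UTF-8 goes first: legacy single-byte text rarely passes it by accident.
    candidates = ["utf-8-sig"]  # also drops a leading BOM if present
    if user_encoding:
        candidates.append(user_encoding)
    if language in DEFAULT_ENCODINGS:
        candidates.append(DEFAULT_ENCODINGS[language])
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next candidate.
            continue
    # Last resort: never fail, mark undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"
```

For example, `decode_subtitle(data, language="cs")` would try UTF-8 and then windows-1250. A user-supplied encoding is tried before the language default, so an explicit setting wins over the guess.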

As far as I can tell from my experience, the downloaded subtitles break playback of the whole movie. :(

2019-02-23 14:40:21.583 Info SubtitleEncoder: ProcessRun 'ffmpeg-subtitle_convert' Execute: /bin/ffmpeg  -sub_charenc windows-1250 -i "/config/metadata/library/e7/e76e91c125e841794479c4febfb24f9b/Tísňové volání (2018).cs.srt" "/config/cache/subtitles/8/8cf43e4ebc12a3ec8b58084b4be2696b_636864862015728459_8_0_0_False.vtt"
2019-02-23 14:40:21.599 Info SubtitleEncoder: ProcessRun 'ffmpeg-subtitle_convert' Started.
2019-02-23 14:40:21.830 Info SubtitleEncoder: ProcessRun 'ffmpeg-subtitle_convert' Process exited with code 1
2019-02-23 14:40:21.830 Error SubtitleEncoder: ffmpeg subtitle conversion failed for /config/metadata/library/e7/e76e91c125e841794479c4febfb24f9b/Tísňové volání (2018).cs.srt
2019-02-23 14:40:21.831 Error SubtitleEncoder: ProcessRun 'ffmpeg-subtitle_convert' Output:

2019-02-23 14:40:21.833 Error SubtitleEncoder: ProcessRun 'ffmpeg-subtitle_convert' Error Output:
    ffmpeg version 4.0.2-emby_2018_12_09 Copyright (c) 2000-2018 the FFmpeg developers
      built with gcc 6.3.0 (crosstool-NG crosstool-ng-1.23.0)
      configuration: --cc=x86_64-pc-linux-gnu-gcc --arch=x86_64 --prefix=/home/embybuilder/Buildbot/x64/ffmpeg-x64/staging --pkg-config=pkg-config --disable-doc --disable-ffplay --disable-vdpau --disable-xlib --enable-fontconfig --enable-gnutls --enable-gpl --enable-iconv --enable-libass --enable-libfreetype --enable-libfribidi --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libwebp --enable-libx264 --enable-libzvbi --enable-version3 --enable-libsmbclient --enable-cuda --enable-cuvid --enable-libmfx --enable-nvenc --enable-vaapi --enable-cross-compile --cross-prefix=x86_64-pc-linux-gnu- --extra-libs='-lexpat -lfreetype -lfribidi -lfontconfig -liconv -lpng -lz -lvorbis -logg -lnettle -lhogweed -lgmp -laddns-samba4 -lasn1util-samba4 -lauthkrb5-samba4 -lCHARSET3-samba4 -lcliauth-samba4 -lcli-cldap-samba4 -lcli-ldap-common-samba4 -lcli-nbt-samba4 -lcli-smb-common-samba4 -lcom_err -lcommon-auth-samba4 -ldbwrap-samba4 -ldcerpc-binding -ldcerpc-samba-samba4 -ldl -lflag-mapping-samba4 -lgenrand-samba4 -lgensec-samba4 -lgse-samba4 -lgssapi_krb5 -llibcli-lsa3-samba4 -llibsmb-samba4 -linterfaces-samba4 -liov-buf-samba4 -lk5crypto -lkrb5 -lkrb5samba-samba4 -lkrb5support -lldb -lldbsamba-samba4 -lmessages-dgm-samba4 -lmessages-util-samba4 -lmsghdr-samba4 -lmsrpc3-samba4 -lndr -lndr-krb5pac -lndr-nbt -lndr-samba-samba4 -lndr-standard -lreplace-samba4 -lsamba-cluster-support-samba4 -lsamba-credentials -lsamba-debug-samba4 -lsamba-errors -lsamba-hostconfig -lsamba-modules-samba4 -lsamba-security-samba4 -lsamba-sockets-samba4 -lsamba-util -lsamba3-util-samba4 -lsamdb -lsamdb-common-samba4 -lsecrets3-samba4 -lserver-id-db-samba4 -lserver-role-samba4 -lsmbconf -lsmbd-shim-samba4 -lsmb-transport-samba4 -lsocket-blocking-samba4 -lsys-rw-samba4 -ltalloc -ltalloc-report-samba4 -ltdb -ltdb-wrap-samba4 -ltevent -ltevent-util -ltime-basic-samba4 -lutil-cmdline-samba4 -lutil-reg-samba4 -lutil-setid-samba4 -lutil-tdb-samba4 -luuid -lwbclient 
-lwinbind-client-samba4 -ldrm' --target-os=linux --enable-shared --disable-static
      libavutil      56. 14.100 / 56. 14.100
      libavcodec     58. 18.100 / 58. 18.100
      libavformat    58. 12.100 / 58. 12.100
      libavdevice    58.  3.100 / 58.  3.100
      libavfilter     7. 16.100 /  7. 16.100
      libswscale      5.  1.100 /  5.  1.100
      libswresample   3.  1.100 /  3.  1.100
      libpostproc    55.  1.100 / 55.  1.100
    /config/metadata/library/e7/e76e91c125e841794479c4febfb24f9b/Tísňové volání (2018).cs.srt: Invalid data found when processing input

Subtitle source: https://www.opensubtitles.org/cs/subtitleserve/sub/7529283

EDIT: Oops, I didn't notice this error before. I posted a different one, and it looks like the problem was actually not in the detection but probably in the subtitles themselves... strange...

theerror commented 5 years ago

Oh no. They are corrupting the subtitles by reprocessing them and adding advertisements! Oh nice! Thanks to them! :/

https://forums.plex.tv/t/subtitle-file-srt-invalid-data-found-when-processing-input/134438/5

LukePulverenti commented 5 years ago

Yeah, that error is not about encoding; it's that the srt is invalid.

theerror commented 5 years ago

Do you know whether we can somehow get around this? VLC, for example, does not have this problem. Would it be possible to make this more fail-safe somehow?
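
One way to be more tolerant of damaged files is to rebuild the SRT from whatever valid cues can be found before handing it to ffmpeg. The sketch below is an assumption-laden illustration: the thread does not say what exactly is wrong with this file, and this is not what VLC or Emby does. The `sanitize_srt` function and `TIMING` pattern are hypothetical names.

```python
import re

# Matches an SRT timing line; tolerates '.' instead of ',' before the milliseconds.
TIMING = re.compile(
    r"^\s*(\d{2}:\d{2}:\d{2})[,.](\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2})[,.](\d{3})"
)


def sanitize_srt(text: str) -> str:
    """Rebuild an SRT document from the cues that can still be recognized."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    cues = []       # list of (timing_line, [text lines])
    current = None  # cue currently collecting text lines
    for i, line in enumerate(lines):
        match = TIMING.match(line)
        if match:
            start, start_ms, end, end_ms = match.groups()
            current = (f"{start},{start_ms} --> {end},{end_ms}", [])
            cues.append(current)
        elif (line.strip().isdigit() and i + 1 < len(lines)
              and TIMING.match(lines[i + 1])):
            # A bare number directly before a timing line is a cue index; drop it.
            continue
        elif current is not None and line.strip():
            current[1].append(line.strip())
        # Anything before the first timing line, and blank lines, are ignored.
    out = []
    # Skip cues without text and renumber the rest from 1.
    for number, (timing, body) in enumerate((c for c in cues if c[1]), start=1):
        out.append(f"{number}\n{timing}\n" + "\n".join(body) + "\n")
    return "\n".join(out)
```

The trade-off is that a dialogue line consisting only of digits and placed directly before a timing line would be mistaken for a cue index and dropped, so a step like this is better used as a fallback after a normal conversion has failed.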

LukePulverenti commented 5 years ago

Can you zip up some sample subtitle files and attach them here? Thanks.

LukePulverenti commented 5 years ago

That question is for both @osrl and @theerror

Thanks guys.

theerror commented 5 years ago

Sure, I'll start collecting the bad ones... Should I always add them here, or should we create some place for them? It would be awesome to show the user a dialog saying that the subs are broken and asking whether he wants to send them in for analysis.

Tísňové volání (2018).cs.srt.zip

DeXtmL commented 5 years ago

This seems to be an old issue that was never fixed.
For one, the encoding of Chinese subtitles is completely wrong: they show up as unrecognizable Roman characters...
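
One plausible explanation for this symptom, though not one confirmed in this thread, is that bytes in a Chinese encoding are being decoded as a single-byte Western code page. The sketch below reproduces that effect. The function names and the choice of GB18030 and Latin-1 are assumptions for illustration only.

```python
def misdecode(text: str, actual: str = "gb18030", assumed: str = "latin-1") -> str:
    """Encode text as `actual`, then decode the bytes as `assumed` (the wrong guess)."""
    return text.encode(actual).decode(assumed)


def repair(garbled: str, actual: str = "gb18030", assumed: str = "latin-1") -> str:
    """Undo misdecode(): recover the original bytes and decode them correctly."""
    return garbled.encode(assumed).decode(actual)


original = "你好，世界"
garbled = misdecode(original)   # a run of accented Latin letters and control characters
restored = repair(garbled)      # the round trip is lossless because Latin-1 maps every byte
```

The repair only works when the wrong decoder maps every byte to a character, as Latin-1 does. Once a player has replaced undecodable bytes with placeholders, the original text can no longer be recovered from the output.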

LukePulverenti commented 4 years ago

> Seems this is an old issue and never fixed. For one, Chinese subtitles's encoding are completely wrong, showing unrecognizable roman characters...

@DeXtmL can you provide an example? Thanks!