aajanki / yle-dl

Download videos from Yle servers
https://aajanki.github.io/yle-dl/index-en.html
GNU General Public License v3.0

Subtitles missing #252

Open lsolin opened 3 years ago

lsolin commented 3 years ago

yle-dl 20200807, ffmpeg version 4.3-2

I'm trying to download https://areena.yle.fi/1-4455996 but I get no subtitles. Subtitles work in the Areena player, and --showmetadata also finds them.

aajanki commented 3 years ago

There seems to be a problem with subtitles on a small proportion of Areena programs. Even though the API returns subtitle information for them (as shown by --showmetadata), the subtitles aren't embedded in the video stream, unlike in most videos.

A possible workaround on the latest master branch (not yet released) is to switch to the wget downloader with "--backend wget", which tries to download subtitles from an alternative location as a separate file. I can't test it with your example video because it has already expired.

jkjuopperi commented 3 years ago

Getting subtitles also fails with this one: https://areena.yle.fi/1-4605037 (yle-dl version 20200807).

[webvtt @ 0x10edb00] Dropping 10 duplicated subtitle events
[NULL @ 0x11b4940] Unable to find a suitable output format for 'lihottavat'
lihottavat: Invalid argument
Output file: Prisma: Totuus hiilihydraateista-2020-07-27T06:00.mp4
WARNING: The wget backend might not be able to download subtitles, try --backend=ffmpeg

With --backend=ffmpeg the error is:

[webvtt @ 0xd58b00] Dropping 10 duplicated subtitle events
[NULL @ 0xe1f940] Unable to find a suitable output format for 'lihottavat'
lihottavat: Invalid argument

Subtitle related parts from the --showmetadata output:

    "embedded_subtitles": [
      {
        "language": "fin",
        "category": "k\u00e4\u00e4nn\u00f6stekstitys"
      }
    ],
    "subtitles": [
      {
        "language": "fin",
        "url": "https://cdnsecakmi.kaltura.com/api_v3/index.php/service/caption_captionAsset/action/serve/captionAssetId/1_sk8onesb/ks/ZDM2ZjFhNWMzMTZlOTNkZTIzNjMwMTA1YzdkNjlmODdmYTIzYzJkZnwxOTU1MDMxOzE5NTUwMzE7MTYwMjQ0Mzc1OTswOzIwNzU7b3ZwQHlsZS5maTtkb3dubG9hZDoxX3Bvajk5Mmo1",
        "category": "k\u00e4\u00e4nn\u00f6stekstitys"
      }
    ],
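
To make the distinction in the metadata above easier to see, here is a minimal Python sketch that separates subtitles flagged as embedded in the stream from subtitles offered as a separate file. The key names ("embedded_subtitles", "subtitles", "language", "url") are taken from the fragment above; the function name, the wrapping dict, and the placeholder URL are assumptions made for illustration only and are not part of yle-dl.

```python
def summarize_subtitles(clip):
    """Return (embedded_languages, external_subtitles) for one metadata dict.

    `clip` is assumed to be a dict shaped like the --showmetadata fragment
    quoted in this thread. Missing keys are treated as empty lists.
    """
    embedded = [s["language"] for s in clip.get("embedded_subtitles", [])]
    external = [
        (s["language"], s["url"])
        for s in clip.get("subtitles", [])
        if s.get("url")  # skip entries that carry no download URL
    ]
    return embedded, external


# Hypothetical input: same structure as the fragment above, with a
# placeholder URL instead of the real, session-specific caption URL.
clip = {
    "embedded_subtitles": [
        {"language": "fin", "category": "käännöstekstitys"},
    ],
    "subtitles": [
        {
            "language": "fin",
            "url": "https://example.invalid/captions/fin.vtt",
            "category": "käännöstekstitys",
        },
    ],
}

embedded, external = summarize_subtitles(clip)
print("embedded:", embedded)
print("external:", external)
```

If a language appears in the external list but the downloaded file has no subtitle track, that would be consistent with the case described earlier in this thread, where the API reports subtitles that are not actually present in the video stream.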

lsolin commented 3 years ago

I get subtitles with both backends. Where does the 'lihottavat' error in your output come from?

jkjuopperi commented 3 years ago

My ffmpeg version is 4.1. The problem seems to be that ffmpeg does not parse the command line properly.

DEBUG: ffmpeg -y -loglevel warning -thread_queue_size 512 -strict experimental -stats -i https://cdnsecakmi.kaltura.com/p/1955031/sp/195503100/playManifest/entryId/1_poj992j5/flavorId/0_iunvqr9f/format/applehttp/protocol/https/a.m3u8?uiConfId=43362851&referrer=aHR0cHM6Ly9hcmVlbmEueWxlLmZpLzEtNDYwNTAzNw==&playSessionId=11111111-1111-1111-1111-111111111111&clientTag=html5:v0.39.4 -metadata description=Hiilihydraatit lihottavat ja aiheuttavat elintasosairauksia. Onko totuus kuitenkaan näin mustavalkoinen? Tohtori Xand Van Tullekenin mukaan paras tapa pysyä terveenä ja saavuttaa ihannepaino on heittää uskomukset romukoppaan ja keskittyä laatuun ja määrään - myös hiilihydraateissa. Löytyykö väitteelle tieteellistä näyttöä? T: Lion TV/BBC, Iso-Britannia -metadata creation_time=2020-07-27T06:00:00+03:00 -bsf:a aac_adtstoasc -vcodec copy -acodec copy -map 0:p:0 -dn -scodec srt file:Prisma: Totuus hiilihydraateista-2020-07-27T06:00.mkv

jkjuopperi commented 3 years ago

Even passing '-metadata' and 'description=foo bar' as clearly separate arguments causes:

[NULL @ 0x7aea80] Unable to find a suitable output format for 'bar'
bar: Invalid argument

Which version of ffmpeg does it work with?
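
For context, here is a minimal Python sketch of one way an error of this shape can arise. ffmpeg treats any command-line argument it cannot interpret as an option as an output file name, so if a metadata value containing spaces is split into several arguments somewhere along the way, the second word ends up being read as an output file. This is only an illustration under that assumption: it does not show how yle-dl builds its command, and the input and output file names are placeholders.

```python
import shlex

description = "Hiilihydraatit lihottavat ja aiheuttavat elintasosairauksia."

# Case 1: the command is assembled as one string without quoting and then
# split on whitespace. The description falls apart into separate arguments.
unquoted = "ffmpeg -i input.m3u8 -metadata description=" + description + " out.mkv"
broken_args = shlex.split(unquoted)

# Case 2: the command is kept as an argument list, so the whole description
# stays inside a single argument regardless of the spaces it contains.
safe_args = [
    "ffmpeg", "-i", "input.m3u8",
    "-metadata", "description=" + description,
    "out.mkv",
]

# Case 3: if a single string is unavoidable, shlex.quote protects the value.
quoted = (
    "ffmpeg -i input.m3u8 -metadata "
    + shlex.quote("description=" + description)
    + " out.mkv"
)

print(broken_args[4:7])
print(shlex.split(quoted) == safe_args)
```

In the first case the word "lihottavat" becomes an argument of its own, which matches the name in the error message. Whether anything like this happens in the failing setup is not established in this thread; a wrapper script or shell alias around ffmpeg that re-splits its arguments would be one possibility to rule out.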

lsolin commented 3 years ago

ffmpeg 4.3-2

It also works with wget.

aajanki commented 3 years ago

Works for me on ffmpeg 4.1.6 on Debian stable.

@jkjuopperi Maybe I can figure out something if you can send the output of the following command (or put it in pastebin.com if it's very long):

yle-dl --backend ffmpeg --verbose https://areena.yle.fi/1-4605037