yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader
https://discord.gg/H5MNcFW63r
The Unlicense

Live stream TF1.FR not detected (with solution) #1381

Open loby66 opened 3 years ago

loby66 commented 3 years ago

Checklist

Region

France

Example URLs

Description

The live stream is not detected at https://www.tf1.fr/tf1/direct

Solution in Perl to get the MPD URL:

$response=$ua->get("https://mediainfo.tf1.fr/mediainfocombo/L_TF1?context=MYTF1&pver=4001000"); $content=$response->content;

($url)=$content=~/\,\"url\"\:\"([^\"]+\.mpd)\"/mi;
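For reference, here is a rough Python port of the Perl lines above. It is only a sketch: the helper names (`extract_mpd_url`, `fetch_mpd_url`) and the sample response fragment are made up for illustration, and the pattern simply mirrors the Perl regex rather than anything the API documents.

```python
import re
import urllib.request

MEDIAINFO_URL = "https://mediainfo.tf1.fr/mediainfocombo/L_TF1?context=MYTF1&pver=4001000"

# Same pattern as the Perl snippet, with the dot before "mpd" escaped.
MPD_RE = re.compile(r',"url":"([^"]+\.mpd)"', re.IGNORECASE)


def extract_mpd_url(content):
    """Return the first .mpd URL found in the API response text, or None."""
    match = MPD_RE.search(content)
    return match.group(1) if match else None


def fetch_mpd_url(api_url=MEDIAINFO_URL):
    """Download the API response and extract the manifest URL (needs network access)."""
    with urllib.request.urlopen(api_url) as response:
        content = response.read().decode("utf-8")
    return extract_mpd_url(content)


# Hypothetical response fragment, used only to exercise the pattern offline.
sample = '{"delivery":{"code":200,"url":"https://example.invalid/live/index.mpd"}}'
print(extract_mpd_url(sample))
```

`fetch_mpd_url()` is not called here because the endpoint may be geo-restricted; `extract_mpd_url` returns `None` when no `.mpd` URL is present.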

Verbose log

yt-dlp -U
yt-dlp is up to date (2021.10.10)

yt-dlp -Uv "https://www.tf1.fr/tf1/direct"
[debug] Command-line config: ['-Uv', 'https://www.tf1.fr/tf1/direct']
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252
[debug] yt-dlp version 2021.10.10
[debug] Python version 3.7.9 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 2021-09-16-git-8f92a1862a-essentials_build-www.gyan.dev, ffprobe 2021-09-16-git-8f92a1862a-essentials_build-www.gyan.dev, rtmpdump 2.4
[debug] Optional libraries: Cryptodome, mutagen, sqlite, websockets
[debug] ANSI escape support: stdout = True, stderr = True
[debug] Proxy map: {}
yt-dlp is up to date (2021.10.10)
[debug] [generic] Extracting URL: https://www.tf1.fr/tf1/direct
[generic] direct: Requesting header
WARNING: [generic] Falling back on generic information extractor.
[generic] direct: Downloading webpage
[generic] direct: Extracting information
ERROR: [generic] Unsupported URL: https://www.tf1.fr/tf1/direct
Traceback (most recent call last):
  File "C:\Users\mezodia\AppData\Roaming\Python\Python37\site-packages\yt_dlp\extractor\common.py", line 589, in extract
    ie_result = self._real_extract(url)
  File "C:\Users\mezodia\AppData\Roaming\Python\Python37\site-packages\yt_dlp\extractor\generic.py", line 3706, in _real_extract
    raise UnsupportedError(url)
yt_dlp.utils.UnsupportedError: Unsupported URL: https://www.tf1.fr/tf1/direct
Traceback (most recent call last):
  File "C:\Users\mezodia\AppData\Roaming\Python\Python37\site-packages\yt_dlp\extractor\common.py", line 589, in extract
    ie_result = self._real_extract(url)
  File "C:\Users\mezodia\AppData\Roaming\Python\Python37\site-packages\yt_dlp\extractor\generic.py", line 3706, in _real_extract
    raise UnsupportedError(url)
yt_dlp.utils.UnsupportedError: Unsupported URL: https://www.tf1.fr/tf1/direct

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\mezodia\AppData\Roaming\Python\Python37\site-packages\yt_dlp\YoutubeDL.py", line 1263, in wrapper
    return func(self, *args, **kwargs)
  File "C:\Users\mezodia\AppData\Roaming\Python\Python37\site-packages\yt_dlp\YoutubeDL.py", line 1288, in __extract_info
    ie_result = ie.extract(url)
  File "C:\Users\mezodia\AppData\Roaming\Python\Python37\site-packages\yt_dlp\extractor\common.py", line 606, in extract
    e.msg, video_id=video_id, ie=self.IE_NAME, tb=e.traceback, expected=e.expected, cause=e.cause)
yt_dlp.utils.ExtractorError: [generic] Unsupported URL: https://www.tf1.fr/tf1/direct
loby66 commented 3 years ago

The URL works without an account.

$response=$ua->get("https://mediainfo.tf1.fr/mediainfocombo/L_TF1?context=MYTF1&pver=4001000"); $content=$response->content;

($url)=$content=~/,"url":"([^\"]+\.mpd)"/mi;

pukkandan commented 3 years ago

[image]

pukkandan commented 3 years ago

And the API reports the content as geo-blocked, so I am not sure whether it is just geo-blocking or whether an account is actually needed.

{"media":{"id":"L_TF1","type":"live","error_code":"GEOBLOCKED","error_desc":"Ce contenu n'est pas disponible dans votre zone géographique","title":"Ici tout commence","useND":"//delivery.tf1.fr/mytf1-wrd/L_TF1","geoList":["FR","AD","FX","GF","GP","MC","MQ","NC","PF","PM","RE","WF","YT","TF"],"preview":"https://photos.tf1.fr/1280/720/vignette-16-9-ici-tout-commence-la-scene-ee8985-cb438a-0@1x.jpg","channel":"tf1","channel2":"TF1","liveTS":1634920890,"geolock":true,"emId":1182159,"endDate":"2021-10-22T17:14:28Z","syndicable":false,"shortTitle":"Episode 255","programName":"Ici tout commence"},"content":{"title":"episode-255","emId":"1182159"},"mediametrie":{"chapters":[{"title":"null_null_episode-255","estatTitle":"null_null_episode-255","estatS1":"MYTF1","estatS2":"ici-tout-commence_TF1","estatS3":"live","estatS4":"2021-10-22_18:41:30","estatS5":"0_1182159","estatGenre":"desktop_browsing_windows_html5"}],"id":"258058214978","mediaId":"L_TF1","estatMsDm":"LIVE","estatMsCh":"1","estatMsCid":"-"},"richmedia":{"site":553584,"mediaType":"video","mediaLevel2":103,"mediaLabel":{"label":"int::[CONTENTTYPE]::ici-tout-commence_TF1::null_null_episode-255","contentType":"live"},"refreshDuration":30,"duration":-1,"isEmbedded":false,"broadcastMode":"live","webdomain":""},"fw":{"adManagerSwf":"https://mssl.fwmrm.net/p/tf1_flash_live/AdManager.swf","id":506334,"asset":"L_TF1","profile":"506334:tf1_html5_live_mills","jingleOutPath":"https://tf1voddnlssldbd.akamaized.net/2/USP-0x0/00/24/13560024/ssm/13560024-1200-64k.mp4","url":"https://7b9de.v.fwmrm.net","channel":"tf1","page":"sites","section":"tf1_page_type_sites_platform_desktop_media_desktop","duration":-1},"yb":{"accountCode":"tf1","username":"ici-tout-commence_TF1","transactionCode":"MYTF1","media":{"title":"episode-255","duration":-1,"isLive":true},"properties":{"content_id":"L_TF1","genre":"desktop_browsing_windows_html5","language":"FR","year":"2021-10-22_18:41:30","owner":"ici-tout-commence_TF1"},"extraParams":{"param1":"MYTF1","param2":"TF1","param3":"DASH","param4":"HTML5","param5":"DESKTOP","param6":"NO"},"ads":{"campaign":"506334:tf1_html5_live_mills:tf1_page_type_sites_platform_desktop_media_desktop"},"enableAnalytics":true},"streamroot":{"streamrootKey":"c1335d96-4e28-4b42-8065-9a5a8267d421","backendUrl":"https://distribsr.tf1.fr","backendHost":"https://strmt.tf1.fr"},"dai":{"assetKey":"VxaW5bQCRS63ujrg91srqw"},"delivery":{"code":403,"error":"Désolé, cette vidéo n'est pas accessible depuis votre zone géographique","country":"IN","format":"dash"}}
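
The fields that matter for the geo-block question can be pulled out of that response roughly as follows. This is a sketch only: `geo_block_info` is a made-up helper, the sample is a trimmed copy of the response above, and treating `error_code == "GEOBLOCKED"` as the signal is an assumption based on this single response.

```python
import json

# Trimmed-down copy of the response above, keeping only the fields used here.
response_text = '''
{"media": {"id": "L_TF1", "type": "live", "error_code": "GEOBLOCKED",
           "geoList": ["FR", "AD", "FX", "GF", "GP", "MC", "MQ", "NC", "PF", "PM", "RE", "WF", "YT", "TF"],
           "geolock": true},
 "delivery": {"code": 403, "country": "IN", "format": "dash"}}
'''


def geo_block_info(data):
    """Return (is_blocked, allowed_countries, detected_country) for a parsed response."""
    media = data.get("media", {})
    delivery = data.get("delivery", {})
    is_blocked = media.get("error_code") == "GEOBLOCKED"
    return is_blocked, media.get("geoList", []), delivery.get("country")


blocked, allowed, country = geo_block_info(json.loads(response_text))
print(blocked, country, country in allowed)
```

Here the detected country (`IN`) is not in `geoList`, which is consistent with plain geo-blocking, although it does not rule out an account requirement.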
ghost commented 2 years ago

I plan on fixing some of the issues related to TF1.FR and I found that the extractor for WAT.TV, with some minor modifications, works perfectly to get the live feed. WAT has also been shut down in favor of MY TF1. Should I add a new InfoExtractor in tf1.py and make the necessary changes to remove all references to WAT.TV from the code?

pukkandan commented 2 years ago

Yes, and you can delete wat.py if the service is shut down

ghost commented 2 years ago

So I've worked on this for a bit and it turns out that whenever someone is geo-blocked, the geoList returned by https://mediainfo.tf1.fr/mediainfocombo/L_TF1?context=MYTF1 contains FX, which is an "exceptionally reserved" country code. It is still part of the ISO3166 standard, as you can see here: Country Codes Collection. More specifically, it appears in ISO3166-3, which is the list of country codes that have been removed from ISO3166-1. In this case, FX was merged into the more commonly used FR.

The country map in ISO3166Utils only takes into account officially assigned code elements. Whenever I invoke raise_geo_restricted with the aforementioned geoList, I get the following error: TypeError: sequence item 2: expected str instance, NoneType found
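
The error can be reproduced in isolation. The sketch below assumes the country names are joined with `str.join` (which the error message suggests) and uses a made-up two-entry map in place of the real one in ISO3166Utils:

```python
# Hypothetical two-entry stand-in for the country map in ISO3166Utils.
COUNTRY_MAP = {"FR": "France", "AD": "Andorra"}


def short2full(code):
    """Return the full country name, or None for an unknown code."""
    return COUNTRY_MAP.get(code.upper())


geo_list = ["FR", "AD", "FX"]  # first three entries of the geoList from the API

try:
    message = ", ".join(map(short2full, geo_list))
except TypeError as exc:
    message = str(exc)

print(message)
```

FX sits at index 2 of the list and maps to `None`, which matches the "sequence item 2" in the error.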

Should the country map be extended to include codes from the ISO3166-3 standard or should I just manually replace FX with FR for this case as it is the only reserved code in geoList (at least for now)?

pukkandan commented 2 years ago

You can extend the list

ghost commented 2 years ago

Turns out some ISO3166-1 and ISO3166-3 alpha-2 codes overlap but can refer to a completely different territory (e.g. SK used to refer to Sikkim and is now used for Slovakia). I was thinking of using two maps instead of one. The second one is a historic country map that acts as a fallback lookup table for whenever the code is not found in _country_map.

@classmethod
def short2full(cls, code):
    """Convert an ISO 3166 alpha-2 country code to the corresponding full name"""
    code = code.upper()
    # Prefer current ISO 3166-1 codes; fall back to withdrawn ISO 3166-3 codes
    return cls._country_map.get(code) or cls._historic_country_map.get(code)

For now I'm getting the iso codes from pycountry since it's surprisingly difficult to find a reliable source for ISO3166-3 codes.
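
To make the fallback behaviour concrete, here is a self-contained sketch. `ISO3166Sketch` and its tiny maps are illustrative stand-ins, not the real ISO3166Utils data:

```python
class ISO3166Sketch:
    """Stand-in for ISO3166Utils with tiny, hypothetical sample maps."""

    # Officially assigned ISO 3166-1 alpha-2 codes (excerpt).
    _country_map = {"FR": "France", "SK": "Slovakia"}
    # Withdrawn codes listed in ISO 3166-3 (excerpt).
    _historic_country_map = {"FX": "France, Metropolitan", "SK": "Sikkim"}

    @classmethod
    def short2full(cls, code):
        code = code.upper()
        return cls._country_map.get(code) or cls._historic_country_map.get(code)


print(ISO3166Sketch.short2full("sk"))  # current meaning wins
print(ISO3166Sketch.short2full("FX"))  # only found in the historic map
print(ISO3166Sketch.short2full("ZZ"))  # unknown code
```

The conflicting code SK resolves to its current meaning, while FX is only found through the historic map.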

pukkandan commented 2 years ago

The second one is a historic country map that acts as a fallback lookup table for whenever the code is not found in _country_map.

That is effectively the same as adding codes to the dict iff they do not conflict with existing codes.

ghost commented 2 years ago

Indeed, I just wanted to create one map for each part of the ISO3166 standard (ISO3166-1 and ISO3166-3) so that it is easier to tell which one conforms to which standard for future developers. I guess I will just add a comment to clarify and I will merge both by giving priority to alpha-2 codes from ISO3166-1. I appreciate the quick responses!
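
A merge that gives priority to ISO3166-1 could look roughly like this. The two dicts are small hypothetical excerpts and the names are placeholders:

```python
# Hypothetical excerpts; the real maps would be far larger.
ISO3166_1 = {"FR": "France", "SK": "Slovakia"}
ISO3166_3 = {"FX": "France, Metropolitan", "SK": "Sikkim"}

# Later entries win in a dict merge, so ISO 3166-1 overrides ISO 3166-3 on conflicts.
COUNTRY_MAP = {**ISO3166_3, **ISO3166_1}

print(COUNTRY_MAP["SK"])
print(COUNTRY_MAP["FX"])
```

For every code, this gives the same answer as looking in the ISO3166-1 map first and falling back to the ISO3166-3 map.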

Totorrr commented 2 years ago

Hello @MarwenDallel, and thank you for working on this. I was about to open the issue myself when I saw yours. :)

Would you please include the TMC live stream also (https://www.tf1.fr/tmc/direct) in your work? I suppose, because it is from the same tf1.fr website, that it can be included pretty easily.

Thanks!

ghost commented 2 years ago

Hey @Totorrr, I've looked into it and added support for all live streams (tf1, tfx, tmc, lci and tf1 series films). My only issue is that FFmpeg is acting up and throwing several errors. It starts by throwing errors related to the stream_index [image], followed by DTS errors once it gets past the first few seconds of the live stream [image].

I have tested several versions of FFmpeg (4.2.1, 5.0.1 and n5.0.1-4) with the same results. I have also tried different parameters (including -copytb 1 and -fflags +igndts) with no success. This happens even if I download the audio only or the video only and results in a corrupt (but playable) output file.

You can reproduce the error by calling FFmpeg with the MPD file returned by https://mediainfo.tf1.fr/mediainfocombo/L_TF1?context=MYTF1.
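
A reproduction command could be assembled along these lines. This is a sketch: `build_ffmpeg_cmd`, the output file name, the 30-second limit and the placeholder URL are all assumptions for illustration; only `-fflags +igndts` comes from the tests described above.

```python
import shlex


def build_ffmpeg_cmd(mpd_url, output="tf1_live.mkv", seconds=30, ignore_dts=False):
    """Build an FFmpeg command that stream-copies a short sample of the live feed."""
    cmd = ["ffmpeg"]
    if ignore_dts:
        # One of the workarounds mentioned above; it did not help in the tests described.
        cmd += ["-fflags", "+igndts"]
    cmd += ["-i", mpd_url, "-t", str(seconds), "-c", "copy", output]
    return cmd


# Placeholder URL: substitute the manifest URL from the API response.
cmd = build_ffmpeg_cmd("https://example.invalid/live/index.mpd")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

The function only builds the argument list, so nothing is downloaded until the list is passed to `subprocess.run`.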

Since this issue seems to be unrelated to yt-dlp, I am not sure if I should make a PR or not.

Edit: I have now tested with Streamlink and ffplay, and the live stream works as intended. As long as Streamlink isn't supported as an external downloader or FFmpeg isn't fixed, I don't think there is much we can do. We could maybe add a warning similar to what has been done in abematv.py:

if is_live:
    self.report_warning("This is a livestream; yt-dlp doesn't support downloading natively, but FFmpeg cannot handle m3u8 manifests from AbemaTV")
    self.report_warning('Please consider using Streamlink to download these streams (https://github.com/streamlink/streamlink)')