Still a bug when downloading multi-part video at bilibili

lin-calvin commented 2 years ago

Checklist

[x] I'm reporting a broken site support issue
[x] I've verified that I'm running youtube-dl version 2021.12.17
[x] I've checked that all provided URLs are alive and playable in a browser
[x] I've checked that all URLs and arguments with special characters are properly quoted or escaped
[x] I've searched the bugtracker for similar bug reports including closed ones
[x] I've read bugs section in FAQ

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.bilibili.com/video/BV1Jh411d7kd?p=2&vd_source=98f63ab9b6403852e34607326fdf6819', '--yes-playlist', '-vvv']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.10.4 (CPython) - Linux-5.15.32-xanmod1-x86_64-with-glibc2.34
[debug] exe versions: ffmpeg 5.0-3, ffprobe 5.0-3, rtmpdump 2.4
[debug] Proxy map: {}
[BiliBili] 1Jh411d7kd: Downloading webpage
[BiliBili] 1Jh411d7kd: Downloading video info page
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'http://cn-gdfs-dx-bcache-08.bilivideo.com/upgcxcode/88/63/223516388/223516388_nb2-1-80.flv?e=ig8euxZM2rNcNbNz7zUVhoMghzuBhwdEto8g5X10ugNcXBlqNxHxNEVE5XREto8KqJZHUa6m5J0SqE85tZvEuENvNC8xNEVE9EKE9IMvXBvE2ENvNCImNEVEK9GVqJIwqa80WXIekXRE9IMvXBvEuENvNCImNEVEua6m2jIxux0CkF6s2JZv5x0DQJZY2F8SkXKE9IB5QK==&deadline=1656144609&gen=playurl&nbs=1&oi=2005647454&os=bcache&platform=pc&trid=0000f2bcb24363fd404e91c66c64134adeb1&uipk=5&upsig=142525cc1a03a25f95bff0dcc2d30e31&uparams=e,deadline,gen,nbs,oi,os,platform,trid,uipk&mid=0'
[download] Resuming download at byte 937504
[download] Destination: [碧蓝档案Blue Archive] BGM Collections（收录83首）-1Jh411d7kd.flv
[download] 100% of 12.49MiB in 00:53

Description

I want to download a multi-part video from bilibili, but it only downloads the first part.

dirkf commented 2 years ago

What did you expect to get apart from the 12.5MB flv?
"Still ..." since when? Was it previously reported (maybe #17376)?

dirkf commented 2 years ago

There are 26 open issues for Bilibili. Even if a few are junk or incorrect matches, the extractor should be unified with the yt-dlp version with the aim of clearing up these issues.

lockmatrix commented 2 years ago

Hi, I'm trying to fix Bilibili in yt-dlp: https://github.com/lockmatrix/yt-dlp/commits/fix_bilibili2022

Maybe it can help you.

However, I'm not familiar with the framework. For example, I don't know how to download a multi-part video and merge the parts in yt-dlp.
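
For illustration only: rather than merging, a multi-part video can be described as a playlist-type result with one URL entry per part, so that each `?p=N` page is extracted separately. The sketch below builds such a dictionary in plain Python. The helper name `build_anthology_result`, the title and the part count of 3 are assumptions made for the example, and this is not necessarily how the yt-dlp or youtube-dl extractor does it.

```python
def build_anthology_result(video_id, title, part_count):
    """Return a playlist-style result with one URL entry per part.

    Hypothetical helper: a real extractor would read the part count
    from the page metadata instead of taking it as an argument.
    """
    base = 'https://www.bilibili.com/video/%s' % video_id
    entries = [
        # Each entry defers to the single-video extractor for one part.
        {'_type': 'url', 'url': '%s?p=%d' % (base, n), 'ie_key': 'BiliBili'}
        for n in range(1, part_count + 1)
    ]
    return {
        '_type': 'playlist',
        'id': video_id,
        'title': title,
        'entries': entries,
    }


# part_count=3 is an invented value for the example.
result = build_anthology_result('BV1Jh411d7kd', 'example title', 3)
for entry in result['entries']:
    print(entry['url'])
```

With this shape, the parts end up as separate files; merging them into one file would be a separate step.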

yexing commented 1 year ago

Add this code snippet to bilibili.py after line 140:

        # fixed multi-part video
        if '?p=' in url:
            cid = self._search_regex(
                r'https.*(\d{9,}).*m4s', webpage, 'cid',
                default=None)
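
A side note on the `'?p=' in url` test: it only matches when `p` is the first query parameter. A minimal sketch of a more tolerant check, using the Python 3 standard library, could look like the following. The function name `part_number` is hypothetical, and code inside youtube-dl would need its own compat helpers to keep Python 2 support.

```python
from urllib.parse import parse_qs, urlparse


def part_number(url):
    """Return the value of the `p` query parameter as an int, or None."""
    values = parse_qs(urlparse(url).query).get('p')
    if not values:
        return None
    try:
        return int(values[0])
    except ValueError:
        # A non-numeric value is treated as "no part requested".
        return None


print(part_number('https://www.bilibili.com/video/BV1Jh411d7kd?p=2&vd_source=abc'))
print(part_number('https://www.bilibili.com/video/BV1Jh411d7kd?vd_source=abc&p=3'))
print(part_number('https://www.bilibili.com/video/BV1Jh411d7kd'))
```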

dirkf commented 1 year ago

Please show how that affects extraction, e.g. with -v -F test_url before and after.

Are both of the . meant to be wildcard matches, or is the second meant to match just a literal . (i.e. should it be \.)?
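
To make the wildcard question concrete, here is a small sketch of how the greedy `.*` in that pattern behaves. The sample URL and its 10-digit cid are invented for the demonstration, and the second pattern is only one possible tightening, not a fix taken from this thread.

```python
import re

# Hypothetical media URL shaped like the ones on the page; the cid is invented.
sample = 'https://example.invalid/upgcxcode/90/78/1234567890/1234567890-1-30016.m4s'

# Pattern from the snippet above: the first `.*` is greedy, so it swallows as
# much as it can and leaves only the minimum of nine digits for the group.
greedy = re.search(r'https.*(\d{9,}).*m4s', sample).group(1)

# Variant with lazy wildcards and an escaped dot before the extension.
lazy = re.search(r'https.*?(\d{9,}).*?\.m4s', sample).group(1)

print(greedy)  # 234567890
print(lazy)    # 1234567890
```

With a cid longer than nine digits, the greedy pattern drops the leading digits. This may be why the later snippet in this thread allows an optional leading `1` in front of the cid, though that is a guess.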

xyzkljl1 commented 1 year ago

Thanks. That works fine.

To use the multi-page info in the output template, I added the following code at line 139 (based on 17d295a1ec6d04362740dd8a0c583690f5ba082a):

            if '?p=' in url:
                cid = self._search_regex(
                    r'https.*(\d{9,}).*m4s', webpage, 'cid',
                    default=None)
                sub_title = self._search_regex(
                    r'"cid":1{0,1}%s.*?"part":"(.*?[^\\])"' % cid, webpage, 'sub_title',
                    default=None)
                sub_index = self._search_regex(
                    r'"cid":1{0,1}%s.*?"page":(\d+)' % cid, webpage, 'sub_index',
                    default=None)

I also added the following at line 129:

        sub_title = None
        sub_index = None

and the following at line 224:

            'sub_title': sub_title,
            'sub_index': sub_index,
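
As a rough sketch of the same idea without regex lookups: if the per-part metadata is available as parsed JSON, the part title and index can be read by matching on `cid`, which also handles escaped quotes in titles. The sample data, the helper name `find_part` and the template string are assumptions made for this example. Only the key names `cid`, `page` and `part` come from the patterns above.

```python
import json

# Invented sample of per-part metadata; only the key names are taken
# from the regexes in the snippet above.
SAMPLE_PAGES_JSON = r'''
[
  {"cid": 100000001, "page": 1, "part": "Intro"},
  {"cid": 100000002, "page": 2, "part": "Track \"B\""}
]
'''


def find_part(pages, cid):
    """Return sub_title/sub_index for the part whose cid matches."""
    for page in pages:
        if str(page.get('cid')) == str(cid):
            return {'sub_title': page.get('part'), 'sub_index': page.get('page')}
    return {'sub_title': None, 'sub_index': None}


pages = json.loads(SAMPLE_PAGES_JSON)

# Minimal stand-in for the info dict returned by an extractor.
info = {'title': 'Example', 'id': 'BV1Jh411d7kd', 'ext': 'mp4'}
info.update(find_part(pages, '100000002'))

# Output templates use Python-style %(name)s fields, so the extra keys
# can be referenced in the same way as the built-in ones.
filename = '%(title)s-p%(sub_index)s-%(sub_title)s.%(ext)s' % info
print(filename)
```

On the command line, the extra fields would then be usable in a template such as `-o '%(title)s-p%(sub_index)s-%(sub_title)s.%(ext)s'`, provided the extractor puts them into the info dict as in the patch above.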

dirkf commented 7 months ago

WIP yt-dl extractor:

$ python -m youtube_dl -vF 'https://www.bilibili.com/video/BV1Jh411d7kd?p=2&vd_source=98f63ab9b6403852e34607326fdf6819'
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-vF', u'https://www.bilibili.com/video/BV1Jh411d7kd?p=2&vd_source=98f63ab9b6403852e34607326fdf6819']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: 4416f82c8
[debug] Python 2.7.15 (CPython i686 32bit) - Linux-6.1.0-18-686-pae-i686-with-debian-12.5 - OpenSSL 1.1.1a  20 Nov 2018 - glibc 2.1.3
[debug] exe versions: ffmpeg 5.1.4-0, ffprobe 5.1.4-0
[debug] Proxy map: {}
[BiliBili] 1Jh411d7kd: Downloading webpage
[BiliBili] BV1Jh411d7kd: Extracting videos in anthology
[BiliBili] 201680039: Extracting chapters
[BiliBili] Format(s) 1080P 高清, 720P 高清 are missing: you have to login or become premium member to download them. Use --cookies to authenticate.
[info] Available formats for BV1Jh411d7kd_p2:
format code  extension  resolution note
30216        m4a        audio only   67k , mp4a.40.2
30232        m4a        audio only  129k , mp4a.40.2
30280        m4a        audio only  129k , mp4a.40.2
100022       mp4        640x360      76k , av01.0.01m.08.0.110.01.01.01.0, 30.019fps, video only
30011        mp4        640x360      92k , hev1.1.6.l120.90, 30.303fps, video only
30016        mp4        640x360     157k , avc1.64001e, 29.412fps, video only
100023       mp4        852x480      89k , av01.0.04m.08.0.110.01.01.01.0, 30.019fps, video only
30033        mp4        852x480     126k , hev1.1.6.l120.90, 30.303fps, video only
30032        mp4        852x480     157k , avc1.64001f, 29.412fps, video only (best)

Compare yt-dlp updated with the latest extractor code:

```console
$ yt-dlp -vF 'https://www.bilibili.com/video/BV1Jh411d7kd?p=2&vd_source=98f63ab9b6403852e34607326fdf6819'
[debug] Command-line config: ['-vF', 'https://www.bilibili.com/video/BV1Jh411d7kd?p=2&vd_source=98f63ab9b6403852e34607326fdf6819']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.06.22 [812cdfa06] (source)
[debug] Lazy loading extractors is disabled
[debug] Git HEAD: de4cf77ec
[debug] Python 3.11.2 (CPython i686 32bit) - Linux-6.1.0-18-686-pae-i686-with-glibc2.36 (OpenSSL 3.0.11 19 Sep 2023, glibc 2.36)
[debug] exe versions: ffmpeg 5.1.4-0 (setts), ffprobe 5.1.4-0
[debug] Optional libraries: certifi-2022.09.24, sqlite3-2.6.0
[debug] Proxy map: {}
[debug] Loaded 1850 extractors
[BiliBili] Extracting URL: https://www.bilibili.com/video/BV1Jh411d7kd?p=2&vd_source=98f63ab9b6403852e34607326fdf6819
[BiliBili] 1Jh411d7kd: Downloading webpage
[BiliBili] BV1Jh411d7kd: Extracting videos in anthology
[BiliBili] 201680039: Extracting chapters
[BiliBili] Format(s) 1080P 高清, 720P 高清 are missing; you have to login or become premium member to download them. Use --cookies-from-browser or --cookies for the authentication. See https://github.com/yt-dlp/yt-dlp/wiki/FAQ#how-do-i-pass-cookies-to-yt-dlp for how to manually pass cookies
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[info] Available formats for BV1Jh411d7kd_p2:
ID     EXT RESOLUTION FPS │ FILESIZE  TBR PROTO │ VCODEC         VBR ACODEC      ABR
────────────────────────────────────────────────────────────────────────────────────
30216  m4a audio only     │ ≈1.12MiB  67k https │ audio only         mp4a.40.2   67k
30232  m4a audio only     │ ≈2.17MiB 130k https │ audio only         mp4a.40.2  130k
30280  m4a audio only     │ ≈2.17MiB 130k https │ audio only         mp4a.40.2  130k
30016  mp4 640x360     29 │ ≈2.63MiB 157k https │ avc1.64001E   157k video only
100022 mp4 640x360     30 │ ≈1.27MiB  76k https │ av01.0.01M.08  76k video only
30011  mp4 640x360     30 │ ≈1.54MiB  92k https │ hev1.1.6.L120  92k video only
30032  mp4 852x480     29 │ ≈2.62MiB 157k https │ avc1.64001F   157k video only
100023 mp4 852x480     30 │ ≈1.50MiB  90k https │ av01.0.04M.08  90k video only
30033  mp4 852x480     30 │ ≈2.12MiB 127k https │ hev1.1.6.L120 127k video only
$
```
