ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

TikTok.com - New URL #23264

Open adrgru opened 4 years ago

adrgru commented 4 years ago

Checklist

Verbose log

for profiles, playlists and songs:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.tiktok.com/@dannero', '-v']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2019.11.28
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.18362
[debug] exe versions: ffmpeg git-2019-11-09-bb190de, ffprobe git-2019-11-09-bb190de
[debug] Proxy map: {}
[generic] @dannero: Requesting header
WARNING: Falling back on generic information extractor.
[generic] @dannero: Downloading webpage
[generic] @dannero: Extracting information
ERROR: Unsupported URL: https://www.tiktok.com/@dannero
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwy0zjfmc\build\youtube_dl\YoutubeDL.py", line 796, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwy0zjfmc\build\youtube_dl\extractor\common.py", line 530, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwy0zjfmc\build\youtube_dl\extractor\generic.py", line 3347, in _real_extract
youtube_dl.utils.UnsupportedError: Unsupported URL: https://www.tiktok.com/@dannero

for videos:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.tiktok.com/@markoterzo/video/6762207861918420230', '-v']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2019.11.28
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.18362
[debug] exe versions: ffmpeg git-2019-11-09-bb190de, ffprobe git-2019-11-09-bb190de
[debug] Proxy map: {}
[generic] 6762207861918420230: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 6762207861918420230: Downloading webpage
[generic] 6762207861918420230: Extracting information
[download] Downloading playlist: M A R K O  on TikTok
[generic] playlist M A R K O  on TikTok: Collected 1 video ids (downloading 1 of them)
[download] Downloading video 1 of 1
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://v19.muscdn.com/99bdb82ff18a6720f09602b8de495194/5de1edf9/video/tos/maliva/tos-maliva-v-0068/af15acd0acb44d8bbf38744b1857631d/?a=1233&br=1547&cr=0&cs=0&dr=0&ds=3&er=&l=2019112922200201011511513317E33486&lr=tiktok_m&qs=0&rc=ank1NDVyPGY2cTMzZDczM0ApNmlkO2hoOTtnNzg4ZTU6aWdoaWwzZG0tLWBfLS1eMTZzczYwLzMvNGNhLV4tMTFfMWE6Yw=='
[download] Destination: M A R K O  on TikTok-6762207861918420230.unknown_video
[download] 100% of 1.36MiB in 00:02
[download] Finished downloading playlist: M A R K O  on TikTok

Description

There is a new format of TikTok URLs that was recently introduced. None of them work in youtube-dl.

The "Discover" page on TikTok.com (https://www.tiktok.com/discover) features the following - and none of them work:

skyme5 commented 4 years ago

There is already a PR #22838 that fixes this issue.

bpenven commented 3 years ago

There is already a PR #22838 that fixes this issue.

No it does not.

bpenven commented 3 years ago

@yan12125 this one is closed...

yan12125 commented 3 years ago

this one is closed...

Hmm not sure what you mean

bpenven commented 3 years ago

@yan12125 I meant this one: #26094 sorry for the misunderstanding.

notpushkin commented 3 years ago

For anybody wondering: pipx install https://github.com/runraid/youtube-dl/archive/tiktokwatermarkless.zip --force to install a version by @runraid (#25895) which can download TikToks without a watermark!

(Inspect changes here. If you don't use pipx yet, give it a try! Plain pip should work fine too.)

dtaust commented 3 years ago

@bpenven #22838 actually does fix this, I've built 2020.07.28 with the changes to extractors.py and tiktok.py patched in.

@skyme5 I'm not sure if it's just my setup but in TikTokBaseIE/tiktok.py this doesn't work, the url variable ends up empty:

    formats = []
    formats.append({
        'url': try_get(video_info, lambda x: x['video']['urls'][0], str),
        'ext': 'mp4',
        'height': height,
        'width': width
    })

I've solved that by doing this:

    formats = []
    video_urls = try_get(video_info, lambda x: x['video']['urls'], list)
    video_url = video_urls[0]
    formats.append({
        'url': video_url,
        'ext': 'mp4',
        'height': height,
        'width': width
    })

I don't really code in python so I'm not really sure why your code wouldn't work to begin with, it seems perfectly valid to me.

If referencing the 0th element of a list like that doesn't work you've done the same thing here as well:

'thumbnail': try_get(video_info, lambda x: x['covers'][0], str),

but you probably just end up with no thumbnail whereas url being empty causes the below:

WARNING: "url" field is missing or empty - skipping format, there is an error in extractor
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/bin/youtube-dl/__main__.py", line 19, in <module>
  File "/usr/local/bin/youtube-dl/youtube_dl/__init__.py", line 474, in main
  File "/usr/local/bin/youtube-dl/youtube_dl/__init__.py", line 464, in _real_main
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 2019, in download
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 808, in extract_info
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 863, in process_ie_result
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 1589, in process_video_result

skyme5 commented 3 years ago

@dtaust, Extractor in #22838 doesn't work on Python 2 (because of lambda expressions). Try Python 3.

dtaust commented 3 years ago

Doesn't that mean that pull won't be merged because the youtube-dl project supports python 2.6/2.7?

From the project page:

It requires the Python interpreter (2.6, 2.7, or 3.2+), and it is not platform specific.

[debug] Python version 2.7.16 (CPython)

This is the version on my test machine, and with the small changes I made the lambda expressions work fine on 2.7.

skyme5 commented 3 years ago

I think this is caused by the expected_type check in try_get behaving differently on Python 2 and 3: the extracted value has the type unicode on Python 2 but str on Python 3.

Removing the expected_type comparison resolves the problem.

formats.append({
    'url': try_get(video_info, lambda x: x['video']['urls'][0]),
...

Fixed in #22838
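
The type mismatch described above can be reproduced with a simplified stand-in for try_get. This is only a sketch: the helper below is not youtube-dl's exact implementation, `text_type` is a local name introduced here (youtube-dl itself ships a `compat_str` alias for the same purpose), and `video_info` is made-up data shaped like the snippet in this thread. On Python 2 the JSON strings are `unicode`, so passing `str` as expected_type makes the check fail and the helper return None.

```python
# Simplified stand-in for youtube-dl's try_get helper (not the library's exact code).
try:
    text_type = unicode  # Python 2: decoded JSON strings are `unicode`
except NameError:
    text_type = str  # Python 3: `str` is already the text type


def try_get(src, getter, expected_type=None):
    """Return getter(src), or None if the lookup fails or the type does not match."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


# Hypothetical metadata, shaped like the structure used in the snippets above.
video_info = {'video': {'urls': ['https://example.invalid/video.mp4']}}

# Matching text type: the URL is returned on both Python 2 and Python 3.
url = try_get(video_info, lambda x: x['video']['urls'][0], text_type)
# Wrong expected type: the value exists but is filtered out, as with `str` on Python 2.
mismatch = try_get(video_info, lambda x: x['video']['urls'][0], bytes)
# Failed lookup: the IndexError is swallowed and None is returned.
missing = try_get(video_info, lambda x: x['video']['urls'][1], text_type)
```

Dropping expected_type, as in the fix above, avoids the mismatch; passing a version-independent text type is another option that keeps the type check.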

Nolaan commented 3 years ago

Looks like Tiktok updated their strategies again. By inspecting the requests, we can see that a sort of verification/captcha is going on.

liamengland1 commented 3 years ago

Looks like Tiktok updated their strategies again.

That has nothing to do with downloading the video.

lihuelworks commented 3 years ago

Still have issues with this (using ver 2020.07.28). Any other info I can add to help?

youtube-dl 'https://www.tiktok.com/@rogeliobucktronar/video/6820475866087197958'
[generic] 6820475866087197958: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 6820475866087197958: Downloading webpage
[generic] 6820475866087197958: Extracting information
ERROR: Unsupported URL: https://www.tiktok.com/@rogeliobucktronar/video/6820475866087197958

skyme5 commented 3 years ago

@lihuelworks, PR #22838 fixes this issue, but it is still not merged. You can install the PR as described here, or you can replace youtube_dl/extractor/tiktok.py in your installation folder with this.

Nolaan commented 3 years ago

Looks like Tiktok updated their strategies again.

That has nothing to do with downloading the video.

I'm not sure I'm getting you, @llacb47. The download fails because of the captcha, but that has nothing to do with downloading the video? What am I getting wrong?

Nolaan commented 3 years ago

@skyme5 have you tried @lihuelworks's URL before posting? The very reason for my first post is that I was under the impression that it doesn't work either. I just tried and it still doesn't work. Am I doing something wrong?

Best regards

liamengland1 commented 3 years ago

Looks like Tiktok updated their strategies again.

That has nothing to do with downloading the video.

I'm not sure I'm getting you, @llacb47. The download fails because of the captcha, but that has nothing to do with downloading the video? What am I getting wrong?

Look at the source code, the video URL is literally there. There is no captcha bypass needed to get the video.

Nolaan commented 3 years ago

My bad (or not): I had to check out the tiktok branch to make it work and didn't know it existed. The snippet should be:

git clone https://github.com/skyme5/youtube-dl.git
cd youtube-dl/youtube_dl
git checkout origin/tiktok
python __main__.py url

MohamedElashri commented 3 years ago

My bad (or not): I had to check out the tiktok branch to make it work and didn't know it existed. The snippet should be:

git clone https://github.com/skyme5/youtube-dl.git
cd youtube-dl/youtube_dl
git checkout origin/tiktok
python __main__.py url

Can it be used to download all of a user's videos?

Nolaan commented 3 years ago

@MohamedElashri yes it worked afterwards

lihuelworks commented 3 years ago

It works! Hope it gets implemented soon, I'm really missing that functionality on mpv.

notpushkin commented 3 years ago

pipx install https://github.com/skyme5/youtube-dl/archive/tiktok.zip --force to install the version from #22838. (Inspect changes here. If you don't use pipx yet, give it a try! Plain pip should work fine too.)

fluffy0223024c commented 3 years ago

For anybody wondering: pipx install https://github.com/runraid/youtube-dl/archive/tiktokwatermarkless.zip --force to install a version by @runraid (#25895) which can download TikToks without a watermark!

(Inspect changes here. If you don't use pipx yet, give it a try! Plain pip should work fine too.)

I got the error Package cannot be a url

AmperAndSand commented 3 years ago

For anybody wondering: pipx install https://github.com/runraid/youtube-dl/archive/tiktokwatermarkless.zip --force to install a version by @runraid (#25895) which can download TikToks without a watermark! (Inspect changes here. If you don't use pipx yet, give it a try! Plain pip should work fine too.)

I got the error Package cannot be a url

Try using pip instead of pipx.

fluffy0223024c commented 3 years ago

For anybody wondering: pipx install https://github.com/runraid/youtube-dl/archive/tiktokwatermarkless.zip --force to install a version by @runraid (#25895) which can download TikToks without a watermark! (Inspect changes here. If you don't use pipx yet, give it a try! Plain pip should work fine too.)

I got the error Package cannot be a url

Try using pip instead of pipx.

The command worked with pip3 (python3-pip). Thank you!

notpushkin commented 3 years ago

The only problem with installing these versions is that you wouldn't get updates from main youtube-dl branch (right now YouTube support is broken in both I think) until @skyme5 @runraid rebase their branches. If you only need TikTok support, you'll be fine though.

Al-Muhandis commented 3 years ago

The only problem with installing these versions is that you wouldn't get updates from main youtube-dl branch (right now YouTube support is broken in both I think) until @skyme5 @runraid rebase their branches. If you only need TikTok support, you'll be fine though.

But the watermarkless fork by @runraid does not work now :(

notpushkin commented 3 years ago

@Al-Muhandis It did work for older videos (before July 2020 I think) last time I checked 🤔

Al-Muhandis commented 3 years ago

@Al-Muhandis It did work for older videos (before July 2020 I think) last time I checked 🤔

Yes, I noticed. It doesn't work for new videos :(

dtaust commented 3 years ago

@notpushkin the solution for that is to clone the current ytdl-org/youtube-dl repo and replace youtube_dl/extractor/tiktok.py with skyme5's one and then make/install.

notpushkin commented 3 years ago

@dtaust Makes sense, but I'm too lazy personally :') (which is the reason I came up with the one-liners above)

VasiliPupkin256 commented 3 years ago

@blackjack4494 why can't it be merged here?

A new youtube-dl version was released just recently, and TikTok support is still not fixed in it.

skyme5 commented 3 years ago

@VasiliPupkin256 that PR has been open for nearly 328 days and has still not been merged by @dstftw.

youtube-dl with tiktok download support is available at blackjack4494/youtube-dlc

sjmueller commented 3 years ago

Really confused about why tiktok remains broken for so long. Can Sergey @dstftw @remitamine or any of the other maintainers give us some visibility into what's happening and why? Let's make it easy, check one of the following:

swan911 commented 3 years ago

Hi! I opened the link https://v16-web-newkey.tiktokcdn.com/blabla in a browser and got a 403 error. After I added the header "Referer: https://www.tiktok.com/blabla" to my curl download, I could download the TikTok video. We need to add the Referer header when downloading the media link.
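
The same idea can be sketched in Python with the standard library. This is a minimal illustration rather than youtube-dl's own downloader: `build_request` and `download` are names made up for this example, and the URLs passed to them must be the real media link and the page it was found on.

```python
import shutil
import urllib.request


def build_request(media_url, page_url):
    """Build a GET request for the media file that sends the page URL as Referer."""
    return urllib.request.Request(media_url, headers={'Referer': page_url})


def download(media_url, page_url, dest):
    """Stream the response body to `dest`.

    urlopen raises urllib.error.HTTPError if the server still answers 403.
    """
    with urllib.request.urlopen(build_request(media_url, page_url)) as resp, \
            open(dest, 'wb') as out:
        shutil.copyfileobj(resp, out)
```

Calling `download(media_url, page_url, 'video.mp4')` would then fetch the file with the Referer header set; whether the CDN accepts the request can still depend on other factors, such as cookies or link expiry.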

VasiliPupkin256 commented 3 years ago

@sjmueller:

thebigdalt commented 3 years ago

For others as a workaround, here are curl and wget examples:

curl --output "video.mp4" --referer "https://www.tiktok.com/@user/video/123456789" "https://v16-web-newkey.tiktokcdn.com/......"
wget --output-document="video.mp4" --referer="https://www.tiktok.com/@user/video/123456789" "https://v16-web-newkey.tiktokcdn.com/....."

jinthoa commented 3 years ago

For others as a workaround, here are curl and wget examples:

curl --output "video.mp4" --referer "https://www.tiktok.com/@user/video/123456789" "https://v16-web-newkey.tiktokcdn.com/......"
wget --output-document="video.mp4" --referer="https://www.tiktok.com/@user/video/123456789" "https://v16-web-newkey.tiktokcdn.com/....."

Where does "https://v16-web-newkey.tiktokcdn.com/....." come from ?

thebigdalt commented 3 years ago

Where does "https://v16-web-newkey.tiktokcdn.com/....." come from ?

With the TikTok video open in the browser, right-click it and inspect the element. Look for a video element with a link similar to the above. That link is the actual video being requested, which is served by TikTok's CDN. However, the CDN won't give you access to the video unless the request carries the original page link as its Referer, hence the above workaround.

sjmueller commented 3 years ago

If installing via sudo pip3 install youtube-dl, is there any way to patch the files afterwards? Seems like most of the source code goes to /usr/local/lib/python3.8/dist-packages/youtube_dl while the executable resides in /usr/local/bin/youtube-dl.

But if extractors.py and tiktok.py are updated, does the original /usr/local/bin/youtube-dl reference those changes? Wouldn't the executable need to be rebuilt?

skyme5 commented 3 years ago

@sjmueller /usr/local/bin/youtube-dl is just a script file pointing to the main installation of youtube-dl, so nothing needs to be rebuilt. Yes, you can directly edit extractors.py and overwrite tiktok.py, but these changes are lost whenever you update your installation of youtube-dl.
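
To see which file a pip-installed package would actually load, something like the following sketch can help. `find_module_file` is a helper invented for this example; it only asks Python's import machinery where the package lives and changes nothing.

```python
import importlib.util
import os


def find_module_file(package, relative_path):
    """Return the on-disk path of `relative_path` inside an installed package.

    Returns None if the package is not importable or the file is not there.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for root in spec.submodule_search_locations:
        candidate = os.path.join(root, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None
```

For example, `find_module_file('youtube_dl', os.path.join('extractor', 'tiktok.py'))` should point at the copy that the launcher script imports, assuming youtube-dl is installed for the interpreter running the snippet.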

selfisekai commented 3 years ago

Issue fixed as well in haruhi-dl, a libre fork of youtube-dl.

skyme5 commented 3 years ago

@selfisekai @sjmueller I think tiktok.com is using CAPTCHA to prevent scraping.

selfisekai commented 3 years ago

@skyme5 they do use some 'verify' tokens for API requests, and user profiles are unavailable on hdl for this reason, but the JSON video metadata are contained inside the HTML document, probably this thing: https://nextjs.org/docs/api-reference/data-fetching/getInitialProps
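
As a rough illustration of reading metadata that is embedded in the HTML, here is a sketch that assumes the page follows the usual Next.js convention of a script tag with the id `__NEXT_DATA__` holding JSON. The sample HTML and the `videoUrl` key are invented for the example; the real layout of TikTok's data is not shown in this thread and may differ.

```python
import json
import re

# Next.js pages normally embed their initial props in this script tag.
NEXT_DATA_RE = re.compile(
    r'<script[^>]+id="__NEXT_DATA__"[^>]*>(?P<json>.+?)</script>', re.DOTALL)


def extract_next_data(html):
    """Return the parsed __NEXT_DATA__ JSON, or None if the tag is absent."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group('json'))


# Made-up page; the key path below is hypothetical.
sample_html = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"videoUrl": "https://example.invalid/v.mp4"}}}'
    '</script></body></html>'
)
data = extract_next_data(sample_html)
video_url = data['props']['pageProps']['videoUrl']
```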

Also, for some reason, using the new URL scheme (https://www.tiktok.com/@puczirajot/video/6878766755280440578) didn't work, but https://www.tiktok.com/share/video/6878766755280440578 sets 3 cookies and does a redirect to the same page (+ ?source=h5_t, but adding it to the URL doesn't help). I guess these cookies are the reason for this.
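
For reference, both URL shapes mentioned here can be recognised with one pattern. This is a hypothetical regex written for this example, not the pattern used by youtube-dl or haruhi-dl, and it deliberately ignores profile URLs.

```python
import re

# Hypothetical pattern covering /@user/video/<id> and /share/video/<id>.
VIDEO_URL_RE = re.compile(
    r'https?://(?:www\.)?tiktok\.com/(?:@[\w.-]+|share)/video/(?P<id>\d+)')


def video_id(url):
    """Return the numeric video ID from a TikTok video URL, or None."""
    match = VIDEO_URL_RE.match(url)
    return match.group('id') if match else None
```

With this sketch, both `https://www.tiktok.com/@puczirajot/video/6878766755280440578` and `https://www.tiktok.com/share/video/6878766755280440578` yield the same ID, while a profile URL such as `https://www.tiktok.com/@dannero` yields None.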

My logs for downloading a video:

laura@iino ~/haruhi-dl (master) $ http_proxy=http://127.0.0.1:8080 python3 -m haruhi_dl --no-check-certificate "https://www.tiktok.com/@markoterzo/video/6762207861918420230" -v
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--no-check-certificate', 'https://www.tiktok.com/@markoterzo/video/6762207861918420230', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] haruhi-dl version 2020.11.16
[debug] Git HEAD: c55393ce4
[debug] Python version 3.9.0 (CPython) - Linux-5.8.17_1-x86_64-with-glibc2.30
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1
[debug] Proxy map: {'http': 'http://127.0.0.1:8080', 'https': 'http://127.0.0.1:8080'}
[tiktok] 6762207861918420230: Downloading webpage
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://v16-web.tiktok.com/video/tos/useast2a/tos-useast2a-ve-0068/449076372907418aa65c8235093bd867/?a=1988&br=3094&bt=1547&cr=0&cs=0&cv=1&dr=0&ds=3&er=&expire=1605642934&l=2020111713552701019021820147016369&lr=tiktok_m&mime_type=video_mp4&policy=2&qs=0&rc=ank1NDVyPGY2cTMzZDczM0ApNmlkO2hoOTtnNzg4ZTU6aWdoaWwzZG0tLWBfLS1eMTZzczYwLzMvNGNhLV4tMTFfMWE6Yw%3D%3D&signature=132b3ae5b648b1017bf3786a539199a2&tk=tt_webid_v2&vl=&vr='
[download] Destination: M A R K O -6762207861918420230.mp4
[download] 100% of 1.36MiB in 00:00

haruhi-dl tiktok extractor source code: https://git.sakamoto.pl/laudom/haruhi-dl/-/blob/master/haruhi_dl/extractor/tiktok.py

makew0rld commented 3 years ago

This new URL scheme is supported as of version 2020.11.29, see fb626c05867deab04425bad0c0b16b55473841a2. I was able to successfully download a video using a URL like https://www.tiktok.com/@user/video/123456789.

I think this issue should be closed in favour of opening specific issues, such as ones about the 403 error code or about supporting other URLs like profiles. This new TikTok URL scheme is supported in general.