drawrowfly / tiktok-scraper

TikTok Scraper. Download video posts, collect user/trend/hashtag/music feed metadata, sign URL and etc.
4.33k stars 795 forks

[Feature Request] Live Stream download ability #258

Open sideloading opened 4 years ago

sideloading commented 4 years ago

Is your feature request related to a problem? Please describe. No.

Describe the solution you'd like: It would be great to have the ability to download ongoing live streams from TikTok, for example by adding a new parameter that checks whether a user is currently live and then downloads the stream.

Information: When a user is live, there is an m3u8 link inside the JSON.

The following endpoints have the room_data key if a user is live:

/aweme/v1/user/profile/other/?parameter=xyz&user_id=2467441
/aweme/v1/aweme/post/?parameter=xyz&user_id=2467441

The "room_data" value is itself a JSON-encoded string. Inside it there is a .m3u8 link (e.g. http://pull-hls-l1.tiktokcdn.com/stage/stream-2989306508973178895/playlist.m3u8) which can then be processed with ffmpeg/youtube-dl for downloading.

Example JSON response of a user that is currently live (truncated, via post endpoint):

{
    "status_code": 0,
    "user": {
        "twitter_name": "",
        "district": "",
        "commerce_user_info": {
            "has_ads_entry": false,
            "ad_revenue_rits": null
        },
        "location": "",
        "room_id": 6848192619973970693,
        "room_data": "{\"id\":6848176096593382149,\"owner_user_id\":24674412,\"os_type\":1,\"client_version\":160602,\"stream_url\":{\"live_core_sdk_data\":{\"pull_data\":{\"stream_data\":\"{\\\"common\\\":{\\\"session_id\\\":\\\"202007111134450101902092012B530543\\\",\\\"query\\\":{\\\"infos\\\":\\\"redacted\\\"}},\\\"data\\\":{\\\"origin\\\":{\\\"main\\\":{\\\"flv\\\":\\\"http://pull-flv-l1.tiktokcdn.com/stage/stream-2989306508973178895.flv\\\",\\\"hls\\\":\\\"http://pull-hls-l1.tiktokcdn.com/stage/stream-2989306508973178895/playlist.m3u8\\\",\\\"cmaf\\\":\\\"\\\",\\\"dash\\\":\\\"\\\",\\\"sdk_params\\\":\\\"{\\\\\\\"Query\\\\\\\":[\\\\\\\"isp\\\\\\\",\\\\\\\"peer_isp\\\\\\\",\\\\\\\"country\\\\\\\",\\\\\\\"peer_country\\\\\\\",\\\\\\\"province\\\\\\\",\\\\\\\"peer_province\\\\\\\",\\\\\\\"device_id\\\\\\\",\\\\\\\"peer_device_id\\\\\\\",\\\\\\\"device_type\\\\\\\",\\\\\\\"peer_device_type\\\\\\\",\\\\\\\"access_code\\\\\\\",\\\\\\\"peer_access_code\\\\\\\",\\\\\\\"os_version\\\\\\\",\\\\\\\"peer_os_version\\\\\\\",\\\\\\\"default_bitrate\\\\\\\",\\\\\\\"is_transmission_optimize=1\\\\\\\"],\\\\\\\"VCodec\\\\\\\":\\\\\\\"h265\\\\\\\",\\\\\\\"gop\\\\\\\":4,\\\\\\\"resolution\\\\\\\":\\\\\\\"360x640\\\\\\\",\\\\\\\"vbitrate\\\\\\\":1000000}\\\"}}}}\",\"options\":{\"default_quality\":{\"name\":\"Original\",\"sdk_key\":\"origin\"}}}}}}",
        "avatar_thumb": {
            "uri": "musically-maliva-obj/04138d271dd6730b472a8ca84121cb9d",
            "url_list": ["https://p16-va-tiktok.ibyteimg.com/img/musically-maliva-obj/04138d271dd6730b472a8ca84121cb9d~c5_100x100.webp"]
        },

Same response when user is not live (truncated, no room_data, room_id=0):

{
    "status_code": 0,
    "user": {
        "forward_count": 0,
        "is_block": false,
        "room_id": 0,
        "district": "",
        "watch_status": false,
        "avatar_thumb": {
            "uri": "musically-maliva-obj/0a0242e5f906bf9301f859c568b5ecc4",
            "url_list": ["https://p16-va-tiktok.ibyteimg.com/img/musically-maliva-obj/0a0242e5f906bf9301f859c568b5ecc4~c5_100x100.webp"]
        }
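
To illustrate the two responses above, here is a minimal Python sketch of how the stream URLs could be pulled out of the "user" object. It assumes the nesting seen in the truncated sample (room_data is a JSON string, and stream_data inside it is a second JSON string); the function name extract_live_urls and the trimmed-down test data are made up for this example and are not part of tiktok-scraper.

```python
import json
from typing import Optional


def extract_live_urls(user: dict) -> Optional[dict]:
    """Return the HLS and FLV URLs from a "user" object, or None if the user is not live."""
    # In the samples above, "not live" shows up as room_id == 0 and no room_data key.
    if not user.get("room_id") or not user.get("room_data"):
        return None
    room = json.loads(user["room_data"])  # first layer of encoded JSON
    pull_data = room["stream_url"]["live_core_sdk_data"]["pull_data"]
    stream = json.loads(pull_data["stream_data"])  # second layer of encoded JSON
    main = stream["data"]["origin"]["main"]
    return {"hls": main.get("hls") or None, "flv": main.get("flv") or None}


# Hypothetical, trimmed-down stand-in for the live response shown above.
stream_data = json.dumps({"data": {"origin": {"main": {
    "flv": "http://pull-flv-l1.tiktokcdn.com/stage/stream-2989306508973178895.flv",
    "hls": "http://pull-hls-l1.tiktokcdn.com/stage/stream-2989306508973178895/playlist.m3u8",
}}}})
room_data = json.dumps(
    {"stream_url": {"live_core_sdk_data": {"pull_data": {"stream_data": stream_data}}}}
)
live_user = {"room_id": 6848192619973970693, "room_data": room_data}

print(extract_live_urls(live_user))
print(extract_live_urls({"room_id": 0}))
```

Parsing the nested JSON rather than running a regex over the raw string avoids having to deal with the multiple levels of backslash escaping.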

I also had a Python tool for downloading livestreams, but it's broken now, so hopefully you can find some of the code useful.

import os
import re

import requests

# ptts, logger, Constants and helpers come from the rest of the original tool (not shown here).


def download_live(target_room_id):
    # Create the per-user "broadcasts" directory if it does not exist yet.
    download_path = os.path.join(ptts.dl_path, ptts.tt_target_user, 'broadcasts')
    os.makedirs(download_path, exist_ok=True)

    logger.separator()
    logger.info("Checking for ongoing livestreams.")
    logger.separator()
    s = requests.Session()
    s.headers.update({
        'User-Agent': 'Mozilla/5.0 (X11; Linux i686; rv:60.0) Gecko/20100101 Firefox/60.0',
    })
    r = s.get(Constants.LIVE_WEB_URL.format(target_room_id))
    r.raise_for_status()
    live = r.text
    # Capture the stream ID between "stream-" and "/playlist" in the page source.
    # Non-greedy, so the match stops at the first "/playlist" instead of the last one.
    live_room_id = re.search(r'(stream-)(.*?)(?=/playlist)', live)
    if live_room_id:
        live_hls_url = Constants.LIVE_HLS_ENDP.format(live_room_id[2])
        logger.info("HLS url: {:s}".format(live_hls_url))
        logger.separator()
        logger.info("HLS url retrieved. Calling youtube-dl.")
        # Output file name: <stream id>_<epoch time> inside the broadcasts directory.
        helpers.call_ytdl(live_hls_url, os.path.join(download_path, str(live_room_id[2]) + "_" + ptts.epochtime))
    else:
        logger.info("There is no available livestream for this user.")
        logger.separator()

drawrowfly commented 4 years ago

This endpoint belongs to the Mobile API!

This library only supports the Web API!

To use the Mobile API you need valid device_id and iid values, plus a valid signature in the header that depends on the app version.

  1. device_id and iid can be extracted by capturing traffic from the TikTok mobile app
  2. or, alternatively, you can send an encoded payload to the TikTok API and get the device_id and iid values in the response

Currently I don't have plans to add Mobile API support to this repo, as it can be abused by spammers and bot lovers.

sideloading commented 4 years ago

@drawrowfly I see. I did some further investigating and found that this playlist.m3u8 is also present on the share page behind the shared URL (in the app: share livestream -> copy link).

URL https://m.tiktok.com/share/live/6848460154006932229/ -> viewing source:

<link data-react-helmet="true" rel="dns-prefetch" href="//s0.ipstatp.com"/><link data-react-helmet="true" rel="dns-prefetch" href="//s16.tiktokcdn.com"/><link data-react-helmet="true" rel="dns-prefetch" href="//v16.tiktokcdn.com"/><link data-react-helmet="true" rel="dns-prefetch" href="//p16-va.tiktokcdn.com"/><link data-react-helmet="true" rel="dns-prefetch" href="//www.google-analytics.com"/><link data-react-helmet="true" rel="dns-prefetch" href="//stats.g.doubleclick.net"/><link data-react-helmet="true" rel="shortcut icon" href="//s16.tiktokcdn.com/musical/resource/wap/static/image/logo_144c91a.png?v=2" type="image/x-icon"/><link rel="stylesheet" href="//s16.tiktokcdn.com/tiktok/falcon/static/css/43.bundle.44d316d5.css"/><script>window.__INIT_PROPS__ = {"/share/live/:id":{"$isMobile":true,"$isIOS":["(iPhone; CPU iPhone OS 13_3_1 like Mac OS X",null],"$isAndroid":false,"$origin":"https://m.tiktok.com","$pageUrl":"/share/live/6848460154006932229/","$region":"AU","$language":"en","$originalLanguage":"en","$os":"ios","$reflowType":"m","$appId":1233,"$botType":"others","$appType":"m","$downloadLink":{"amazon":{"visible":true,"normal":"https://www.amazon.com/dp/B0117U0G3M/"},"google":{"visible":true,"normal":"https://www.tiktok.com/download-link/af/com.zhiliaoapp.musically"},"apple":{"visible":true,"normal":"https://www.tiktok.com/download-link/af/id835599320"}},"$config":{"covidBanner":{"open":true,"url":"https://www.tiktok.com/safety/resources/covid-19","background":"rgba(125,136,227,1)"},"bytedanceLink":{"linkVisible":true,"overrideUrl":""}},"$baseURL":"m.tiktok.com","pageState":{"regionAppId":1233,"os":"ios","region":"AU","baseURL":"m.tiktok.com","appType":"m","fullUrl":"https://m.tiktok.com/share/live/6848460154006932229/"},"liveData":{"RoomId":"6848460154006932229","Status":"2","Title":"come talk w me real quick 
((:","LiveUrl":"http://pull-hls-l1.tiktokcdn.com/stage/stream-2989310946276540431_or4/playlist.m3u8","OwnerInfo":{"Id":"9192363","ShortId":"21452519056","UniqueId":"tom.cruz","Nickname":"julia cruz","AvatarThumb":{"Uri":"100x100/musically-maliva-obj/10d730df23e9ba953b81946fddd402b3","UrlList":["https://p16-va-tiktok.ibyteimg.com/img/musically-maliva-obj/10d730df23e9ba953b81946fddd402b3~c5_100x100.jpeg"]},"AvatarMedium":{"Uri":"720x720/musically-maliva-obj/10d730df23e9ba953b81946fddd402b3","UrlList":["https://p16-va-tiktok.ibyteimg.com/img/musically-maliva-obj/10d730df23e9ba953b81946fddd402b3~c5_720x720.jpeg"]},"AvatarLarger":{"Uri":"1080x1080/musically-maliva-obj/10d730df23e9ba953b81946fddd402b3","UrlList":["https://p16-va-tiktok.ibyteimg.com/img/musically-maliva-obj/10d730df23e9ba953b81946fddd402b3~c5_1080x1080.jpeg"]},"Signature":"BLM\nvancouver-ish \n12.1k of u guys!!🥺🥺","CreateTime":"1441970524","Verified":false,"SecUid":"MS4wLjABAAAAaugAOm_l0o2BtiMeKrXFvfOlttZw1uu6k5okK6TtXZY","Secret":false,"Ftc":false,"Relation":0,"OpenFavorite":false,"BioLink":null,"CommerceUserInfo":null},"LiveRoomStats":{"UserCount":36,"EnterCount":902,"DiggCount":0},"coverUrl":{"Uri":"musically-maliva-obj/10d730df23e9ba953b81946fddd402b3","UrlList":["https://p16-va-default.akamaized.net/obj/musically-maliva-obj/10d730df23e9ba953b81946fddd402b3"]}},"shareUser":{"secUid":"","userId":"","uniqueId":"","nickName":"","signature":"","covers":[],"coversMedium":[],"coversLarger":[],"isSecret":false,"relation":-1},"shareMeta":{"title":"julia cruz Live on TikTok","desc":"@tom.cruz 12331 Followers, 294 Following, 268400 Likes - Live on TikTok"},"statusCode":0}}</script>
        </head>

Is this something that can be checked through the Web API?
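
For reference, a rough Python sketch of reading LiveUrl out of the page source quoted above. It assumes the page still embeds a window.__INIT_PROPS__ object with a liveData.LiveUrl field as in this sample; the function name extract_live_url and the shortened test page are invented for illustration, and fetching the page (for example with requests) is left out.

```python
import json
import re
from typing import Optional

# Grabs the object literal assigned to window.__INIT_PROPS__, up to the closing </script>.
INIT_PROPS_RE = re.compile(
    r"window\.__INIT_PROPS__\s*=\s*(\{.*?\})\s*</script>", re.DOTALL
)


def extract_live_url(html: str) -> Optional[str]:
    """Return liveData.LiveUrl from a share page's source, or None if it is absent."""
    match = INIT_PROPS_RE.search(html)
    if not match:
        return None
    try:
        # strict=False tolerates raw line breaks inside strings (e.g. in a stream title).
        props = json.loads(match.group(1), strict=False)
    except json.JSONDecodeError:
        return None
    # The top-level keys are route patterns such as "/share/live/:id".
    for route_state in props.values():
        if not isinstance(route_state, dict):
            continue
        live_data = route_state.get("liveData")
        if isinstance(live_data, dict) and live_data.get("LiveUrl"):
            return live_data["LiveUrl"]
    return None


# Hypothetical, heavily shortened version of the page source shown above.
sample_html = (
    '<script>window.__INIT_PROPS__ = {"/share/live/:id":{"liveData":{'
    '"RoomId":"6848460154006932229","Title":"line one\nline two",'
    '"LiveUrl":"http://pull-hls-l1.tiktokcdn.com/stage/stream-2989310946276540431_or4/playlist.m3u8"'
    '},"statusCode":0}}</script>'
)
print(extract_live_url(sample_html))
```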

drawrowfly commented 4 years ago

Alright i will take a look at it closer

sideloading commented 4 years ago

Thanks, let me know if you find anything! I've found the easiest way to find a live account is by swiping through the 'For You' page in the app until you get someone who is live.

liamengland1 commented 4 years ago

Here's a URL that's working right now: https://t.tiktok.com/share/live/6852379743099030277/

You could use a regex to match the stream_id: 2989371956691992591

Then you can output URLs for the stream in two formats:

HLS: http://{{HLS domain}}/stage/stream-{{stream_id}}/playlist.m3u8

{{HLS domain}} could be pull-hls-f5.tiktokcdn.com, pull-hls-f1-ab.tiktokcdn.com, or pull-hls-l1.tiktokcdn.com. Only the one returned in source code will work for the particular stream.

FLV: http://pull-f5-ab.tiktokcdn.com/stage/stream-{{stream_id}}.flv

There is video in the FLV stream, but I can't see it. Apparently TikTok uses some hacky codec: https://trac.ffmpeg.org/ticket/6389

Edit: ignore all of the above ramblings lol. Easiest method is to extract m3u8 URL from source code.
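
A small Python sketch of that last suggestion: find the m3u8 URL in the page source with a regex and, as a side effect, capture the stream_id. The pattern is an assumption based only on the URLs quoted in this thread (an optional suffix such as _or4 after the numeric ID); find_m3u8 is a made-up helper name.

```python
import re
from typing import Optional, Tuple

# Matches e.g. http://pull-hls-l1.tiktokcdn.com/stage/stream-<digits>[_suffix]/playlist.m3u8
M3U8_RE = re.compile(
    r"https?://[\w.-]+/stage/stream-(\d+)(?:_\w+)?/playlist\.m3u8"
)


def find_m3u8(source: str) -> Optional[Tuple[str, str]]:
    """Return (m3u8_url, stream_id) for the first HLS playlist URL in source, or None."""
    match = M3U8_RE.search(source)
    if not match:
        return None
    return match.group(0), match.group(1)


sample = '"LiveUrl":"http://pull-hls-l1.tiktokcdn.com/stage/stream-2989310946276540431_or4/playlist.m3u8"'
print(find_m3u8(sample))
```

Taking the full URL from the source sidesteps the problem of guessing which HLS domain serves a given stream.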

sideloading commented 4 years ago

@llacb47 the HLS .m3u8 should be downloadable with youtube-dl/ffmpeg, and compiled into an mp4 when the live ends.
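
As a sketch of what that could look like from Python, assuming ffmpeg is installed and on PATH: the command below copies the HLS stream to a file without re-encoding. The helper names build_ffmpeg_command and record_stream are hypothetical.

```python
import subprocess
from typing import List


def build_ffmpeg_command(hls_url: str, output_path: str) -> List[str]:
    """Build an ffmpeg command that copies the HLS stream to a file without re-encoding."""
    return ["ffmpeg", "-i", hls_url, "-c", "copy", output_path]


def record_stream(hls_url: str, output_path: str) -> int:
    """Run ffmpeg until the stream ends and return its exit code (needs ffmpeg on PATH)."""
    return subprocess.run(build_ffmpeg_command(hls_url, output_path)).returncode


# Example call (not executed here):
# record_stream(
#     "http://pull-hls-l1.tiktokcdn.com/stage/stream-2989306508973178895/playlist.m3u8",
#     "live.mp4",
# )
```

If the recording might be interrupted, writing to an MPEG-TS file (.ts) first and remuxing to mp4 afterwards tends to be more forgiving than writing mp4 directly.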

@drawrowfly Would you be able to re-open this issue seeing as it may be possible to implement?

UnCrevard commented 4 years ago

I've done this https://github.com/UnCrevard/tiktok-live

Hope you'll like it.

drawrowfly commented 4 years ago

@UnCrevard the best thing you can do here is to create a pull request against the already existing package with multiple features, instead of advertising a separate package with a single purpose.

sideloading commented 4 years ago

@UnCrevard Is there any way to check if a user is livestreaming without relying on the livestream share-link?

ChristophKrause commented 2 years ago

TikTok changed a lot. https://www.tiktok.com/@USERNAME/live opens the live stream page of a user.

But I cannot find the live stream URL on this page or in its scripts :(

spinningsand commented 2 years ago

Are there any more updates on this?

spinningsand commented 2 years ago

I also noticed that TikTok now saves live streams for replay, so I was wondering whether those could be scraped as well.