SuperSonicHub1 / TikTok-RSS

A simple TikTok RSS feed generator.
https://TikTok-RSS.supersonichub1.repl.co
The Unlicense

Coming up with a more permanent solution for accessing the TikTok API #3

Closed SuperSonicHub1 closed 2 years ago

SuperSonicHub1 commented 2 years ago

Right now, this project is real scuffed in that I don't generate my own user session. This requires someone to tap me on the shoulder every time this site breaks, which is annoying for both parties.

My plan is to study yt-dlp's TikTok extractor in order to get everything I need.

Fixes #2.

SuperSonicHub1 commented 2 years ago

Current lines of note:

SuperSonicHub1 commented 2 years ago

More lines of interest:

SuperSonicHub1 commented 2 years ago

I have something! Now all I need to do is to shove it into my current code.

import random
import re
import string
import time
from requests import Session, Response

# Shared session so cookies persist between requests
session = Session()
session.headers.update({})

# yt-dlp's constants
# https://github.com/yt-dlp/yt-dlp/blob/master/yt_dlp/extractor/tiktok.py#L29-L35
APP_VERSION = '20.1.0'
MANIFEST_APP_VERSION = '210'
APP_NAME = 'trill'
AID = 1180
API_HOSTNAME = 'api-h2.tiktokv.com'
UPLOADER_URL_FORMAT = 'https://www.tiktok.com/@%s'
WEBPAGE_HOST = 'https://www.tiktok.com/'

# My constants
URL = WEBPAGE_HOST + "@{}"
VIDEO_URL = URL + "/video/{}"

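def get_user_id(username: str) -> str:
    # Fetch the profile page with Facebook's crawler User-Agent; the page is
    # expected to contain an snssdk deep link holding the numeric user ID.
    res = session.get(
        URL.format(username),
        headers={
            'User-Agent': 'facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)'
        },
    )
    match = re.search(r'snssdk\d*://user/profile/(\d+)', res.text)
    if match is None:
        raise ValueError(f"Could not find a user ID for {username!r}")
    return match.group(1)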

def call_api(path: str, params: dict) -> Response:
    """
    https://github.com/yt-dlp/yt-dlp/blob/b31874334d5d68121a4a3f0d28dc1b39e5fca93b/yt_dlp/extractor/tiktok.py#L38-L84
    """
    params = {
        **params,
        'version_name': APP_VERSION,
        'version_code': MANIFEST_APP_VERSION,
        'build_number': APP_VERSION,
        'manifest_version_code': MANIFEST_APP_VERSION,
        'update_version_code': MANIFEST_APP_VERSION,
        'openudid': ''.join(random.choice('0123456789abcdef') for _ in range(16)),
        'uuid': ''.join([random.choice(string.digits) for _ in range(16)]),
        '_rticket': int(time.time() * 1000),
        'ts': int(time.time()),
        'device_brand': 'Google',
        'device_type': 'Pixel 4',
        'device_platform': 'android',
        'resolution': '1080*1920',
        'dpi': 420,
        'os_version': '10',
        'os_api': '29',
        'carrier_region': 'US',
        'sys_region': 'US',
        'region': 'US',
        'app_name': APP_NAME,
        'app_language': 'en',
        'language': 'en',
        'timezone_name': 'America/New_York',
        'timezone_offset': '-14400',
        'channel': 'googleplay',
        'ac': 'wifi',
        'mcc_mnc': '310260',
        'is_my_cn': 0,
        'aid': AID,
        'ssmix': 'a',
        'as': 'a1qwert123',
        'cp': 'cbfhckdckkde1',
    }
    session.cookies.set(
        "odin_tt",
        ''.join(random.choice('0123456789abcdef') for _ in range(160)),
        domain=API_HOSTNAME
    )
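    # Copy a logged-in session ID, if any, from the website's cookies to the
    # API host. Cookie domains are hostnames such as ".tiktok.com", so match
    # on the domain suffix rather than on the full WEBPAGE_HOST URL.
    sid_tt = next(
        (c.value for c in session.cookies
         if c.name == "sid_tt" and c.domain.endswith("tiktok.com")),
        None,
    )
    if sid_tt:
        session.cookies.set("sid_tt", sid_tt, domain=API_HOSTNAME)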
    return session.get(
        f"https://{API_HOSTNAME}/aweme/v1/{path}/",
        headers={
            'User-Agent': f'com.ss.android.ugc.trill/{MANIFEST_APP_VERSION} (Linux; U; Android 10; en_US; Pixel 4; Build/QQ3A.200805.001; Cronet/58.0.2991.0)',
            'Accept': 'application/json',
        },
        params=params,
    )

def get_user(username: str) -> Response:
    """
    https://github.com/yt-dlp/yt-dlp/blob/b31874334d5d68121a4a3f0d28dc1b39e5fca93b/yt_dlp/extractor/tiktok.py#L529-L539
    """
    user_id = get_user_id(username)
    params = {
        'user_id': user_id,
        'count': 21,
        'max_cursor': 0,
        'min_cursor': 0,
        'retry_type': 'no_retry',
        # Some endpoints don't like randomized device_id, so it isn't directly set in _call_api.
        'device_id': ''.join(random.choice(string.digits) for _ in range(19)),
    }
    return call_api("aweme/post", params)

if __name__ == "__main__":
    print(get_user("emiru").json())
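
As a minimal, offline sketch of the ID lookup that get_user_id performs above: the regex searches the profile HTML for an snssdk deep link and captures the numeric ID. The sample HTML string and the helper name extract_user_id are made up for illustration; the real page markup is not shown in this thread and may differ.

```python
import re
from typing import Optional

# Same pattern as in get_user_id above.
USER_ID_PATTERN = re.compile(r'snssdk\d*://user/profile/(\d+)')


def extract_user_id(html: str) -> Optional[str]:
    """Return the numeric user ID found in html, or None if nothing matches."""
    match = USER_ID_PATTERN.search(html)
    return match.group(1) if match else None


# Hypothetical snippet standing in for a profile page; not real TikTok markup.
sample_html = '<meta property="al:android:url" content="snssdk1233://user/profile/1234567890">'

print(extract_user_id(sample_html))                       # 1234567890
print(extract_user_id("<html>no deep link here</html>"))  # None
```

Returning None (or raising) when the link is missing makes it obvious when the page layout has changed, instead of failing with an AttributeError on `.group(1)`.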

oluwabajio commented 2 years ago

Thanks for this. Could you please give me the correct URL structure? I tried the one below and it didn't work.

https://api-h2.tiktokv.com/aweme/v1/v09044g40000c6n1e5bc77udh7itcncg

SuperSonicHub1 commented 2 years ago

@oluwabajio I'm a bit confused by your question. Can you back up a bit and tell me what you're trying to achieve?

oluwabajio commented 2 years ago

Thanks for your response. I was looking at your code, but I don't really understand Python.

From what I can deduce, your call_api function generates a URL similar to "https://api-h2.tiktokv.com/aweme/v1/".

You then scrape the info you need from the URL you just generated.

So could you kindly share the full URL?

For instance, this TikTok video (https://www.tiktok.com/@pauldgoodguy/video/6963175937521323269) becomes https://api-h2.tiktokv.com/aweme/v1/6963175937521323269

Am I right?

SuperSonicHub1 commented 2 years ago

@oluwabajio It seems you have the wrong idea. My code gets a list of videos a creator of my choosing has uploaded to TikTok. What you want is information for an individual video. You're going to have to go do research on that. I would suggest that you read the code from yt-dlp that I've linked to above, but seeing as you don't have too much experience with Python, you're on your own.
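
To make the URL structure concrete, here is a rough sketch of how the request URL for a user's post list is put together, using only the path and a few of the parameters from call_api and get_user above. The build_post_list_url helper and the user ID value are hypothetical, and the real request also needs the remaining device parameters, headers, and cookies.

```python
from urllib.parse import urlencode

API_HOSTNAME = 'api-h2.tiktokv.com'


def build_post_list_url(user_id: str, count: int = 21) -> str:
    """Build the post-list URL (hypothetical helper; only a subset of the parameters)."""
    params = {
        'user_id': user_id,
        'count': count,
        'max_cursor': 0,
        'min_cursor': 0,
        'aid': 1180,
    }
    # call_api("aweme/post", ...) appends the path after /aweme/v1/,
    # so no video ID appears in the URL path itself.
    return f"https://{API_HOSTNAME}/aweme/v1/aweme/post/?{urlencode(params)}"


# Placeholder user ID, not a real account.
print(build_post_list_url('1234567890'))
# https://api-h2.tiktokv.com/aweme/v1/aweme/post/?user_id=1234567890&count=21&max_cursor=0&min_cursor=0&aid=1180
```

The user ID travels in the query string, and the endpoint returns that user's uploads rather than a single video.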

oluwabajio commented 2 years ago

Thanks. I'll check out yt-dlp.