Russell-Newton / TikTokPy

Extract data from TikTok without needing any login information or API keys.
https://pypi.org/project/tiktokapipy/
MIT License

[BUG] Using TikTok challenge failed #40

Closed vqoley closed 1 year ago

vqoley commented 1 year ago

Describe the bug: I am trying to run a scrape and keep only videos under 60 seconds. After it finds a few results (around 5-10), it fails. The error is below.

Traceback (most recent call last):
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 363, in _scrape_data
    data = self._extract_and_dump_data(content, extras_json, data_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 394, in _extract_and_dump_data
    parsed = data_model.parse_raw(data)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pydantic\main.py", line 549, in pydantic.main.BaseModel.parse_raw
  File "C:\Python311\Lib\site-packages\tiktokapipy\models\raw_data.py", line 130, in parse_obj
    return super().parse_obj(obj)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "pydantic\main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for VideoResponse
ItemModule -> 7195422072552639771 -> music -> authorName
  field required (type=value_error.missing)
Traceback (most recent call last):
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 363, in _scrape_data
    data = self._extract_and_dump_data(content, extras_json, data_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 394, in _extract_and_dump_data
    parsed = data_model.parse_raw(data)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pydantic\main.py", line 549, in pydantic.main.BaseModel.parse_raw
  File "C:\Python311\Lib\site-packages\tiktokapipy\models\raw_data.py", line 130, in parse_obj
    return super().parse_obj(obj)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "pydantic\main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for VideoResponse
ItemModule -> 7195422072552639771 -> music -> authorName
  field required (type=value_error.missing)
Traceback (most recent call last):
  File "C:\xampp8\htdocs\tiktokupload\runs.py", line 94, in <module>
    fromchallage(keyword_value,length_value)
  File "C:\xampp8\htdocs\tiktokupload\fungsi.py", line 70, in fromchallage
    for video in challenge.videos:
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 90, in __next__
    out = self.fetch(self._next_up)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 103, in fetch
    return self._api.video(video_link(self.light_models[idx].id))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 316, in video
    response, api_extras = self._scrape_data(
                           ^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 370, in _scrape_data
    raise TikTokAPIError(
tiktokapipy.TikTokAPIError: Data scraping unable to complete in 30.0s (retries: 1)

I tried using this proxy setting:

proxy={
   "server": "http://my proxy ip here:8080",
   "username": "",
   "password": ""
}

Additional context: Yesterday and for the past few days it was working fine with no errors, but starting today I get this error. How can I solve it?

Russell-Newton commented 1 year ago

Please try again with release 0.1.11.post2.

vqoley commented 1 year ago

Thanks. After updating it seems to work, but not for long: I got more results than yesterday, and then I got an error like this:

downloading > FLICKER FAST HAND HOOK
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 702k/702k [00:00<00:00, 4.98MB/s]
Download finished
Traceback (most recent call last):
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 355, in _scrape_data
    page.wait_for_selector("#SIGI_STATE", state="attached")
  File "C:\Python311\Lib\site-packages\playwright\sync_api\_generated.py", line 8286, in wait_for_selector
    self._sync(
  File "C:\Python311\Lib\site-packages\playwright\_impl\_sync_base.py", line 104, in _sync
    return task.result()
           ^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\playwright\_impl\_page.py", line 368, in wait_for_selector
    return await self._main_frame.wait_for_selector(**locals_to_params(locals()))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\playwright\_impl\_frame.py", line 322, in wait_for_selector
    await self._channel.send("waitForSelector", locals_to_params(locals()))
  File "C:\Python311\Lib\site-packages\playwright\_impl\_connection.py", line 61, in send
    return await self._connection.wrap_api_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\playwright\_impl\_connection.py", line 461, in wrap_api_call
    return await cb()
           ^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\playwright\_impl\_connection.py", line 96, in inner_send
    result = next(iter(done)).result()
             ^^^^^^^^^^^^^^^^^^^^^^^^^
playwright._impl._api_types.TimeoutError: Timeout 30000ms exceeded.
=========================== logs ===========================
waiting for locator("#SIGI_STATE")
============================================================
Traceback (most recent call last):
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 355, in _scrape_data
    page.wait_for_selector("#SIGI_STATE", state="attached")
  File "C:\Python311\Lib\site-packages\playwright\sync_api\_generated.py", line 8286, in wait_for_selector
    self._sync(
  File "C:\Python311\Lib\site-packages\playwright\_impl\_sync_base.py", line 104, in _sync
    return task.result()
           ^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\playwright\_impl\_page.py", line 368, in wait_for_selector
    return await self._main_frame.wait_for_selector(**locals_to_params(locals()))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\playwright\_impl\_frame.py", line 322, in wait_for_selector
    await self._channel.send("waitForSelector", locals_to_params(locals()))
  File "C:\Python311\Lib\site-packages\playwright\_impl\_connection.py", line 61, in send
    return await self._connection.wrap_api_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\playwright\_impl\_connection.py", line 461, in wrap_api_call
    return await cb()
           ^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\playwright\_impl\_connection.py", line 96, in inner_send
    result = next(iter(done)).result()
             ^^^^^^^^^^^^^^^^^^^^^^^^^
playwright._impl._api_types.TimeoutError: Timeout 30000ms exceeded.
=========================== logs ===========================
waiting for locator("#SIGI_STATE")
============================================================
Traceback (most recent call last):
  File "C:\xampp8\htdocs\tiktokupload\runs.py", line 94, in <module>
    fromchallage(keyword_value,length_value)
  File "C:\xampp8\htdocs\tiktokupload\fungsi.py", line 70, in fromchallage
    for video in challenge.videos:
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 90, in __next__
    out = self.fetch(self._next_up)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 103, in fetch
    return self._api.video(video_link(self.light_models[idx].id))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 316, in video
    response, api_extras = self._scrape_data(
                           ^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\tiktokapipy\api.py", line 370, in _scrape_data
    raise TikTokAPIError(
tiktokapipy.TikTokAPIError: Data scraping unable to complete in 30.0s (retries: 1)

Version after update:

Name: tiktokapipy
Version: 0.1.11.post2
Summary: Asyncio TikTok data scraping tool
Home-page:
Author:
Author-email: Russell Newton <russell.newton01@gmail.com>
License:
Location: C:\Python311\Lib\site-packages
Requires: playwright, pydantic, requests
Required-by:

Can you tell me what the problem is? Or is it a problem with my code? My code is below:

import time

from tiktokapipy.api import TikTokAPI

# check_exist() and running() are the poster's own helpers (not shown here).


def fromchallage(keyword, saat):
    """Process videos tagged with `keyword` that last at most `saat` seconds."""
    proxy = {
        "server": "http://my proxy ip:8080",
        "username": "",
        "password": "",
    }
    with TikTokAPI(navigation_retries=1, navigation_timeout=30, proxy=proxy) as api:
        challenge = api.challenge(keyword)
        # Iterating challenge.videos loads each video page; this is where the timeout is raised.
        for video in challenge.videos:
            time.sleep(2)
            times = int(time.time())
            video_id = video.id
            video_duration = video.video.duration
            author = video.author
            if video_duration > saat:
                continue
            link = f"https://www.tiktok.com/@{author}/video/{video_id}"
            check_video_ori = check_exist(
                "SELECT * FROM video_ori WHERE author = ? AND video_id = ?",
                (author, video_id),
            )
            # A return value of 300 appears to mean the video is already stored.
            if check_video_ori == 300:
                print(f"video id {video_id} already exists. skipping...")
                continue
            running(link, author, times, check_video_ori, video_id)
Russell-Newton commented 1 year ago

This sort of issue arises when TikTokPy can't load a page correctly. Specifically, one of the videos tagged with the challenge specified isn't loading right. Normally when this happens, it's because TikTok throws up a captcha or the page just takes too long to load. I've found that this can sometimes be avoided by using a proxy or increasing navigation retries. Unfortunately I don't have a fix-all solution for this error.


There are some suggestions I can give you:

Russell-Newton commented 1 year ago

As a side note, a video's ID is sufficient to identify a video. Each ID is unique, so you might be able to get away with just using that in your database. You need to grab video data with TikTokPy if you want duration and author info, but if you decide you don't need it, challenge.videos.light_models might be sufficient for your needs.
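To illustrate the point about IDs being unique, here is a minimal sketch of a table keyed by the video ID alone, using Python's built-in sqlite3 module. The table name video_ori comes from the code earlier in the thread; the column names, the open_store and remember_video helpers, and the choice of SQLite are assumptions made for illustration, since the poster's actual schema is not shown.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a database holding one row per video, keyed by the video ID alone."""
    conn = sqlite3.connect(path)
    # Assumed schema: only video_id needs to be unique.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS video_ori ("
        "video_id TEXT PRIMARY KEY, author TEXT, fetched_at INTEGER)"
    )
    return conn


def remember_video(conn, video_id, author, fetched_at):
    """Insert the video; return True if it was new, False if the ID was already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO video_ori (video_id, author, fetched_at) VALUES (?, ?, ?)",
        (str(video_id), author, fetched_at),
    )
    conn.commit()
    # rowcount is 1 when a row was inserted and 0 when the insert was ignored.
    return cur.rowcount == 1
```

With a primary key on video_id, the separate existence check and the author column in the WHERE clause are no longer needed: a False return value means the video can be skipped.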

You can load a video by ID like this:

video = api.video(video_link(id))

The video_link function is in the tiktokapipy.models.video module.
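Loading videos one ID at a time also makes it possible to skip a video whose page fails to load instead of aborting the whole loop. The sketch below is a generic helper, not part of TikTokPy: fetch_skipping_failures and its parameters are hypothetical names, and the fetch function is passed in so the helper itself has no dependency on the library.

```python
def fetch_skipping_failures(ids, fetch, errors=(Exception,), on_skip=None):
    """Yield (id, result) for every ID that loads; skip IDs whose fetch raises."""
    for item_id in ids:
        try:
            result = fetch(item_id)
        except errors as exc:
            # Report the failure if a callback was given, then move to the next ID.
            if on_skip is not None:
                on_skip(item_id, exc)
            continue
        yield item_id, result


# Possible use with TikTokPy, assuming the calls mentioned in this thread
# (api is an open TikTokAPI and challenge came from api.challenge(...)):
#   from tiktokapipy import TikTokAPIError
#   from tiktokapipy.models.video import video_link
#   ids = [light.id for light in challenge.videos.light_models]
#   for video_id, video in fetch_skipping_failures(
#           ids, lambda i: api.video(video_link(i)), errors=(TikTokAPIError,)):
#       print(video_id, video.video.duration)
```

Catching only TikTokAPIError keeps unrelated bugs visible, while a single stuck page no longer ends the run.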

Russell-Newton commented 1 year ago

In version 0.1.13, I changed the TimeoutError handling so that it no longer prints a traceback and instead prints "Reached navigation timeout. Retrying...". This reduces console clutter significantly. It would be worth updating to 0.1.13, increasing navigation_timeout and navigation_retries, and trying again.
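As a rough sketch of what "increasing navigation_timeout and navigation_retries" could look like, the helper below collects the keyword arguments used earlier in this thread. build_api_kwargs is a hypothetical name, the defaults of 3 retries and 60 seconds are arbitrary example values rather than recommendations from the maintainer, and leaving out empty proxy credentials is an assumption about what the underlying browser accepts.

```python
def build_api_kwargs(proxy_server=None, username="", password="", retries=3, timeout=60):
    """Collect TikTokAPI keyword arguments with a more patient navigation policy."""
    # navigation_timeout is in seconds, matching "30.0s" in the error message above.
    kwargs = {"navigation_retries": retries, "navigation_timeout": timeout}
    if proxy_server:
        proxy = {"server": proxy_server}
        # Assumption: credentials are optional, so only include the ones that are set.
        if username:
            proxy["username"] = username
        if password:
            proxy["password"] = password
        kwargs["proxy"] = proxy
    return kwargs


# Possible use, assuming tiktokapipy is installed:
#   from tiktokapipy.api import TikTokAPI
#   with TikTokAPI(**build_api_kwargs("http://my proxy ip:8080")) as api:
#       challenge = api.challenge("some_tag")
```

Keeping these settings in one place makes it easy to raise the retry count or timeout without touching the scraping loop.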

Russell-Newton commented 1 year ago

This issue should be resolved in the official version 0.2.0 release. I'm closing this issue as resolved. If for some reason it doesn't work for you in v0.2.0, reopen this issue or create a new one.