Russell-Newton / TikTokPy

Extract data from TikTok without needing any login information or API keys.
https://pypi.org/project/tiktokapipy/
MIT License

[BUG] Scrolling is not working anymore #36

Closed bbeyrie closed 1 year ago

bbeyrie commented 1 year ago

Describe the bug It tries to scroll but the page loads nothing, as if TikTok had detected automation. I tried Firefox and Chromium.

To Reproduce

from tiktokapipy.api import TikTokAPI

with TikTokAPI(scroll_down_time=10) as api:
    challenges = api.challenge("test")

Expected behavior It should scroll down

Version Information 0.1.11

Additional context I had a similar problem when scraping TikTok with Selenium, without an undetected webdriver, and trying to scroll down.

bbeyrie commented 1 year ago

Fail

JeanBG commented 1 year ago

I'm having the same problem. How did you fix it?

bbeyrie commented 1 year ago

Well, TikTok may have soft-banned your IP (so try a proxy), or check this: https://github.com/Russell-Newton/TikTokPy/issues/33#issuecomment-1448708382 (the VM/Container/Server part). Scrolling is working again for me locally, but in the meantime I used https://github.com/seleniumbase/SeleniumBase to scroll and retrieve all videos in a challenge/hashtag.

JeanBG commented 1 year ago

Thank you!

RenHong-HC commented 1 year ago

@bbeyrie Hello, may I ask whether you have solved the problem of not being able to get more data?

bbeyrie commented 1 year ago

@RenHong-HC Sadly not. Scrolling in Azure doesn't work for me. I took a screen capture to see what happens but couldn't manage to solve it (it seems the components are not loading) .... dat

Locally, behind a proxy, I use SeleniumBase with its integrated Chrome DevTools Protocol (CDP) support, but it's a DIY thing:

import json

from seleniumbase import Driver

challenge = "test"  # placeholder hashtag; replace with the one you want

driver = Driver(uc=True, headless=False, uc_cdp_events=True)
events, contents = [], []

def on_request(data):
    # Keep only requests sent to the challenge item-list endpoint.
    path = data.get('params', {}).get('headers', {}).get(':path', '')
    if "api/challenge/item_list/" in path:
        events.append(data)

driver.add_cdp_listener("Network.requestWillBeSentExtraInfo", on_request)
driver.get(f"https://www.tiktok.com/tag/{challenge}")
while True:
    ### CODE TO SCROLL DOWN
    # Drain the queue with pop() so the list is never mutated while being iterated.
    while events:
        x = events.pop(0)
        response = driver.execute_cdp_cmd('Network.getResponseBody',
            {"requestId": f"{x.get('params', {}).get('requestId')}"})
        contents.append(json.loads(response.get('body', {})))
    ### if bottom:
        ### break
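
For the scroll-down and bottom-detection placeholders in the loop above, something along these lines might work. It is only a sketch: `scroll_to_bottom`, `pause`, `max_rounds` and `sleep` are names made up for illustration, and it assumes the page keeps growing `document.body.scrollHeight` as new videos load, which I have not verified for every TikTok layout. It relies only on the standard Selenium `execute_script` method.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=50, sleep=time.sleep):
    """Scroll until the page height stops growing.

    Returns the number of scrolls performed. `driver` is any object with a
    Selenium-style execute_script(script) method (hypothetical helper, not
    part of TikTokPy or SeleniumBase).
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current end of the page to trigger lazy loading.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        sleep(pause)  # give the page time to load the next batch
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded: treat this as the bottom
        last_height = new_height
    return rounds
```

Note that an unchanged height does not prove the real bottom was reached: if TikTok stops serving content (as in this issue), the loop also ends early. `max_rounds` is there so it cannot spin forever on an endless feed.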