Russell-Newton / TikTokPy

Extract data from TikTok without needing any login information or API keys.
https://pypi.org/project/tiktokapipy/
MIT License

Cannot grab comments #9

Closed packoverflowruss closed 1 year ago

packoverflowruss commented 1 year ago

UserWarning: Was unable to collect comments. A second attempt might work.

from tiktokapipy.api import TikTokAPI

video_url = "https://www.tiktok.com/@lindamaittm/video/7183294139855850794"

with TikTokAPI(navigation_retries=2, navigation_timeout=10, scroll_down_time=60) as api:
    video = api.video(video_url)
    print(video.comments)

Russell-Newton commented 1 year ago

Looking into this further, it seems that for some reason, when a Playwright Chromium browser opens TikTok, only the request that collects comments from TikTok's API fails with a CORS error. I'll keep looking into this.

packoverflowruss commented 1 year ago

Thank you! Appreciate it.

terrok9 commented 1 year ago

Hello! I am having the same issue. Could I get an update on the current workaround? Maybe I can help. Have a nice day.

Russell-Newton commented 1 year ago

It's looking like this might be an issue with either Chromium or Playwright.

In a vanilla instance of Playwright v1.29.0, try doing:

from playwright.sync_api import sync_playwright

playwright = sync_playwright().start()
browser = playwright.chromium.launch(headless=False)
page = browser.new_page()

Then navigate in the page to any TikTok video. Keep an eye on the dev console and the network tab.

You should be able to see that an API request is rejected due to a CORS error. Specifically, this is the request that grabs comments. For whatever reason, this one specific request is blocked on the Playwright browser instance.

I'm going to try different browser versions that come with Playwright as well as different versions of Playwright to see if it works in any of them.

Edit: In Playwright v1.29.0 it works only with playwright.firefox.launch. Chromium and WebKit don't seem to work. Chromium also doesn't work in v1.26.
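
The manual check above could also be scripted, so the same page can be compared across engines without watching the network tab. The following is only a sketch: `collect_failed_requests` and `filter_failures` are hypothetical helper names, and it assumes the rejected comment request surfaces through Playwright's `requestfailed` event, which the dev console would still need to confirm.

```python
from typing import Iterable, List, Tuple

# (url, error_text) pairs as collected from Playwright's "requestfailed" event
Failure = Tuple[str, str]


def filter_failures(failures: Iterable[Failure], url_fragment: str) -> List[Failure]:
    """Keep only the failed requests whose URL contains url_fragment."""
    return [(url, err) for url, err in failures if url_fragment in url]


def collect_failed_requests(browser_name: str, url: str, wait_ms: int = 10_000) -> List[Failure]:
    """Open url in the given engine and return every request that failed.

    browser_name is one of "chromium", "firefox" or "webkit".
    """
    # Imported lazily so filter_failures can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    failures: List[Failure] = []
    with sync_playwright() as p:
        browser = getattr(p, browser_name).launch(headless=False)
        page = browser.new_page()
        # request.failure holds the error text of a failed request, or None.
        page.on(
            "requestfailed",
            lambda req: failures.append((req.url, req.failure or "")),
        )
        page.goto(url)
        page.wait_for_timeout(wait_ms)  # give the page time to fire its API calls
        browser.close()
    return failures
```

Running `collect_failed_requests` once with `"chromium"` and once with `"firefox"` on the same video URL, then passing each result through `filter_failures(result, "comment")`, should show whether the failure is specific to one engine. The `"comment"` fragment is a guess at what the comment request URL contains.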

terrok9 commented 1 year ago

I see! Thanks for the update. I will try to modify the package and maybe open a pull request with my workaround. I am thinking of calling playwright.firefox.launch when a browser option is passed to the API class.

Furthermore, I will research the behaviour of the API in Chromium. Maybe another installation of Chromium would work too, or opening a test TikTok API endpoint directly.

Russell-Newton commented 1 year ago

Good idea. The place to update would be in __enter__ in the sync API and __aenter__ in the async API.
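
As a rough illustration of what such a change in `__enter__` might look like, here is a minimal sketch. `BrowserSession` and its `browser_name` parameter are hypothetical stand-ins rather than TikTokPy's actual class or signature; only the Playwright calls are real.

```python
SUPPORTED_BROWSERS = ("chromium", "firefox", "webkit")


class BrowserSession:
    """Hypothetical stand-in for the API class; not TikTokPy's real code."""

    def __init__(self, browser_name: str = "firefox", headless: bool = True):
        # Reject unknown engines early, before any browser is started.
        if browser_name not in SUPPORTED_BROWSERS:
            raise ValueError(f"unsupported browser: {browser_name!r}")
        self.browser_name = browser_name
        self.headless = headless
        self._playwright = None
        self._browser = None

    def __enter__(self):
        # Imported lazily so the class can be constructed without Playwright.
        from playwright.sync_api import sync_playwright

        self._playwright = sync_playwright().start()
        try:
            # Pick playwright.chromium, playwright.firefox or playwright.webkit.
            launcher = getattr(self._playwright, self.browser_name)
            self._browser = launcher.launch(headless=self.headless)
        except Exception:
            # Do not leak the Playwright driver if the launch fails.
            self._playwright.stop()
            raise
        return self

    def __exit__(self, exc_type, exc, tb):
        if self._browser is not None:
            self._browser.close()
        if self._playwright is not None:
            self._playwright.stop()
        return False  # never swallow exceptions from the with-block
```

An async variant would presumably do the same in `__aenter__` using `async_playwright` and awaiting the launch call.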

terrok9 commented 1 year ago

I added Playwright Firefox support as a temporary fix for this issue in #17.

Russell-Newton commented 1 year ago

Fixed in v0.1.9.post1