freezingdata / snh-userscript-tiktok

SNH-Script to acquire data from TikTok
GNU General Public License v3.0

Latest update to SNH broke data scraping in TikTok #4

Closed by drangzt1450 1 year ago

drangzt1450 commented 2 years ago

The latest update to SNH (module patch-v2.0.9.19) seems to have pretty much disabled the TikTok module. The data scraping bar doesn't load anymore, so there is no way to do any data scraping now. I have re-installed the modules, including the TikTok module, and restarted SNH several times, but the bar still doesn't load.

Prior to 2.0.9.19, the data scraping bar would occasionally not load, but you could force it by clicking on a video and then going back to the main account page.

freezingdata commented 1 year ago

Can you please test the current development branch? I've tested it with the current SNH and the current modules, and it works fine. The account I tested with was https://www.tiktok.com/@catcats74 (for comparison).

Greetings, Benno

drangzt1450 commented 1 year ago

Will do.


Dr. Marcus Rogers, CISSP, DFCP-F, FAAFS

Fellow CERIAS

Professor/Director Cyberforensics Lab

Dept of Computer & Information Technology

Purdue Polytechnic Institute

Purdue University

Chief Scientist Purdue/Tippecanoe High Tech Crime Unit (HTCU)

Editor-in-Chief Journal of Digital Forensics Security and Law (JDFSL)

https://commons.erau.edu/jdfsl/

P: 765-494-1951


drangzt1450 commented 1 year ago

It appears to be working now...thanks.

sh230941 commented 1 year ago

The problem still persists for us as well: for some profiles, the data collection menu does not appear on the main page, only after clicking on individual videos.

In addition, the data of commenters is currently not loaded into the Data Explorer at all.

freezingdata commented 1 year ago

Hello sh230941, thanks for the bug report. We will take a look at the module and then get back to you.

freezingdata commented 1 year ago

We found that TikTok now signs its API requests. We first have to work out how to reproduce this so that we can then implement it if necessary. We will keep you posted.

Regarding profile detection: did you use the development branch of the module? The master branch was still outdated. I have now updated it, so the master branch can be used again: https://github.com/freezingdata/snh-userscript-tiktok/archive/refs/heads/master.zip

Please test whether this improves the detection of TikTok profiles.

Best regards, the Freezingdata team

drangzt1450 commented 1 year ago

Will do.



freezingdata commented 1 year ago

The changes to TikTok's signing mechanism required a complete rewrite of the comment collection part of the TikTok module.

I recently published a new version of the module: https://github.com/freezingdata/snh-userscript-tiktok/tree/rewrite-and-captcha-solver. I have not merged the new branch into the master branch yet, because it hasn't been tested enough.

Configuration and limits

The collection of comments is now DOM-based. This is no problem if a post has only a small number of comments. If a post has lots of comments, working through the DOM slows down the integrated browser. That is why it is important to set a maximum for the number of collected comments and for the number of answers per comment. In the standard configuration, the first 100 comments, each with at most 100 answers, are collected (100/100). If your system is fast enough, you can go up to 200/200.

You can set these values in https://github.com/freezingdata/snh-userscript-tiktok/blob/rewrite-and-captcha-solver/tiktok/tiktok_config.py
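To illustrate what the two limits mean, here is a minimal sketch of how such a cap could be applied. It is not the module's actual code: `CommentLimits`, `max_comments`, `max_answers` and `collect_capped` are hypothetical names chosen for this example, and the real setting names are the ones in `tiktok_config.py`.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, Iterator


@dataclass(frozen=True)
class CommentLimits:
    # Hypothetical names; the real settings are defined in tiktok/tiktok_config.py.
    max_comments: int = 100  # top-level comments collected per post
    max_answers: int = 100   # answers collected per comment


def collect_capped(
    comments: Iterable[tuple[str, Iterable[str]]],
    limits: CommentLimits,
) -> Iterator[tuple[str, list[str]]]:
    """Yield at most max_comments comments, each with at most max_answers answers."""
    for text, answers in islice(comments, limits.max_comments):
        yield text, list(islice(answers, limits.max_answers))


# Simulated post with 500 comments and 300 answers each (made-up data).
thread = [
    (f"comment {i}", [f"answer {i}.{j}" for j in range(300)])
    for i in range(500)
]
collected = list(collect_capped(thread, CommentLimits(200, 200)))
```

In the worst case, the 100/100 default covers 100 comments plus 100 × 100 answers (10,100 items), while 200/200 covers 200 comments plus 200 × 200 answers (40,200 items), so doubling both limits roughly quadruples the amount of content to be processed.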

I'm currently testing the collection of Will Smith's TikTok page with the 200/200 setting.

In general, working through the DOM is a slower technique than using a website API.

Captchas

I've also added a new feature to the module that solves TikTok's captcha tasks. If a captcha task appears during data collection, the module should solve it automatically.