Open vuhaopro90 opened 9 months ago
Hey, take a look at this line: ms_token = os.environ.get("ms_token", None). Is your token actually set in your environment?
Try passing it in directly first.
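One way to rule out a missing token is to fail early before any session is created. The following is only a sketch: `require_ms_token` is a made-up helper name (it is not part of TikTokApi), and it assumes the token lives under the same `ms_token` key used in this thread.

```python
import os


def require_ms_token(env=None, key="ms_token"):
    """Return the token stored under `key`, or raise if it is missing or empty.

    Hypothetical helper for illustration; not a TikTokApi function.
    """
    env = os.environ if env is None else env
    token = env.get(key)
    if not token:
        raise RuntimeError(f"{key!r} is not set in the environment")
    return token
```

Calling `require_ms_token()` at the top of the script would surface an unset variable immediately instead of passing `None` into `ms_tokens`.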
Same here: requests for more than 35 videos fail.
I've had mixed luck with trying to scrape large amounts at once. I'll post a modified version of your code adding the changes I've made that seem to work best, though I see a strong positive correlation between failure rate and post count (especially for profiles with >1000 posts).
from TikTokApi import TikTokApi
import asyncio
import os

ms_token = os.environ.get(
    "ms_token", None
)

async def user_example(username):
    async with TikTokApi() as api:
        await api.create_sessions(headless=False, ms_tokens=[ms_token], num_sessions=1, sleep_after=3)
        user = api.user(username)
        user_data = await user.info()
        post_count = user_data["userInfo"]["stats"].get("videoCount")
        async for video in user.videos(count=post_count):
            print(video)
            video = str(video) + "\n"
            with open('test1.json', 'a') as file:
                file.write(video)

if __name__ == "__main__":
    asyncio.run(user_example("truong_se"))
First, I added a username parameter to your user_example definition. This allows you to call this function for other usernames in a more streamlined fashion. It also cleans up your code elsewhere, as in user = api.user(username). (My own version of this function includes a manual limit: int = 0 in the definition for situations like where a user has a substantial number of non-public posts.)
Next, I added the post_count variable assignment. This will extract the number of videos from user_data in the previous line. Note: this number is not always accurate, as it reflects all posts - including private, which we can't scrape.
Finally, I added the username you had to the function call asyncio.run(user_example("truong_se")), reflecting the syntax update in the definition.
Hope this is helpful!
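The manual limit mentioned above is not shown in the posted code, so here is one guess at how it could be combined with the reported count. `effective_count` is a hypothetical helper, and treating `limit=0` as "no cap" is an assumption based on the `limit: int = 0` default described in the breakdown.

```python
def effective_count(reported_count, limit=0):
    """Pick how many videos to request.

    reported_count: the videoCount value from the profile stats. It may be
        None, and it may include private posts that cannot be scraped.
    limit: manual cap chosen by the caller; 0 means "no cap".
    """
    reported = reported_count or 0
    if limit > 0:
        # With no usable reported count, fall back to the manual cap alone.
        return min(reported, limit) if reported else limit
    return reported
```

The result could then be passed as `count=` to `user.videos(...)` in place of the raw `post_count`.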
My goal is to get all of the user's video IDs. I used my own method and, surprisingly, it works perfectly: it doesn't miss a single video.
This is my code:
import time

from selenium import webdriver

driver = webdriver.Chrome()
user_link = "https://www.tiktok.com/@hieuthuhai2222"
driver.get(user_link)
time.sleep(10)  # give the profile page time to load
script = """let id_video = '';
let lastScrollHeight = 0;
let scrollCompleted = false;
while (!scrollCompleted) {
    const currentScrollHeight = document.body.scrollHeight;
    if (currentScrollHeight === lastScrollHeight) {
        const divs = document.querySelectorAll('.css-1as5cen-DivWrapper');
        divs.forEach(div => {
            const link = div.querySelector('a');
            if (link) {
                const href = link.getAttribute('href');
                if (href.includes("/video/")) {
                    const id = href.split('/').pop();
                    id_video += id + ",";
                }
            }
        });
        scrollCompleted = true;
    } else {
        lastScrollHeight = currentScrollHeight;
        window.scrollTo(0, currentScrollHeight);
        await new Promise(resolve => setTimeout(resolve, 1000));
    }
}
return id_video;
"""
ids = driver.execute_script(script)
print(ids)
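The script above returns the IDs as one comma-joined string with a trailing comma. If a list is easier to work with, a small post-processing step along these lines might help. `split_video_ids` is a name invented for this sketch, and dropping repeated IDs is an extra choice, not something the original script does.

```python
def split_video_ids(raw):
    """Turn a comma-joined string such as "1,2,2," into ["1", "2"].

    Order is preserved, empty pieces are skipped, and repeats are dropped.
    """
    seen = set()
    result = []
    for piece in raw.split(","):
        video_id = piece.strip()
        if video_id and video_id not in seen:
            seen.add(video_id)
            result.append(video_id)
    return result
```

For example, `split_video_ids(ids)` could be called right after `driver.execute_script(script)`.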
When I set the count to 35 or higher, the same error occurs.
I've encountered a limitation where I cannot retrieve information for more than 35 videos at once. However, I found a workaround using the cursor argument of the user.videos endpoint.
This argument specifies the starting point for the count. By updating cursor and repeating the process, I can access information for more than 35 videos.
I hope this solution is helpful to you!
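The "update cursor and repeat" idea can be sketched independently of the library. Everything below is illustrative: `fetch_page` stands in for whatever returns one batch (it is not a TikTokApi function), the data is fake, and treating the cursor as a simple offset is an assumption that the real endpoint may not share.

```python
def collect_with_cursor(fetch_page, page_size=30, max_items=100):
    """Call fetch_page(count, cursor) repeatedly, advancing the cursor.

    Stops on an empty batch or once max_items have been collected.
    """
    items = []
    cursor = 0
    while len(items) < max_items:
        batch = fetch_page(page_size, cursor)
        if not batch:
            break
        items.extend(batch)
        cursor += len(batch)  # advance by what was actually returned
    return items[:max_items]


# Fake data source used only to demonstrate the loop.
FAKE_VIDEOS = list(range(75))


def fake_fetch(count, cursor):
    return FAKE_VIDEOS[cursor:cursor + count]
```

Stopping on an empty batch matters here: without it, a source that returns nothing for later cursors would keep the loop running until the cursor limit is reached.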
Hi @koonn, I tried using cursor, but it didn't work. It worked for the first 35 videos, but once the cursor starts at 35 or higher, it doesn't return data anymore. Here's how I implemented it:
async def get_user_videos(user_name):
    async with TikTokApi() as api:
        await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=1, headless=False)
        user = api.user(user_name)
        count = 35
        cursor = 0
        while cursor < 350:
            print(f"Fetching videos from cursor {cursor}...")
            try:
                async for video in user.videos(count=count, cursor=cursor):
                    print(video)
                cursor += count
            except Exception as e:
                print(f"Error: {e}")
                break
Have you tried using it yourself? If so, could I see your code, please? I'd really appreciate it.
@lhphat02
here is my code.
async def get_hashtag_videos(hash_tag, num_data=30):
    videos_data = []
    cursor = 0
    async with TikTokApi() as api:
        await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=3, headless=False)
        tag = api.hashtag(name=hash_tag)
        while cursor <= num_data:
            async for video in tag.videos(count=30, cursor=cursor):
                print(video)
                video_data = video.as_dict
                videos_data.append(video_data)
            cursor += 30
I actually use this solution for tag.videos, not for user.videos, and it works well. After checking the implementation of user.videos, it seems that this solution would work there too.
@koonn
Hi, thanks for your code, it works pretty well. But when I modified it to crawl user.videos, it still didn't work. I think there may be a problem with TikTok's API or their rate-limit policies. Here's the code I adapted from yours for user.videos:
async def get_user_videos(user_name, num_data=300):
    result = []
    cursor = 0
    async with TikTokApi() as api:
        await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=3, headless=False)
        user = api.user(username=user_name)
        while cursor <= num_data:
            print(f"cursor: {cursor}")
            async for video in user.videos(count=30, cursor=cursor):
                print(video)
                video_data = video.as_dict
                result.append(video_data)
            cursor += 30
    return result
Btw thanks for your solution. Have a nice day!
Which version of TikTokApi are you using?
Installing v6.2.2 broke user.videos() for me. Downgrading to 6.2.0 with pip install TikTokApi==6.2.0 --force-reinstall fixed the issue.
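When comparing results across replies, it can help to confirm which version is actually installed in the active environment. This sketch uses the standard-library `importlib.metadata`. The helper names are invented here, and `"TikTokApi"` as the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(package, pinned):
    """True only if `package` is installed at exactly the `pinned` version."""
    return installed_version(package) == pinned
```

For instance, `matches_pin("TikTokApi", "6.2.0")` should be true after the downgrade has taken effect.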
I personally downgraded and could fetch more than 30 videos per account without issues!
EDIT: wording
cursor: 0
TikTokApi.video(id='7364804047850884398')
TikTokApi.video(id='7374084684180901163')
TikTokApi.video(id='7370409071545044267')
TikTokApi.video(id='7369999192237935914')
TikTokApi.video(id='7369630891267738922')
TikTokApi.video(id='7367034743424142634')
TikTokApi.video(id='7366660218496830762')
TikTokApi.video(id='7361095320073211178')
TikTokApi.video(id='7360015406045711662')
TikTokApi.video(id='7356640627938954539')
TikTokApi.video(id='7353301578083863839')
TikTokApi.video(id='7343713119836949791')
TikTokApi.video(id='7341070239158799646')
TikTokApi.video(id='7340345650963434782')
TikTokApi.video(id='7336647877168614686')
TikTokApi.video(id='7333029200418426143')
TikTokApi.video(id='7312884877811109166')
TikTokApi.video(id='7309179805361179950')
TikTokApi.video(id='7307330647427665195')
TikTokApi.video(id='7298050264441818410')
TikTokApi.video(id='7296168512408603947')
TikTokApi.video(id='7292088806361107755')
TikTokApi.video(id='7291000957708602666')
TikTokApi.video(id='7289861757915417899')
TikTokApi.video(id='7289861757915417899')
TikTokApi.video(id='7289125744494759214')
TikTokApi.video(id='7285778438940560686')
TikTokApi.video(id='7280583185115647263')
TikTokApi.video(id='7277986355668438314')
TikTokApi.video(id='7270193233441885486')
TikTokApi.video(id='7244634709748174126')
TikTokApi.video(id='7219372552878116139')
cursor: 30
cursor: 60
cursor: 90
cursor: 120

Help me, please. Here is my code:

import os
import asyncio
from TikTokApi import TikTokApi

ms_token = os.environ.get("ms_token", None)  # Set your own ms_token

async def get_user_videos(user_name, num_data=1):
    result = []
    cursor = 0
    async with TikTokApi() as api:
        await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=3, headless=False)
        user = api.user(username=user_name)
        while cursor <= num_data:
            print(f"cursor: {cursor}")
            async for video in user.videos(count=30, cursor=cursor):
                print(video)
                video_data = video.as_dict
                result.append(video_data)
            cursor += 30
    return result

def run_async_function(async_func, *args, **kwargs):
    loop = asyncio.get_event_loop()
    if loop.is_running():
        return asyncio.ensure_future(async_func(*args, **kwargs))
    else:
        return loop.run_until_complete(async_func(*args, **kwargs))

# Running the async function
if __name__ == "__main__":
    user_name = "mrbeast"
    videos = run_async_function(get_user_videos, user_name, 19000)
    print(videos)

It's not returning any paginated videos, and even the cursor seems wrong.
Do this: pip install TikTokApi==6.2.0 --force-reinstall. This seems to be the most stable solution so far.
The recent version makes you use an older version of Chromium, and you would need to pass it in as a parameter to the create_sessions method. For me personally, some accounts were simply not working, or I got a return code of "shadow banned".
Using version 6.2.0 did not give me any errors on the accounts that were supposedly "shadow banned".
EDIT: wording
Why are we passing cursor values like 31? The cursor is a long integer string.
I can confirm pagination is not working in versions 6.3.0 or 6.2.0 for api.hashtag. Looking at run_fetch_script, as long as the cursor is not 0 in the URL (i.e. in the variable js_script), the query returns no results. In other words, it only works if cursor=0 is part of the URL. The above fix using the cursor is not working either. Does anyone have any ideas?
@lhphat02 @koonn
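from time import time

# Runs inside an async function where `user` is already an api.user(...) object.
cursor = int(time() * 1000)  # start from the current time, in milliseconds
while True:
    videos = [video async for video in user.videos(count=30, cursor=cursor)]
    if not videos:
        break
    # .... your code ...
    # Continue from the creation time of the last video in this batch.
    cursor = int(videos[-1].create_time.timestamp() * 1000)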
my 2 cents: the api is kinda badly designed. if you went with the async iterator route then the iterator should hold state (i.e. the cursor) and not expose it to the user. only have the batch size as a parameter.
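To make that design point concrete, here is a minimal sketch of an async generator that keeps the cursor to itself and only exposes a batch size. It is not how TikTokApi is implemented. `fetch_page` is a hypothetical coroutine, the backend is fake, and the millisecond-timestamp cursor simply mirrors the snippet above.

```python
import asyncio
import time


async def iter_all(fetch_page, batch_size=30):
    """Yield every item while keeping the cursor inside the generator.

    fetch_page(count, cursor) is a hypothetical coroutine returning a list of
    (item_id, create_time_ms) tuples strictly older than `cursor`, newest first.
    """
    cursor = int(time.time() * 1000)  # start from "now", in milliseconds
    while True:
        batch = await fetch_page(batch_size, cursor)
        if not batch:
            return
        for item in batch:
            yield item
        cursor = batch[-1][1]  # continue from the oldest item seen so far


# Fake backend for demonstration: 70 items with descending timestamps.
FAKE = [(i, 1_000_000 - i) for i in range(70)]


async def fake_fetch(count, cursor):
    older = [item for item in FAKE if item[1] < cursor]
    return older[:count]


async def main():
    return [item_id async for item_id, _ in iter_all(fake_fetch)]


collected = asyncio.run(main())
```

The caller only writes `async for item in iter_all(...)` and never touches the cursor. One caveat of this sketch: with a strict "older than" comparison, items sharing the exact timestamp of a batch boundary could be skipped.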
Agreed, having cursor exposed to the user adds unnecessary complexity imo. I know that design decision predates the move to async generator (because I've been hiding cursor in my implementation since before the move to async for this reason, lol). Maybe it got missed when the move took place?
Hi everyone, I want to get all the TikTok video IDs from a user. This is my code:
In this line, I tried changing count=30 to count=1000, but it doesn't work:
Output:
2024-02-24 09:39:55,961 - TikTokApi.tiktok - ERROR - Got an unexpected status code: {'log_pb': {'impr_id': '202402240240321EB88FFC04941C07B40E'}, 'statusCode': 10201, 'statusMsg': '', 'status_code': 10201, 'status_msg': ''}
Is there any solution to fix this situation? Thanks.