Closed BillyBSig closed 9 months ago
I ran your code and it worked fine for me. Maybe upgrade your Python to 3.10? That's what I'm using.
I have tried with Python 3.10 and 3.11, but the result is still the same: 'NoneType' object has no attribute 'keys'.
I don't get any result from the get_tiktok_json function, because this call returns None:
soup.find('script', attrs={'id':"SIGI_STATE"})
Have you ever opened TikTok on Firefox? If not, maybe you don't have the necessary cookies?
That looks like a cookie issue to me. I was going to suggest trying another browser on your system, again ensuring you've opened TikTok on it at least once.
I'm also experiencing the same issue. I'm running Python 3.10 and have tried it with all three browsers, i.e. Chrome, Firefox and Edge.
Do you have a TT account and is the browser on or off when you run the program? Also please list your OS and version.
Yes, I have opened TikTok in both Firefox and Chrome.
Yes, I have a TT account. I tried it both signed in and signed out of my TT account, and the browser was open when I ran the code. I'm using Ubuntu 20.04.6 LTS.
I have run the code separately, and this is the result
OK, please try the following troubleshooting steps:
SIGI_STATE
Also please try the code using pyk.specify_browser("chrome") rather than pyk.specify_browser("firefox") if you haven't already. Please report whether it returns the same error or a different one.
Finally, I would try it on a non-Ubuntu OS to see if that's the issue. I know Pyktok works on Mac and Windows, but I don't have access to Ubuntu, so I'm not sure what the issue would be there.
I found the problem.
SIGI_STATE is not in my video page source, in either Firefox or Chrome, and I also tried this on Windows.
My video page's source uses __UNIVERSAL_DATA_FOR_REHYDRATION__ instead. I don't know why it's different from yours.
The structure and key names in the video page source have also changed, so I couldn't use the existing code.
As a workaround, I needed to make some changes to the code.
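For anyone who wants to check which of the two script IDs their own page source contains, here is a minimal sketch. It is not part of Pyktok: the name `list_script_ids` is made up for illustration, and it uses a plain regular expression from the standard library instead of BeautifulSoup, so it is only a rough diagnostic and may miss unusually formatted tags.

```python
import re

# Matches the id attribute of a <script ...> opening tag (single- or double-quoted).
# Hypothetical helper, not a Pyktok function.
SCRIPT_ID_RE = re.compile(r'<script\b[^>]*?\sid=["\']([^"\']+)["\']', re.IGNORECASE)


def list_script_ids(page_html):
    """Return the id of every <script> tag that has one, in document order."""
    return SCRIPT_ID_RE.findall(page_html)
```

Calling it on the text of the response (e.g. `tt.text` in the code below) should show whether SIGI_STATE, __UNIVERSAL_DATA_FOR_REHYDRATION__, or both are present.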
##similar in pyktok code
def get_tiktok_json(video_url,browser_name=None):
    if 'cookies' not in globals() and browser_name is None:
        raise BrowserNotSpecifiedError
    global cookies
    if browser_name is not None:
        cookies = getattr(browser_cookie3,browser_name)(domain_name='www.tiktok.com')
    tt = requests.get(video_url,
                      headers=headers,
                      cookies=cookies,
                      timeout=20)
    # retain any new cookies that got set in this request
    cookies = tt.cookies
    soup = BeautifulSoup(tt.text, "html.parser")
    tt_script = soup.find('script', attrs={'id':"SIGI_STATE"})
    try:
        tt_json = json.loads(tt_script.string)
    except AttributeError:
        print("The function encountered a downstream error and did not deliver any data, which happens periodically for various reasons. Please try again later.")
        return
    return tt_json
##alternative get_tiktok_json
def alt_get_tiktok_json(video_url,browser_name=None):
    if 'cookies' not in globals() and browser_name is None:
        raise BrowserNotSpecifiedError
    global cookies
    if browser_name is not None:
        cookies = getattr(browser_cookie3,browser_name)(domain_name='www.tiktok.com')
    tt = requests.get(video_url,
                      headers=headers,
                      cookies=cookies,
                      timeout=20)
    # retain any new cookies that got set in this request
    cookies = tt.cookies
    soup = BeautifulSoup(tt.text, "html.parser")
    tt_script = soup.find('script', attrs={'id':"__UNIVERSAL_DATA_FOR_REHYDRATION__"})
    try:
        tt_json = json.loads(tt_script.string)
    except AttributeError:
        print("The function encountered a downstream error and did not deliver any data, which happens periodically for various reasons. Please try again later.")
        return
    return tt_json
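The two functions above differ only in the script ID they look for, so the lookup could also be written as one fallback. The following is a rough sketch under a few assumptions: it works on an already-downloaded HTML string, it uses the standard library's `html.parser` rather than BeautifulSoup, and the names `load_page_json` and `CANDIDATE_IDS` are invented for this example. It tries SIGI_STATE first and falls back to __UNIVERSAL_DATA_FOR_REHYDRATION__.

```python
import json
from html.parser import HTMLParser

# Script IDs mentioned in this thread, tried in this order.
CANDIDATE_IDS = ("SIGI_STATE", "__UNIVERSAL_DATA_FOR_REHYDRATION__")


class _ScriptBodies(HTMLParser):
    """Collect the text of every <script> tag whose id is a candidate."""

    def __init__(self):
        super().__init__()
        self.bodies = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_id = dict(attrs).get("id")
            if script_id in CANDIDATE_IDS:
                self._current = script_id
                self.bodies[script_id] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.bodies[self._current] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._current = None


def load_page_json(page_html):
    """Return (script_id, parsed_json) for the first usable candidate, else (None, None)."""
    parser = _ScriptBodies()
    parser.feed(page_html)
    parser.close()
    for script_id in CANDIDATE_IDS:
        body = parser.bodies.get(script_id, "").strip()
        if not body:
            continue
        try:
            return script_id, json.loads(body)
        except json.JSONDecodeError:
            continue  # unusable content, try the next candidate
    return None, None
```

Returning the matched ID alongside the data lets the caller decide which key layout to expect, instead of inferring it from a None result.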
##save_tiktok adding condition
def save_tiktok(video_url,
                save_video=True,
                metadata_fn='',
                browser_name=None):
    if 'cookies' not in globals() and browser_name is None:
        raise BrowserNotSpecifiedError
    if save_video == False and metadata_fn == '':
        print('Since save_video and metadata_fn are both False/blank, the program did nothing.')
        return

    tt_json = get_tiktok_json(video_url,browser_name)

    ##check if tt_json not None by using get_tiktok_json
    if tt_json is not None:
        video_id = list(tt_json['ItemModule'].keys())[0]

        if save_video == True:
            regex_url = re.findall(url_regex, video_url)[0]
            if 'imagePost' in tt_json['ItemModule'][video_id]:
                slidecount = 1
                for slide in tt_json['ItemModule'][video_id]['imagePost']['images']:
                    video_fn = regex_url.replace('/', '_') + '_slide_' + str(slidecount) + '.jpeg'
                    tt_video_url = slide['imageURL']['urlList'][0]
                    headers['referer'] = 'https://www.tiktok.com/'
                    # include cookies with the video request
                    tt_video = requests.get(tt_video_url, allow_redirects=True, headers=headers, cookies=cookies)
                    with open(video_fn, 'wb') as fn:
                        fn.write(tt_video.content)
                    slidecount += 1
            else:
                regex_url = re.findall(url_regex, video_url)[0]
                video_fn = regex_url.replace('/', '_') + '.mp4'
                tt_video_url = tt_json['ItemModule'][video_id]['video']['downloadAddr']
                headers['referer'] = 'https://www.tiktok.com/'
                # include cookies with the video request
                tt_video = requests.get(tt_video_url, allow_redirects=True, headers=headers, cookies=cookies)
                with open(video_fn, 'wb') as fn:
                    fn.write(tt_video.content)
            print("Saved video\n", tt_video_url, "\nto\n", os.getcwd())

        if metadata_fn != '':
            data_slot = tt_json['ItemModule'][video_id]
            data_row = generate_data_row(data_slot)
            try:
                user_id = list(tt_json['UserModule']['users'].keys())[0]
                data_row.loc[0,"author_verified"] = tt_json['UserModule']['users'][user_id]['verified']
            except Exception:
                pass
            if os.path.exists(metadata_fn):
                metadata = pd.read_csv(metadata_fn,keep_default_na=False)
                combined_data = pd.concat([metadata,data_row])
            else:
                combined_data = data_row
            combined_data.to_csv(metadata_fn,index=False)
            print("Saved metadata for video\n",video_url,"\nto\n",os.getcwd())

    ##This is using alt_get_tiktok_json
    else:
        tt_json = alt_get_tiktok_json(video_url,browser_name)
        regex_url = re.findall(url_regex, video_url)[0]
        video_fn = regex_url.replace('/', '_') + '.mp4'
        tt_video_url = tt_json["__DEFAULT_SCOPE__"]['webapp.video-detail']['itemInfo']['itemStruct']['video']['downloadAddr']
        headers['referer'] = 'https://www.tiktok.com/'
        # include cookies with the video request
        tt_video = requests.get(tt_video_url, allow_redirects=True, headers=headers, cookies=cookies)
        with open(video_fn, 'wb') as fn:
            fn.write(tt_video.content)

        if metadata_fn != '':
            data_slot = tt_json["__DEFAULT_SCOPE__"]['webapp.video-detail']['itemInfo']['itemStruct']
            data_row = generate_data_row(data_slot)
            try:
                user_id = list(tt_json['UserModule']['users'].keys())[0]
                data_row.loc[0,"author_verified"] = tt_json["__DEFAULT_SCOPE__"]['webapp.video-detail']['itemInfo']['itemStruct']['author']
            except Exception:
                pass
            if os.path.exists(metadata_fn):
                metadata = pd.read_csv(metadata_fn,keep_default_na=False)
                combined_data = pd.concat([metadata,data_row])
            else:
                combined_data = data_row
            combined_data.to_csv(metadata_fn,index=False)
            print("Saved metadata for video\n",video_url,"\nto\n",os.getcwd())
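Since save_tiktok now has to read the same per-video data from two different key paths, one option is to normalize the lookup in a small helper. This is only a sketch: `find_item` is a hypothetical name, and the key paths are simply the ones used in the code above, which may change again if TikTok alters its page source. It returns None instead of raising when neither layout is present, including when the JSON itself is None.

```python
def find_item(tt_json):
    """Return the per-video dict from either layout, or None if neither is present."""
    if not isinstance(tt_json, dict):
        return None

    # SIGI_STATE layout: ItemModule maps video id -> item; take the first entry.
    item_module = tt_json.get("ItemModule")
    if isinstance(item_module, dict) and item_module:
        return next(iter(item_module.values()))

    # __UNIVERSAL_DATA_FOR_REHYDRATION__ layout: walk the nested path step by step.
    node = tt_json
    for key in ("__DEFAULT_SCOPE__", "webapp.video-detail", "itemInfo", "itemStruct"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node
```

With a helper like this, the download URL lookup in both branches would reduce to reading `['video']['downloadAddr']` from the returned dict after checking that it is not None.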
Glad you found a solution. If you have time, may I suggest adding a pull request so I can update Pyktok? If not, I'll do it when time permits. I will of course credit you on the main page. Anyone else seeing this can also feel free to copy the code and do a PR.
Can I also ask what country you live in? I'm guessing the difference in the script IDs might have something to do with that.
@BillyBSig, does the following actually work with the data in __UNIVERSAL_DATA_FOR_REHYDRATION__?
video_id = list(tt_json['ItemModule'].keys())[0]
tt_json['ItemModule'][video_id]['video']['downloadAddr']
I always get both IDs, but only the data in SIGI_STATE is actually useful (I observed this in several countries in Europe).
Sure, I have opened a new pull request. Thanks for taking the time to review it.
I live in Indonesia. Yes, I guess so; maybe some regions get different scripts.
Unfortunately no. The data in __UNIVERSAL_DATA_FOR_REHYDRATION__ has a different structure, and there is no ItemModule in it.
Last month I tried with SIGI_STATE and everything worked fine, but suddenly the script changed to __UNIVERSAL_DATA_FOR_REHYDRATION__. Maybe the scripts differ between regions.
Merged in the PR implementing the fix for this.
Hi, do you have a solution for this issue?
import pyktok as pyk
pyk.specify_browser("firefox")
pyk.save_tiktok("https://www.tiktok.com/@vantoan___/video/7294298719665622305?is_from_webapp=1&sender_device=pc", True, 'video_data.csv', browser_name = "firefox")