Closed vsyw closed 4 years ago
well they've probably changed signature method
Yes, I think so. Do you have any idea on this? thank you
have a look at this: https://sf16-muse-va.ibytedtos.com/obj/rc-web-sdk-gcs/acrawler.js and this
window.byted_acrawler.init({
aid: 1988,
dfp: false,
boe: false,
intercept: true,
enablePathList: ["/api/user/detail", "/api/music/detail", "/api/item/detail", "/api/challenge/detail/", "/share/item/list", "/api/item_list/", "/api/comment/list/", "/api/comment/list/reply/", "/api/discover/user/", "/api/commit/follow/user/", "/api/recommend/user/", "/api/impression/write/", "/share/item/explore/list", "/api/commit/item/digg/", "/node/share/*", "/discover/render/*"],
});
Hi @thibaultleouay, thank you for sharing. But how exactly to use it?
They just moved away from the tac signature
They are using acrawler.js to generate the signature
it might be similar to what they are doing on douyin see
Brrrr. old info, since 2019 lots of things changed. Please do not rush to post any fast conclusions
:)
I had a quick look, but as I have no clue what Tac is, I don't think I can help :(
This code looks very cryptic to me.
I suppose you're somehow faking an authentication or something?
Also experiencing this issue. Only getting it when scraping info on posts though, still able to get a user's profile information... I'm assuming they just operate differently?
This guy seems to have a solution that works: https://github.com/carcabot/tiktok-signature
Updated 4 hours ago. Might be worth taking a look at the changes he just made
@ThibaultJanBeyer The Tac value served as a "secret/key" in the signature generation process. Before, it was rendered in the web page itself. @thibaultleouay is right about the way the signature is being generated right now. I have implemented it on my side, but no luck yet: the signature is bad
I was also able to generate a valid tac, but it only works with the new endpoints
@sharif9876 is using puppeteer and that is way too "heavy". This is the "last resort" solution
@drawrowfly Not sure if this helps you, but if you inspect TikTok's webpage, they call this file: https://sf16-muse-va.ibytedtos.com/obj/rc-web-sdk-gcs/acrawler.js
If you run that code in your own project, it exposes a new byted_acrawler.sign() method that you can pass the URL to. That is what the tiktok-signature package is using to generate a signature.
He is using puppeteer but his signature generation code is still valid here: https://github.com/carcabot/tiktok-signature/blob/master/index.js#L82
"I have implemented it on my side, but no luck yet: the signature is bad"
still working on it
I'm trying to inject the new script with jsdom on my side too.
The other TikTok signature repo also seems to have an issue with the old endpoint
Patience is very important. Love challenges :)
Thanks, waiting for this issue to be resolved
Hi folks!! I am also investigating. Do you guys know what are these red dot characters in acrawler.js script ?
Seems to be ASCII code 16 (Data Link Escape). Probably just extra characters the obfuscator added or shrunk things into; they shouldn't have any impact on the logic of the code.
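If anyone wants to check which control characters actually appear in the script, something like the sketch below could help. It is only an illustration: `countControlChars` is a hypothetical helper name, and the sample string stands in for the real contents of acrawler.js.

```javascript
// Hypothetical helper: counts non-whitespace ASCII control characters in a source string.
// Keys of the returned object are character codes (e.g. 16 for Data Link Escape).
function countControlChars(source) {
  const counts = {};
  for (const ch of source) {
    const code = ch.codePointAt(0);
    // Ignore the control characters that normally appear in source files.
    const isOrdinaryWhitespace = ch === '\n' || ch === '\r' || ch === '\t';
    if (code < 0x20 && !isOrdinaryWhitespace) {
      counts[code] = (counts[code] || 0) + 1;
    }
  }
  return counts;
}

// Stand-in for the script source; in practice you would read the file from disk.
const sampleSource = 'var a\x10=1;\nvar b\x10=2;';
console.log(countControlChars(sampleSource));
```

If only code 16 shows up, that would be consistent with the characters being leftovers from the obfuscator rather than meaningful code.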
I have a solution that allows calling byted_acrawler.sign() in Node environment natively, without puppeteer. Let me know if that helps.
I did this yesterday through jsdom, but signature is invalid
And if someone has a solution to any problem, do not ask if we need it. Test it and post the solution here, or submit a pull request!
Sorry, I just started working with TikTok yesterday, I don't know anything about it and I am just working out of experience with Instagram. I didn't know if you already knew that solution (but don't like it or whatever), so I thought I'd ask before posting.
So my solution is like this: I took acrawler.js and executed it in Node.js, which showed me that the script populates the global scope with 3 properties: TAC, oprand and byted_acrawler:
szokeptr@Peters-Mac-Pro /tmp
% node ./acrawler.js !1847
[
'global',
'clearInterval',
'clearTimeout',
'setInterval',
'setTimeout',
'queueMicrotask',
'clearImmediate',
'setImmediate',
'TAC',
'oprand',
'byted_acrawler'
]
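To see only the names a script adds, rather than the whole list of globals, one option is to diff the global property names before and after running it. This is a minimal sketch under assumptions: `newGlobalsAfter` is a hypothetical helper, and the callback below merely simulates what loading acrawler.js does.

```javascript
// Hypothetical helper: returns the names of globals created while `run` executes.
function newGlobalsAfter(run) {
  const before = new Set(Object.getOwnPropertyNames(globalThis));
  run();
  return Object.getOwnPropertyNames(globalThis).filter((name) => !before.has(name));
}

// Stand-in for executing the real script (e.g. via require or vm.runInThisContext).
const added = newGlobalsAfter(() => {
  globalThis.demo_acrawler = { sign: () => 'not-a-real-signature' };
});
console.log(added);
```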
The byted_acrawler object has a method sign that will error if called in a Node environment, since it relies on some browser-only APIs like navigator and location. If you manually provide these, the method can be called and a correct signature is made:
… contents of acrawler.js …
// Populate the global scope with missing APIs
location = {
href: 'https://www.tiktok.com/',
protocol: 'https:'
};
navigator = {
userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'
};
function signUrl(url) {
return byted_acrawler.sign({ url });
}
I tested this with the following endpoints:
https://m.tiktok.com/api/item_list/
https://www.tiktok.com/node/share/user/@username
Forgot to mention: the signed request should have the same UA as the one used to sign the request, no other headers are required (only UA).
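A rough sketch of keeping the two in sync is below. Note the assumptions: `buildSignedRequest` is a hypothetical helper, the `_signature` query parameter name is assumed and not confirmed anywhere in this thread, and the signature value is a placeholder rather than real output of byted_acrawler.sign().

```javascript
// Hypothetical helper: pairs a signed URL with the same User-Agent that was
// set on the mocked `navigator` while signing.
function buildSignedRequest(url, signature, userAgent) {
  const signedUrl = new URL(url);
  // ASSUMPTION: the signature travels as a `_signature` query parameter.
  signedUrl.searchParams.set('_signature', signature);
  return {
    url: signedUrl.toString(),
    // Only the User-Agent header is set, matching the note above.
    headers: { 'User-Agent': userAgent },
  };
}

// Use one constant for both the navigator mock and the outgoing request.
const SIGNING_UA = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36';
const signedRequest = buildSignedRequest(
  'https://m.tiktok.com/api/item_list/',
  'placeholder-signature',
  SIGNING_UA
);
console.log(signedRequest.url, signedRequest.headers);
```

The resulting `url` and `headers` can be handed to whatever HTTP client you already use.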
The signature won't be valid
Actually it is valid in some way
The server accepts it with the endpoints I listed above.
I've tested with the old API. It actually accepts 1 request out of 3. Weird. Anyway, I will update the repo shortly
@szokeptr Although your solution works for some endpoints, other endpoints require more complex properties (createElement, HTMLCanvasElement, GPUINFO, plugins) in order to generate a signature; you would need to implement those properties into the global scope as well, which is a hassle. It seems like some jsdom implementation is the solution to this, but I have not been able to generate any valid signatures for an endpoint other than the trending page.
I will push a working version within a few hours
@EmilioBarradas I only need the two mentioned endpoints at this moment, but I will continue investigating since it was a lot of fun 😃
well working in new version ! wow @drawrowfly
@szokeptr The only thing I have learned from TikTok is that there are too many bugs and daily updates that mess up all the tests. If you hadn't mentioned that your tests are working, I would have abandoned this approach, as it didn't work for me before!!!
Cheers!
PS
A new update was pushed, but there are caveats: 1. The code is a bit ugly for now. 2. The tests require an update. 3. Scraping data from the hashtag feed won't be stable, as I wasn't able to find the new endpoint for the hashtag feed like I did for the music feed, for example.
@szokeptr I am wondering, how did you figure out that the global vars location and navigator were needed?
This is similar to the previous signature method. They are using the user agent to bind the signature to the browser
@charlyBerthet since I knew that the sign method worked perfectly in browser but not in Node env, I created recursive mocks of the browser APIs in the global scope using the awesome JS Proxy and logged all property access attempts.
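For anyone curious what that could look like, here is a minimal sketch of a recursive logging mock built on Proxy. It is not necessarily the exact code used above: `createLoggingMock` and `accessLog` are hypothetical names, and the property reads at the end stand in for whatever the sign method touches.

```javascript
// Hypothetical helper: returns a mock whose every property read (and call)
// is recorded, and which hands back another mock so deep chains keep working.
function createLoggingMock(name, accessLog) {
  // An arrow function target lets the mock be both read from and called.
  const target = () => {};
  return new Proxy(target, {
    get(_target, prop) {
      // Skip symbol lookups (e.g. Symbol.toPrimitive) to keep the log readable.
      if (typeof prop === 'symbol') return undefined;
      const path = `${name}.${prop}`;
      accessLog.push(path);
      return createLoggingMock(path, accessLog);
    },
    apply() {
      const path = `${name}()`;
      accessLog.push(path);
      return createLoggingMock(path, accessLog);
    },
  });
}

const accessLog = [];
const navigatorMock = createLoggingMock('navigator', accessLog);

// Stand-in for the code under test touching browser APIs.
navigatorMock.userAgent;
navigatorMock.plugins.length;

console.log(accessLog);
// → [ 'navigator.userAgent', 'navigator.plugins', 'navigator.plugins.length' ]
```

Assigning such mocks to the global names before loading the script, then reading the log, shows which APIs need a real value.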
@drawrowfly I see that you implemented something similar to what I mentioned. Does that mean that this works for more than those 2 endpoints? (I don’t actively use your module, so I am not sure how everything works)
@szokeptr Just tested with '/api/user/detail/' and '/api/music/detail/' and it doesn't seem like it does :(
Edit: For context, this is what I am using: test.js
@szokeptr
Basically all Web API endpoints that require the signature
Old API endpoints can be signed, but it won't work the way you expect it to: it will randomly return an empty response, which means that something is wrong on their side :)
@drawrowfly What are the new endpoints?
Old API endpoints can be signed, but it won't work the way you expect it to: it will randomly return an empty response, which means that something is wrong on their side :)
So TikTok has a bug in their code now and you expect it to be resolved soon? :D How do you know it's not some anti-scraping shit?
The only "anti-scraping shit" is captcha
And scraper is working fine except hashtag feed!
And it's not a bug, it's "TikTok" :). There were lots of times when the API behavior was random
Every few weeks they update UI and that can affect the API
@EmilioBarradas just the usual stuff, the ones that are used right now in TikTok Web, you can explore them through inspector
u're awesome bro @drawrowfly ..
I see, thanks for the clarification, great work! :)
@drawrowfly Thanks for your great tool.
And scraper is working fine except hashtag feed!
So did you find any hints to configure the hashtag feed?
Which invocation method is the most similar for scraping generic posts? Trending? Thank you in advance
@geco what do you mean? I don't follow you on this. Please explain more. My question is about the fact that, at the moment, the search for a hashtag is not stable. We need to scrape at least 3 times or more to fetch the list of media objects tagged with a specific hashtag.
I see. There is a way using "search for hashtag" (3 times, or as many as we need to get results each time), but does it give new results at every invocation? Right now I keep getting the same results.
Error message: "Can't extract Tac value" when use