Closed CapitanLiteral closed 3 years ago
I fill the auth.json but when I start the script it says that I should refresh the page. Does this happen to anyone else? Is this a bug or I'm doing something wrong?
Just started doing it for me again also. Hold tight, it will get looked at.
The fourth part of the sign is what's changing (previously 608c48da, now 608fcf2c). It looks like a hexadecimal epoch timestamp in seconds. I don't have the previous vendor.js to look back at, but this is what I've observed from the new file:
- 608fcf2c converted to decimal and then analyzed as a timestamp is 2021-05-03 10:23:40 UTC
- The vendor.js full file name from Chrome developer tools is vendor.js?rev=202105031022-6ba1f1c47b
It looks like the revision number and the hex code that is being used are similar but off by a minute, so maybe this is just a coincidence. Just an observation.
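The observation above can be checked with a few lines of Python. This is only a sketch: the helper names are made up for illustration, and it assumes the YYYYMMDDHHMM prefix of the rev string is in UTC, which the thread does not confirm.

```python
from datetime import datetime, timezone


def hex_to_utc(hex_seconds: str) -> datetime:
    """Interpret a hex string as a Unix epoch timestamp in seconds (UTC)."""
    return datetime.fromtimestamp(int(hex_seconds, 16), tz=timezone.utc)


def rev_to_utc(rev: str) -> datetime:
    """Parse the leading YYYYMMDDHHMM part of a rev string (assumed to be UTC)."""
    stamp = rev.split("-", 1)[0]
    return datetime.strptime(stamp, "%Y%m%d%H%M").replace(tzinfo=timezone.utc)


sign_time = hex_to_utc("608fcf2c")
rev_time = rev_to_utc("202105031022-6ba1f1c47b")
gap_seconds = (sign_time - rev_time).total_seconds()

print(sign_time.isoformat())  # 2021-05-03T10:23:40+00:00
print(gap_seconds)            # 100.0
```

Under that assumption the two values are 100 seconds apart, which fits the "off by a minute" remark.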
I was going to say if 4: is the day and :608 is the revision then it might be dynamically generated?
The first number so far does seem to be incrementing, but is it by day? It seems the 3: lasted both Saturday and Sunday.
Yeah, but on the 3rd they changed the number to 4 - maybe today they'll change it to 5.
This just happened, lol
@DIGITALCRIMINAL, @hippothon we back at it again.
4 = 5
608fcf2c = 6091065f
yoCCrPVmrN27vlUEQLZcW3DZH97KRVoy = Sx7FEcC7r5uKuCIzVljwS8gnZGhNprM5
need a new checksum :(
You called it brother.
Well at least we know it's automatically updated every 24 hours and not some guy on a computer updating it manually lmao.
Yep, it went again
I think they are detecting use of this tool somehow
They aren't detecting that people are using it, they are probably aware of this page and have their own copy of the scraper and are testing it to see what breaks it, how it gets fixed, how quick it gets fixed etc etc.
So is it borked for the time being? I've tried the updated login thingo (I'm real good with the technical lingo), it eventually lets me log in through the geckodriver browser but I then just get:
Scraping Paid Content
Scraping Subscriptions
There's nothing to scrape.
Archive Completed in 2.23 Minutes
Try setting your max_threads to 1 and not -1
Already done and no dice. Patience is a virtue I guess, and I am just a grateful patient pleb.
I have an idea: what if we run Selenium Wire just to get the correct header?
I mean, we already use Selenium to generate the sign, and with Selenium Wire everything is easy to intercept.
I think this can be done, but which parameters does the sign depend on?
Sounds like a good idea but the problem is for the people who aren't tech savvy and can't get all these things going. Just a thought you know?
Forgive my ignorance, but how would Selenium help in getting the header for the request? It is computed by the OnlyFans site itself and changes every day. How would we extract the computing function from their source code just by using Selenium?
Well, the sign is generated for every request; for example, it is also generated for the login page.
It is based on the time (not a problem) and on other cached variables such as auth-id and xbc, which we can edit before letting the JavaScript calculate the right headers. The only problem would be if the URL is also used in calculating the sign.
So we open our Selenium session on OnlyFans, edit our cache, and refresh the page; before the request is sent we read the sign value, then abort the request.
Now we have the sign to use in our own requests.
This way we would always have the current version of the sign, since all we are doing is running the latest JavaScript from the OnlyFans servers.
I will try to build it and let you know where I get stuck, but this seems to be the only way to always have up-to-date headers without moving everything to Selenium.
@dearcoding the url path is used in calculating the sha1 hash, and the hash is then used to compute the third parameter of the sign, so it’s not a value you can get once and then use it multiple times...
Yeah this means if we find the way to manipulate the url used in the calculation the job is done.
The url used is not the browser’s url, it’s a url handled internally to query the api depending on the content currently being displayed on the site and the new content being required by the user while scrolling down a page...
Not saying it can’t be done, just saying it would be kind of messy and error prone
Got this error too. Sadly I was busy working when the script worked.
I don't know, it's just a frontend; this kind of control is usually handled by the backend.
so correct me if I'm wrong @DIGITALCRIMINAL, @hippothon & @trevdilley but this appears to be what our daily challenge will be in figuring out the sign:
sign example - 5:########################################:932:6091065f
5: <--- appears to increment daily
########################################: <--- created from a combo of a random daily static string/salt, epoch timestamp, API path, and userId, all separated with "\n" and converted to sha1
932: <--- sha1 checksum
6091065f <--- hex conversion of vendor.js revision epoch timestamp
I wish I could be more help but I'm not very good at JS and def don't know how to reverse engineer it just doing my part to help
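To make the four-field layout described above concrete, here is a minimal sketch that only splits an existing sign string into its parts. The `SignParts` and `parse_sign` names are hypothetical, the 40 zeros stand in for the masked hash, and reading the third and fourth fields as hexadecimal is an assumption based on the `{:x}` format and the timestamp observation elsewhere in this thread.

```python
from typing import NamedTuple


class SignParts(NamedTuple):
    version: int      # first field, appears to increment daily
    sha1_hex: str     # second field, 40-character SHA-1 hex digest
    checksum: int     # third field, assumed to be hexadecimal
    timestamp: int    # fourth field, assumed hex epoch seconds


def parse_sign(sign: str) -> SignParts:
    """Split a sign header value into its four colon-separated fields."""
    version, sha1_hex, checksum_hex, timestamp_hex = sign.split(":")
    return SignParts(
        int(version), sha1_hex, int(checksum_hex, 16), int(timestamp_hex, 16)
    )


# Placeholder hash (40 zeros) in place of the masked value in the example.
example = "5:" + "0" * 40 + ":932:6091065f"
parts = parse_sign(example)
print(parts.version, parts.checksum, parts.timestamp)  # 5 2354 1620117087
```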
For the guy who constantly reduces the function, here it is the new one isolated, it computes the third parameter of the sign based on the SHA1 hash:
https://jsfiddle.net/85fyu9wc/
New "constants":
var str1 = "Sx7FEcC7r5uKuCIzVljwS8gnZGhNprM5";
var str2 = "6091065f";
var constNumber = 5;
I have doubts that their change is automated.
Yesterday it stopped accepting the old signature around 17:00 UTC, with the js being updated around 10:22 UTC (from filename). Today it stopped accepting the old signature around 09:00 UTC, with the js being updated around 08:30 UTC.
The discrepancy in client-side and server-side changes and the difference in times on different days tells me some guy was tasked with watching this github and trying to stay ahead of it.
Anyway the changes are
static_param = "Sx7FEcC7r5uKuCIzVljwS8gnZGhNprM5"
# Positions in the SHA-1 hash whose byte values are summed (sha_1_b and sha_1_sign are defined elsewhere).
checksum_indexes = [15, 37, 6, 9, 13, 34, 17, 14, 1, 37, 14, 18, 24, 28, 1, 31,
                    13, 14, 15, 19, 9, 29, 30, 23, 16, 13, 28, 35, 15, 23, 28, 39]
checksum = sum(sha_1_b[i] for i in checksum_indexes) - 112
headers["sign"] = "5:{}:{:x}:6091065f".format(sha_1_sign, abs(checksum))
Or for anyone else using JS:
hash.charCodeAt(15) +
hash.charCodeAt(37) +
hash.charCodeAt(6) +
hash.charCodeAt(9) +
hash.charCodeAt(13) +
hash.charCodeAt(34) +
hash.charCodeAt(17) +
hash.charCodeAt(14) +
hash.charCodeAt(1) +
hash.charCodeAt(37) +
hash.charCodeAt(14) +
hash.charCodeAt(18) +
hash.charCodeAt(24) +
hash.charCodeAt(28) +
hash.charCodeAt(1) +
hash.charCodeAt(31) +
hash.charCodeAt(13) +
hash.charCodeAt(14) +
hash.charCodeAt(15) +
hash.charCodeAt(19) +
hash.charCodeAt(9) +
hash.charCodeAt(29) +
hash.charCodeAt(30) +
hash.charCodeAt(23) +
hash.charCodeAt(16) +
hash.charCodeAt(13) +
hash.charCodeAt(28) +
hash.charCodeAt(35) +
hash.charCodeAt(15) +
hash.charCodeAt(23) +
hash.charCodeAt(28) +
hash.charCodeAt(39) +
-112
Ahh true, lmao. Since you're here, how is the -112 calculated?
If that's the case, then we must change it thousands of times until they get tired of changing it. It's not healthy to constantly change a production site's frontend and backend for reasons unrelated to improving the user experience or providing new features. Unless it's automatically generated, they must stop at some point.
yeah lets just wear this dude down :)
I love how everybody comes together to solve the problems.
You guys are fucking Wizards.
The initial code after deobfuscation looks something like this:
hash.charCodeAt(15) - 142 +
hash.charCodeAt(37) - 124 +
hash.charCodeAt(6) - 147 +
hash.charCodeAt(9) - 84 +
hash.charCodeAt(13) + 83 +
hash.charCodeAt(34) - 101 +
hash.charCodeAt(17) - 76 +
hash.charCodeAt(14) + 124 +
hash.charCodeAt(1) + 107 +
hash.charCodeAt(37) + 151 +
hash.charCodeAt(14) - 147 +
hash.charCodeAt(18) - 79 +
hash.charCodeAt(24) + 90 +
hash.charCodeAt(28) - 59 +
hash.charCodeAt(1) + 121 +
hash.charCodeAt(31) - 98 +
hash.charCodeAt(13) + 119 +
hash.charCodeAt(14) - 77 +
hash.charCodeAt(15) - 84 +
hash.charCodeAt(19) - 72 +
hash.charCodeAt(9) + 139 +
hash.charCodeAt(29) + 121 +
hash.charCodeAt(30) - 79 +
hash.charCodeAt(23) + 135 +
hash.charCodeAt(16) - 83 +
hash.charCodeAt(13) + 69 +
hash.charCodeAt(28) - 83 +
hash.charCodeAt(35) + 89 +
hash.charCodeAt(15) - 98 +
hash.charCodeAt(23) - 76 +
hash.charCodeAt(28) + 148 +
hash.charCodeAt(39) + 101
After I posted that the first time someone pointed out it makes more sense to simplify the math so you just add all of those numbers together.
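The simplification described above can be verified directly. The sketch below transcribes the (index, offset) pairs from the deobfuscated expression and checks that folding all offsets into a single constant gives the same result. The function names and the sample input are illustrative only.

```python
import hashlib

# (index, offset) pairs transcribed from the deobfuscated expression above.
TERMS = [
    (15, -142), (37, -124), (6, -147), (9, -84), (13, 83), (34, -101),
    (17, -76), (14, 124), (1, 107), (37, 151), (14, -147), (18, -79),
    (24, 90), (28, -59), (1, 121), (31, -98), (13, 119), (14, -77),
    (15, -84), (19, -72), (9, 139), (29, 121), (30, -79), (23, 135),
    (16, -83), (13, 69), (28, -83), (35, 89), (15, -98), (23, -76),
    (28, 148), (39, 101),
]

# All per-term offsets collapse into one constant.
net_offset = sum(offset for _, offset in TERMS)


def checksum_verbose(hash_hex: str) -> int:
    """Add each character code together with its own offset."""
    return sum(ord(hash_hex[index]) + offset for index, offset in TERMS)


def checksum_simplified(hash_hex: str) -> int:
    """Add the character codes first, then apply the combined offset once."""
    return sum(ord(hash_hex[index]) for index, _ in TERMS) + net_offset


# Any 40-character hex digest works for comparing the two forms.
sample = hashlib.sha1(b"sample input for comparison").hexdigest()
print(net_offset)  # -112
print(checksum_verbose(sample) == checksum_simplified(sample))  # True
```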
Can verify it's all working again, except max_threads being anything other than "1"
@hippothon
Awesome. But where did you get the salt and the SHA-1 byte order? Is it possible to parse them from the JS file (with a regular expression) using static requests?
@DIGITALCRIMINAL might as well keep this issue open. talk to you guys in a day....
I was just thinking the same thing... see you guys again tomorrow
Take it easy, Pal.
Hey there. Long time listener, first time caller. I'm confused. I'm still getting the same error as an hour ago.. it's working on your end?
Use the latest commit.
Ahh... of course. Thank you
Got it working again. But it's only showing 12 of the 100+ subscriptions I have, and after each scrape it hangs on the downloading messages part.
I am using the latest commit but am still having the refresh page issue on my end.
Managed to sign in but when I tried to scrape, I'm getting this
Type: Profile
0.00B [00:00, ?B/s]
Type: Stories
No Stories Found.
Type: Posts
Scrape Attempt: 1/100
Missing 100 Posts... Retrying...
Scrape Attempt: 2/100
Missing 50 Posts... Retrying...
Scrape Attempt: 3/100
Missing 50 Posts... Retrying...
Scrape Attempt: 4/100
Can verify it's all working again, except max_threads being anything other than "1"
Did this and I was getting the above so I changed it to 1 and it's working again.
I had this issue, updated to the latest version and it works again, however:
One onlyfans model had a post mixed with images and videos. The images downloaded fine, the videos didn't.
I tried re-running the script and now it appears to be hanging on that model's videos with "0it [00:00, ?it/s]"
I've seen something about max_threads being some kind of cure-all, but where does it get set? I have it set in the onlyfans section of the config.json dict, is that right?
I'm at a good point in using Selenium to get the sign value.
It's actually a good, working approach as far as I have tested it.
The bot will become slower, because I need 3 to 5 seconds to generate the sign for every request, but this way they can update their algorithm as much as they want and I will always have the right sign.
I won't run it completely in Selenium, because Selenium is shit: memory heavy and hard to manage...
I use Selenium only for the sign calculation.
@salamihawk You change it at Line 8 in config.json.
Awesome. But where did you get the salt and the SHA-1 byte order? Is it possible to parse them from the JS file (with a regular expression) using static requests?
It's just all from removing their obfuscation. For example this highlighted bit:
Translates to
e.charCodeAt(2434 % e.length) - 101
Or since we know it's SHA1 that would be
e.charCodeAt(2434 % 40) - 101
Which in the snippet I posted above is then
e.charCodeAt(34) - 101
It's likely possible to do it statically, I have it 99% automated using regex but it runs in the browser so I can use their text replacement functions without having to copy them and mess with them myself. The risk with regex is they change something substantially and you have to start over again.
Realistically, right now it takes me about 5 minutes to update my code after noticing it's broken. If it keeps changing every day I might be more motivated to fully automate it, but that's up to them ¯\_(ツ)_/¯
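The index reduction in the example above is plain modular arithmetic. As a small sketch (the `reduce_index` helper is a hypothetical name), any large index wraps around the 40-character SHA-1 hex digest:

```python
import hashlib

# A SHA-1 digest is 20 bytes, so its hex form is always 40 characters long.
SHA1_HEX_LENGTH = len(hashlib.sha1(b"").hexdigest())


def reduce_index(raw_index: int, length: int = SHA1_HEX_LENGTH) -> int:
    """Wrap an arbitrary index into the valid range of the hash string."""
    return raw_index % length


print(SHA1_HEX_LENGTH)     # 40
print(reduce_index(2434))  # 34
```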
So are we just waiting on a new build?
Pretty good. Thanks!
Gotcha, thanks... I was still working with an old file from an old version before the auth config got split off to .profiles
Still seems to hang at the same spot though
So I think it's obvious that they're watching this page since they just pushed an update that tries to interfere with you using devtools. I'll continue to share changes but I'd advise against anyone sharing specific methods.
I don't have access to my code right now but will post the update for version 6 later.
I have a private method to compute the sign using Selenium. I want to help the community, but I don't want to publish it here since it would get patched quickly.
If anyone is interested, please contact me by email (you can find it on my profile).
What??? Those bastards!