bernd-wechner / Degoo

CLI tool(s) for working with Degoo cloud storage

403 Error when trying to login #42

Closed markdid closed 10 months ago

markdid commented 2 years ago

I tried installing everything but I get this error

Login failed with: 403: Forbidden
Login failed.

preethaml7 commented 2 years ago

I am getting the same error as well

bernd-wechner commented 2 years ago

Haven't looked this way in months alas. Been too busy. But decided it's Degoo night today, so taking a look at things. Just tried this myself and got the same response. Bummer. Will take a closer look.

bernd-wechner commented 2 years ago

Well, I have banged my head up against this for a while now and it is alas too hard to solve today. We will need some deeper insights into Cloudflare and/or a really good low level request debugger. I have experimented with Fiddler to little avail in the past alas. But to be clear and document the issue more technically and in a way that might permit any erudite reader to comment or pursue a diagnosis, here is what the problem is.

Historically this reverse engineering, and indeed most like it of undocumented web APIs, is based upon watching the Network tab in a browser and reproducing the requests seen therein. There are two main browser families of interest, Firefox and Chrome.

Each of these browsers has, in its developer tools, a Network tab on which all requests and responses can be read. Each request can be copied in a number of formats, but a useful one for quick testing is the curl format. When logging in with Firefox (successfully) the request the browser sends is:

curl -i 'https://rest-api.degoo.com/login' -X POST -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:98.0) Gecko/20100101 Firefox/98.0' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' -H 'Accept-Encoding: gzip, deflate, br' -H 'Content-Type: application/json' -H 'Referer: https://app.degoo.com/' -H 'Origin: https://app.degoo.com' -H 'Sec-Fetch-Dest: empty' -H 'Sec-Fetch-Mode: cors' -H 'Sec-Fetch-Site: same-site' -H 'DNT: 1' -H 'Sec-GPC: 1' -H 'Connection: keep-alive' --data-raw '{"Username":"myemail_redacted","Password":"mypassword_redacted","GenerateToken":true}'

and similarly Chrome on a successful login reports this request was sent:

curl -i 'https://rest-api.degoo.com/login' \
  -H 'sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="100", "Google Chrome";v="100"' \
  -H 'Referer: https://app.degoo.com/' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36' \
  -H 'sec-ch-ua-platform: "Linux"' \
  -H 'Content-Type: application/json' \
  --data-raw '{"Username":"myemail_redacted","Password":"mypassword_redacted","GenerateToken":true}' \
  --compressed
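
For comparison, here is a minimal sketch of the same login request assembled with Python's standard library. It is not the code this project uses; the helper name `build_login_request` is hypothetical, and the URL, headers and JSON keys are simply copied from the curl commands above. It only builds the request object and does not send it.

```python
import json
import urllib.request

# Endpoint taken from the curl commands above.
LOGIN_URL = "https://rest-api.degoo.com/login"


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Assemble (but do not send) a POST that mirrors the browser's login request."""
    payload = {"Username": username, "Password": password, "GenerateToken": True}
    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:98.0) Gecko/20100101 Firefox/98.0",
        "Accept": "*/*",
        "Content-Type": "application/json",
        "Origin": "https://app.degoo.com",
        "Referer": "https://app.degoo.com/",
    }
    return urllib.request.Request(
        LOGIN_URL,
        # Compact separators give the same JSON layout as the browser's body.
        data=json.dumps(payload, separators=(",", ":")).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_login_request("myemail_redacted", "mypassword_redacted")
# Sending it would be: urllib.request.urlopen(request)
```

Matching the method, URL, headers and body in this way is the level at which the browser's request gets replicated; nothing here controls what happens below HTTP.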

It is by replicating these requests that this API was reverse engineered in the first place. Now there is a longstanding problem that we know of and that has oft been reported by users, and it is that either of these requests issued at the command line produces:

HTTP/2 429 
date: Wed, 30 Mar 2022 11:11:28 GMT
content-length: 0
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
vary: Accept-Encoding
server: cloudflare
cf-ray: 6f40679b3a785ac8-MEL

429 is a Too many requests error: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429

The login is rejected, and from this we can divine that Degoo is hiding behind a cloudflare firewall and it is cloudflare that is rejecting us.

The puzzle to solve is why? It lets Firefox log in and Chrome, but not command line curl.

As cloudflare can only see what is transmitted on the wire (the request sent) we must conclude that these requests as captured by Firefox and Chrome are not true (and this warrants a bug report to each of these browsers, to pursue at leisure). Because if Firefox sends the request cloudflare processes it, and if curl sends it cloudflare rejects it with a 429. If curl could reproduce exactly what Firefox sends then cloudflare could not tell the difference and would have to log us in. Because it can tell the difference, something in the request report delivered by these browsers is broken.

Diagnosing what is difficult. One way is to watch the transaction in Wireshark. But no, that does not work with https; it is, after all, secure. There are apparently methods: https://unit42.paloaltonetworks.com/wireshark-tutorial-decrypting-https-traffic/ but they will take time to study and research and I am not hopeful that they will yield fruit.

Then there is Fiddler. It also has problems of course with https, but is more hopeful and I started working with it some while ago, but it is time consuming and I have faltered.

And then it struck me, it would be powerfully useful if we had a request mirror. A service that we could send the request to, just replacing the URL of the POST request with the mirror's, and that reveals the exact request, byte for byte, that was received. We could redirect the Firefox login request to it to diagnose, and then the curl one, and see where they differ. They must differ.

I requested information about such services here:

https://stackoverflow.com/questions/67503099/is-there-a-web-service-on-line-or-locally-hostable-that-will-simply-reflect-th

but someone (for whom I have little respect as a consequence) downvoted it and the question was hidden. If it's possible to go upvote it, please do, and we can try to tap into the Stack Overflow community for wisdom in this space.

I have had more luck here:

https://github.com/xnbox/DeepfakeHTTP/issues/2

and DeepfakeHTTP is something to try, but again will take time to install and use to perform these tests. This is perhaps the hottest lead so far though.

Another avenue for possible support is cloudflare themselves. Searching on line with "cloudflare 429 errors" reveals a lot to read and lots of people facing these messages in many contexts. Sifting information from this haystack is again a time consuming job.

Getting help from degoo themselves is also an option but they do not generally respond to support requests and have not been supportive of these efforts or forthcoming with any API specifications or even hints.

At which point I invite anyone eager to help this project along to select one of these research paths and report back. I too will do that but as you will have observed I have basically the standard FOSS problem: a day job, children to feed, dozens of projects and commitments and a lot more going on to boot, so it can be rather difficult to get my attention on any one project with months between.

ttonin33 commented 2 years ago

This problem can easily be worked around: https://github.com/MDKPredator/degoo_drive#login-bypass

By manually inserting the token I managed to login, but I only succeeded with "degoo_user" and with other commands I get back "400", which is probably due to my assembled "keys.json" file and the other missing.

After some testing and copying of "default_properties.txt" into the config folder I was able to use this tool and avoid the 403 error.

I would like to have a function that creates these files automatically so that you only have to copy the token.

When uploading, no download link was generated, but I was able to fix this with: https://github.com/bernd-wechner/Degoo/pull/41/commits/3930b9331be7195bc4d86017e9f69828b6591dea

TheTechRobo commented 2 years ago

Have you tried Firefox's "copy as cURL" right-click option in the Network tab?

As in: Right click on a request, hover over "Copy", then click "Copy as cURL".

bernd-wechner commented 2 years ago

Precisely what I do, and my recommendation.

TheTechRobo commented 2 years ago

Ah, sorry. Didn't see that anywhere.

buzzy commented 2 years ago

To see exactly what a browser is sending, why would you not just create a simple socket server and print the request? Super simple. Something like: https://gist.github.com/pedrominicz/b699dec01b4afd38b8aba83f3089e175

bernd-wechner commented 2 years ago

Sure thing, thanks for the tip and link. That said, confidence regarding what the browser sends and receives is not the issue at hand as browsers provide kick ass excellent and trustworthy dev tools built in for that (though confirming it with an echo server is still a good double check), but knowing exactly what the Python libs send is the trick (and there, a good reliable echo server would help indeed).

As the task, in reverse engineering anything on the web is to:

  1. see exactly what the web browser sends and receives
  2. send exactly the same thing and (hopefully) receive the same response

A strong premise stands that all the server sees is what is sent in the request. And so if the request can be copied exactly then the server must respond in exactly the same way. Or, conversely, if the server is not responding in the same way we must assume we are not sending an identical request! But given that is the intent and that is what the code is written to do, the assumption becomes that the Python libs, between assembling the request and its being sent, alter it in some way.

And this is what remains to be diagnosed. But hey, if it's so simple and you have time, chip in and set up an echo server, and document what the browser sends for a login request and what this script sends and the differences. It will help us all along.
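
A minimal sketch of such an echo server, using only the Python standard library, might look like the following. The function names and port 8080 are illustrative choices, not part of this project. It speaks plain HTTP/1.1 only, so it can show header order and body bytes as received, but it cannot show anything about a TLS handshake or HTTP/2 framing.

```python
import socket


def read_raw_request(conn: socket.socket) -> bytes:
    """Read the header block, then as many body bytes as Content-Length announces."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            return data  # peer closed before finishing the headers
        data += chunk
    head, _, body = data.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    while len(body) < length:
        chunk = conn.recv(4096)
        if not chunk:
            break
        body += chunk
    return head + b"\r\n\r\n" + body


def serve_once(server: socket.socket) -> bytes:
    """Accept one connection, capture the request bytes, and answer with an empty 200."""
    conn, _ = server.accept()
    with conn:
        raw = read_raw_request(conn)
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")
    return raw


def main(port: int = 8080) -> None:
    """Print every request received on localhost until interrupted."""
    with socket.create_server(("127.0.0.1", port)) as server:
        while True:
            print(serve_once(server).decode("latin-1"))
            print("-" * 60)
```

Calling `main()` and then pointing first the browser's request and then the script's request at `http://127.0.0.1:8080/login` gives two dumps that can be compared line by line.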

buzzy commented 2 years ago

No sorry. I am building my own FUSE filesystem for Degoo in GoLang instead. I will publish it for free when it's done.

bernd-wechner commented 2 years ago

Wish you'd show us how to do that in Python.

buzzy commented 2 years ago

Well, one hot tip is to look at TLS. As an example, you are sending a user-agent as "Chrome version X". Cloudflare could easily have a list of TLS ciphers supported by that version. If the TLS ciphers offered in the SSL handshake differ from that list, you are obviously a "fake" Chrome client :)

"During the initial handshake sequence between the client and the server, the client presents its cipher list to the server, and the server selects one of the ciphers. Or, if the server does not support any of those ciphers, the server rejects the client request."

https://cabulous.medium.com/tls-1-2-andtls-1-3-handshake-walkthrough-4cfd0a798164

Here is even a proof of concept in Python for you: https://github.com/salesforce/ja3

"The JA3 algorithm takes a collection of settings from the SSL "Client Hello" such as SSL/TLS version, accepted cipher suites, list of extensions, accepted elliptic curves, and elliptic curve formats."

Your current Python HTTP requests are easily identified and Degoo has obviously put a rate limit of 0 requests on requests made with a Python SSL fingerprint for the "/login" URI.

bernd-wechner commented 2 years ago

Will read that now closely as time permits, but thanks for the tip. Have you solved this login problem perchance in Go?

bernd-wechner commented 2 years ago

Had a quick look. It seems indeed that one state of the art client differentiation rests on TLS handshake analysis. Ouch.

The JA3 algorithm, for example, examines these fields in the Client Hello message:

SSLVersion,Cipher,SSLExtension,EllipticCurve,EllipticCurvePointFormat

Egads. So the Python request would ideally need to mimic Firefox's not just at the HTTP level but at the TLS handshake level too (duplicating the Client Hello). Leaves me to wonder if that's easy to do with the Python libs. Something to drill into.
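
To make the fingerprint concrete, here is a small sketch of how a JA3 string and hash are formed from those five fields: the decimal values within each field are joined with "-", the fields are joined with ",", and the result is MD5-hashed. The function names are hypothetical and the numbers in the usage example are illustrative only, not a real browser's fingerprint.

```python
import hashlib

# GREASE placeholder values (0x0A0A, 0x1A1A, ... 0xFAFA) are ignored by JA3,
# since browsers insert them at random and they would make the hash unstable.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string from the five Client Hello fields."""
    parts = [str(version)]
    for values in (ciphers, extensions, curves, point_formats):
        parts.append("-".join(str(v) for v in values if v not in GREASE))
    return ",".join(parts)


def ja3_hash(ja3: str) -> str:
    """The JA3 fingerprint is the MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Illustrative values only: 771 is TLS 1.2 (0x0303) as a decimal number.
example = ja3_string(771, [0x0A0A, 4865, 4866], [0, 11, 10], [29, 23], [0])
print(example)
print(ja3_hash(example))
```

Because the order of the values is part of the string, a client that offers the same cipher suites as a browser but in a different order still produces a different hash.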

bernd-wechner commented 2 years ago

I had a closer look in recent days using the HTTP Header Live addon:

https://github.com/Nitrama/HTTP-Header-Live/

and an oddity I note too is that the Firefox request orders the headers in a way that the Python does not:

Host: rest-api.degoo.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://app.degoo.com/
Content-Type: application/json
Origin: https://app.degoo.com
Content-Length: 82
Connection: keep-alive
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site

Notably, the Content-Length: header is always at the end just preceding the data in the Python requests, raising the spectre that header ordering may also be analysed to differentiate the clients.
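
A small sketch for checking this kind of difference: given two captured header blocks (headers only, without the request line), list the header names in order and report the first position where the order diverges. The function names are hypothetical, and the second block in the usage example is a made-up stand-in for what a Python client might send, not a real capture.

```python
def header_names(raw_head: str) -> list:
    """Return the header names of a raw header block, lower-cased, in order."""
    names = []
    for line in raw_head.strip().splitlines():
        name, sep, _ = line.partition(":")
        if sep:  # ignore lines that are not "Name: value"
            names.append(name.strip().lower())
    return names


def first_order_difference(a: list, b: list):
    """Return (index, name_in_a, name_in_b) at the first mismatch, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i, x, y
    return None


browser = """Host: rest-api.degoo.com
Content-Type: application/json
Content-Length: 82
Connection: keep-alive"""

# Hypothetical client capture with Content-Length moved to the end.
client = """Host: rest-api.degoo.com
Content-Type: application/json
Connection: keep-alive
Content-Length: 82"""

print(first_order_difference(header_names(browser), header_names(client)))
```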

buzzy commented 2 years ago

Had a quick look. It seems indeed that one state of the art client differentiation rests on TLS handshake analysis. Ouch.

The JA3 algorithm, for example, examines these fields in the Client Hello message:

SSLVersion,Cipher,SSLExtension,EllipticCurve,EllipticCurvePointFormat

Egads. So the Python request would ideally need to mimic Firefox's not just at the HTTP level but at the TLS handshake level too (duplicating the Client Hello). Leaves me to wonder if that's easy to do with the Python libs. Something to drill into.

Yes. Not very hard to do though, as there are several Python libraries to "mimic" a known fingerprint. All you have to do is to use a valid user-agent + fingerprint that matches. The lib will make sure to construct the TLS handshake so that the result becomes the wanted fingerprint.

bernd-wechner commented 2 years ago

Cool, what are the Python libs (packages)? Can you name or link to some? Would save us a bucket of time (over searching for candidates unless lucky with the first few keyword choices).

I'd love to try one.

buzzy commented 2 years ago

Cool, what are the Python libs (packages)? Can you name or link to some? Would save us a bucket of time (over searching for candidates unless lucky with the first few keyword choices).

I'd love to try one.

Dude, 5 seconds Googling :) I don't use Python, so I wouldn't know. Not sure if this one is complete. I use Go myself. https://geeksrepos.com/an0ndev/requests-ja3

The most annoying thing is that Degoo are only doing this check on /login (which is a REST API) and not on the regular GraphQL API. Also, this trips when using some proxies, so very strange move by Degoo.

bernd-wechner commented 2 years ago

Nothing is 5 seconds googling my friend... let alone finding esoteric python packages for mimicking tls handshakes. I was only asking because you said there were several and so may know and be able to name them. I'll take a look when I'm off the phone and back on a PC. And see how much Googling it takes 😉

buzzy commented 2 years ago

My bad. I don't do Python. I assumed, as there are many for Go :)

bernd-wechner commented 2 years ago

No worries. I'm grateful for your insights and tips!

I have had a spare moment on a laptop just now and found these which are interesting:

https://github.com/lwthiker/curl-impersonate

Which means I can try to reproduce the login with curl to see if the theory of handshake fingerprinting to differentiate clients is the actual cause for the rejection of our login efforts. curl-impersonate mimics browser TLS handshakes, so if I can log in with it when I get a chance to try, then we have proof of this theory!

https://github.com/tlsfuzzer/tlslite-ng

Which promises to provide nuanced TLS handshake control in Python. Just based on a cursory reading so far.

More updates as I check these threads, and thanks heaps for the lead.

buzzy commented 2 years ago

You don't need to confirm it. I already did. It's easy with docker. You get a 302 instead of a 429 response.

docker pull lwthiker/curl-impersonate:0.5-chrome
docker run --rm lwthiker/curl-impersonate:0.5-chrome curl_chrome101 'https://degoo.com/me/login' \
  -H 'sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="100", "Google Chrome";v="100"' \
  -H 'Referer: https://app.degoo.com/' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36' \
  -H 'sec-ch-ua-platform: "Linux"' \
  -H 'Content-Type: application/json' \
  --data-raw '{"Username":"myemail_redacted","Password":"mypassword_redacted","GenerateToken":true}' \
  --compressed

bernd-wechner commented 2 years ago

Cool. Thanks for that. Why docker out of interest? What value does it add there? I'd have just run curl-impersonate myself.

The 302 is the redirect to the page after logging in I imagine. Let's see if I can find a python package that does what curl-impersonate does. tlslite-ng sort of suggests it's doable building on the sockets package.

buzzy commented 2 years ago

Because the docker one does not require any installation of extra libs etc. I don't like to bloat my system with random hacks :) If you use libcurl with Python, you can just switch it out to the patched one and it should "just work". But forcing people to run a bundled libcurl is probably not a good idea.

And yes, the 302 is just the redirection to /app, so it means success.

bernd-wechner commented 2 years ago

There may be a way to do it without libcurl:

https://www.scrapingbee.com/blog/web-scraping-without-getting-blocked/#tls-fingerprinting

which directs to:

https://hussainaliakbar.github.io/restricting-tls-version-and-cipher-suites-in-python-requests-and-testing-with-wireshark/

Which will take some reading, thinking and tinkering. All been more than 5 seconds of googling by a long way, but I'm still looking. Turns out we are far from alone, there is a busy world of people trying to mimic browser TLS fingerprints for programmatic access to websites ... it's a classic arms race situation.

Looks to me like it's cloudflare not degoo who add that layer of security too BTW. Not convinced Degoo, as a business, cares. But cloudflare are a much bigger player with many many customers who have a strong interest in security ...

buzzy commented 2 years ago

Of course it's Cloudflare that blocks, but it's Degoo that has enabled the "Browser Integrity Check" setting in Cloudflare. This comes with an extra fee as it requires more compute power for each request, so Degoo definitely knows what they are doing and are fully responsible for this. Remember, if they enable this for the GraphQL API as well, your entire application will stop working, not just the login.

It would have taken you 5 seconds for Go :) It's just a lib and 1 line of extra code. Python just sucks, like always :P

bernd-wechner commented 2 years ago

Python doesn't suck, it's lovely. But thanks for your tips and I hope the reverse engineering I did here helps you with your project. I look forward to seeing the results and getting a FUSE filesystem for degoo. I'll see if I can get the TLS handshake spoofed somehow. This is nice too:

https://scrapfly.io/blog/how-to-avoid-web-scraping-blocking-tls/

And suggests two Python methods for fixing it. Without, alas, providing a clear example of how simply to spoof say Firefox or Chrome like curl-impersonate does.

buzzy commented 2 years ago

Just an update that basic file listing works in my FUSE FS now, so I can confirm that JA3 fingerprinting solves the login problem. I think you will have a hard time making it work in Python though, as it does not have the same access to the network stack as GoLang has. Hopefully it's enough just to change the ciphers to fool Cloudflare.

zioalex commented 2 years ago

Hi all, what a big thread here... Unfortunately I see that the login still doesn't work. Any update on that front?

bernd-wechner commented 2 years ago

Alas no, as noted we now know why, it's in the TLS handshake and we've not had time to find a Python solution for mimicking a browser's TLS handshake yet. One will emerge in time I am sure as this is a relatively young security protocol that is getting in lots of people's way alas.

As noted earlier there is an outline of how to fix it here:

https://scrapfly.io/blog/how-to-avoid-web-scraping-blocking-tls/#python

But alas no clear guidance on how to replicate the JA3 fingerprint of common browsers. So it needs more time than I've been able to give it to evolve that. But it is eminently doable it seems.

bernd-wechner commented 2 years ago

OK, had a moment this morning as nasty rain prevented me doing other things I had planned, so took a look. It's not looking good. The problem is stated simply as follows:

  1. There's a JA3 fingerprint of the TLS handshake that is being used to identify the client securely.
  2. Python has ways for defining the TLS handshake and a great outline is here: https://scrapfly.io/blog/how-to-avoid-web-scraping-blocking-tls/#python
  3. Unfortunately the JA3 fingerprint is a function of the TLS handshake and that includes the selected cipher suites, but TLS 1.3 context definition is not well supported in the Python ssl library yet: https://docs.python.org/3/library/ssl.html#tls-1-3
  4. In short, until Python's ssl library allows us to configure the TLS 1.3 handshake context freely, this may be a stuck problem.
  5. This will be affecting any and every Python based tool connecting over TLS 1.3 to an endpoint that implements JA3 fingerprint screening to exclude bots alas. The upside being that this might mean a Python solution emerges in the coming year. We shall see.
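
The limitation in point 3 can be seen with a short standard-library sketch. The cipher string "ECDHE+AESGCM" and the function names are arbitrary illustrative choices. `SSLContext.set_ciphers()` narrows the suites offered for TLS 1.2 and below, but, as the linked ssl documentation notes, it cannot enable or disable TLS 1.3 suites, which `get_ciphers()` still reports.

```python
import ssl


def restricted_context(cipher_string: str = "ECDHE+AESGCM") -> ssl.SSLContext:
    """Create a client context whose TLS 1.2 (and older) suites are restricted."""
    ctx = ssl.create_default_context()
    ctx.set_ciphers(cipher_string)  # has no effect on TLS 1.3 suites
    return ctx


def suites_by_protocol(ctx: ssl.SSLContext) -> dict:
    """Group the context's enabled cipher suite names by protocol version."""
    grouped = {}
    for cipher in ctx.get_ciphers():
        grouped.setdefault(cipher["protocol"], []).append(cipher["name"])
    return grouped


grouped = suites_by_protocol(restricted_context())
for protocol, names in sorted(grouped.items()):
    print(protocol, names)
```

The exact output depends on the OpenSSL build Python is linked against. Whatever appears under TLSv1.3 is the part of the Client Hello that this API leaves out of our hands.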

zioalex commented 2 years ago

Hi @bernd-wechner thanks for the follow up and clarification. It is a pity that we cannot use the cli for the time being. This is shrinking the degoo usability from my perspective.

buzzy commented 2 years ago

Hi @bernd-wechner thanks for the follow up and clarification. It is a pity that we cannot use the cli for the time being. This is shrinking the degoo usability from my perspective.

Don't worry. My degoo fuse filesystem is pretty much done. I just want to do some more testing before releasing it. Maybe even this weekend.

bernd-wechner commented 2 years ago

You're a champ, buzzy. Thanks so much! Is it up on a github repo?

ttonin33 commented 2 years ago

As I previously posted, there is a workaround using the browser to get the login tokens: https://github.com/bernd-wechner/Degoo/issues/42#issuecomment-1100919703

buzzy commented 2 years ago

But tokens keep expiring after a while, so it's just a temporary "hack".

nichlasjo commented 2 years ago

Could the login problem be fixed by mimicking a full browser with Selenium, like in this project? https://github.com/tzchz/degoo-active

bernd-wechner commented 10 months ago

This should now be fixed by:

https://github.com/bernd-wechner/Degoo/commit/bc92782790771faacd506e4d110622633bc120b6

though the whole package remains a WIP (work in progress) and under testing. Not least as the degoo API changed in the interim and I've also made some updates here, but am not done with testing and it's one of far too many pies I have my fingers in and tabs I have open ;-).