Closed tleydxdy closed 4 years ago
How do you know that the QUIC workaround is banned by Google? I agree that it may fix #917, but if the workaround is still working, why revert to a state where almost every Invidious instance would stop working because of the Google captcha?
Using https://anti-captcha.com will solve the captcha problem.
@unixfox because people are being banned?
@tleydxdy I haven't experienced any ban at all (my instance is yewtu.be). The only thing I did was block all the API endpoints except for the ones that the web interface uses. This most likely reduced the number of requests to Google, but I haven't experienced any ban yet. I'm pretty sure that if Invidious wasn't using the QUIC workaround, my instance would have been banned a long time ago (based on my experience dealing with reCAPTCHA on Searx).
Hi sir, may I know how to block all the API endpoints except the ones the web interface uses, as you described? I want to implement it on my instance too. Please help.
I just used the status parameter of Caddy like this:
status 403 {
    /api/v1/videos
    /api/v1/channels
    /api/v1/search
    /api/v1/mixes
}
I use Debian 10. Can you advise which file I should edit?
If you installed the Caddy web server with this script: https://github.com/sayem314/Caddy-Web-Server-Installer then it's in /etc/Caddyfile.
OK sir, I will try it. Thanks a lot for the help! 👍
There are two different kinds of CAPTCHAs:
The first is similar to one of the reported errors in TeamNewPipe/NewPipe#2924, and looks like this:
(For reference, the "submit" button makes a POST request to https://www.youtube.com/das_captcha, with the result of the CAPTCHA as "g-captcha-response" IIRC.)
After a successful POST, YouTube returns a new cookie goojf that the client can then use for subsequent requests.
The second one is more generic and looks like this:
After a successful POST (to https://www.google.com/sorry/index...) you receive a GOOGLE_ABUSE_EXEMPTION cookie that is valid for around 6 hours (the cookie itself has an expires value or similar that you can use).
The goojf cookie provided by the first does not consistently prevent future captchas, and is not practical to bypass using something like anti-captcha (see #886). This captcha is completely bypassed when using QUIC. This is also why you will never see this type of CAPTCHA when using Chrome (except on first load), since all subsequent requests use QUIC.
The GOOGLE_ABUSE_EXEMPTION cookie will consistently prevent captchas from appearing until it expires. This is the captcha that is actually being bypassed when using anti-captcha.
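To make the cookie handling above concrete, here is a minimal sketch (not Invidious's actual code) that picks the GOOGLE_ABUSE_EXEMPTION cookie out of raw Set-Cookie header values using the JDK's java.net.HttpCookie parser. The class name, the helper name and the sample header value are all hypothetical; a real cookie value and its expiry attributes may look different.

```java
import java.net.HttpCookie;
import java.util.List;
import java.util.Optional;

final class AbuseExemptionCookie {
    // Cookie name taken from the thread; everything else here is illustrative.
    static final String NAME = "GOOGLE_ABUSE_EXEMPTION";

    /** Returns the exemption cookie found in raw Set-Cookie header values, if any. */
    static Optional<HttpCookie> find(List<String> setCookieHeaders) {
        for (String header : setCookieHeaders) {
            // HttpCookie.parse throws IllegalArgumentException on malformed input.
            for (HttpCookie cookie : HttpCookie.parse(header)) {
                if (NAME.equals(cookie.getName())) {
                    return Optional.of(cookie);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical header: 21600 seconds is the "around 6 hours" from the thread.
        List<String> headers = List.of(
                "GOOGLE_ABUSE_EXEMPTION=ID=abc:TM=1; Max-Age=21600; Path=/");
        find(headers).ifPresent(c ->
                System.out.println(c.getValue() + " (max age " + c.getMaxAge() + "s)"));
    }
}
```

A caller could store the cookie together with its max age and request a new captcha solution only once it has expired.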
@omarroth Do you plan to support the cookie GOOGLE_ABUSE_EXEMPTION for anti-recaptcha? My instance is not blocked for viewing videos, only for channels.
When Invidious fetches the channel info, it gets the second type of block that you explained, with "/sorry/index".
Thus, the automatic captcha solving doesn't work, because Invidious doesn't check whether the instance is partially blocked, for example only for fetching channels.
Do you plan to support the cookie GOOGLE_ABUSE_EXEMPTION for anti-recaptcha?
This is the only cookie that is currently supported.
For clarification, what does e.g.
$ curl -sD - -o /dev/null 'https://www.youtube.com/browse_ajax?continuation=4qmFsgI8EhhVQ2EzamdoSUxCa3BiTW03bnBoeGlCcUEaIEVnWjJhV1JsYjNNd0FqZ0JZQUZxQUxnQkFDQUFlZ0V4&gl=US&hl=en'
return for you? (You may also need to specify curl -4 or curl -6.)
That's strange, because the automatic anti-recaptcha never activates itself. I thought the anti-recaptcha was only designed for watching videos, according to the source code: https://github.com/omarroth/invidious/blob/master/src/invidious/helpers/jobs.cr#L239
I'm on my phone, but the curl command should return the same second page with "our systems have detected...".
Every time I fetch a channel I get a JSON::ParseException like the one described in #963.
My bad, you are right @omarroth, it does indeed support the cookie GOOGLE_ABUSE_EXEMPTION.
But as you can see, it only checks whether the instance is blocked for video loading: https://github.com/omarroth/invidious/blob/master/src/invidious/helpers/jobs.cr#L239.
I modified the URL to /browse_ajax?continuation=4qmFsgI8EhhVQ2EzamdoSUxCa3BiTW03bnBoeGlCcUEaIEVnWjJhV1JsYjNNd0FqZ0JZQUZxQUxnQkFDQUFlZ0V4&gl=US&hl=en and the anticaptcha worked.
Can you add that new URL in the source code, or come up with a way to detect whether a request that Invidious makes is redirected to /sorry/index and then trigger the bypass_captcha function?
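The detection asked for above could look roughly like the following sketch. It is written in Java for illustration (Invidious itself is written in Crystal), it is not the project's implementation, and the class and method names are hypothetical. It assumes the block shows up either as a 3xx response whose Location points at www.google.com/sorry/index, or as the 429 status that the block page itself has been reported to return.

```java
import java.net.URI;

final class SorryRedirectDetector {
    /** True when a 3xx response points at Google's /sorry/index captcha page. */
    static boolean isCaptchaRedirect(int status, String location) {
        if (status < 300 || status >= 400 || location == null) {
            return false;
        }
        try {
            URI target = URI.create(location);
            String host = target.getHost();
            String path = target.getPath();
            // A relative Location has no host and is treated as "not a captcha redirect".
            return "www.google.com".equals(host)
                    && path != null
                    && path.startsWith("/sorry/index");
        } catch (IllegalArgumentException e) {
            // Unparseable Location header: not recognised as a captcha redirect.
            return false;
        }
    }

    /** Broader check: also treats HTTP 429 as a sign of being blocked. */
    static boolean looksBlocked(int status, String location) {
        return status == 429 || isCaptchaRedirect(status, location);
    }

    public static void main(String[] args) {
        System.out.println(looksBlocked(302, "https://www.google.com/sorry/index?continue=x"));
    }
}
```

Running a check like this on every upstream response, rather than on one fixed URL, would also catch instances that are only partially blocked.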
I see similar behavior, but with video information such as comments, likes, etc.
After a successful POST (to https://www.google.com/sorry/index...) you receive a GOOGLE_ABUSE_EXEMPTION cookie that is valid for around 6 hours (the cookie itself has an expires value or similar that you can use).
I'm trying to implement anti-captcha for NewPipe. Currently I receive the second type of captcha, "https://www.google.com/sorry/index...", and I try to make a POST with 3 params: "q", "continue" and "g-recaptcha-response", but I never receive the GOOGLE_ABUSE_EXEMPTION cookie or any redirect URL. What am I doing wrong?
What's the error message given by Google? Also what's the status code when doing a request? If it's a 400 status code then there is something wrong in your code.
It stays on the same page with the same URL "https://www.google.com/sorry/index.." and a 429 status code, as if I hadn't posted at all.
Is your request a POST request?
Also, is your request body converted from query strings, and does it have a Content-Type header of application/x-www-form-urlencoded?
It is also preferred to specify the referrer.
Here is an example of a body made by a browser:
g-recaptcha-response: 03AERD8Xp5eQ8xX4nwTMr3_8OzfFyoU4IDcMW6ealj6gUNVsCSmB2AlZDuXtKkjIoCICyO5ZBK_mFfGKaXOjGqkHNvVkXhHmAPNCsU2FRip2hweFGYSVrgRzVRyeVKStSFM5WkLfxMXlp_2L-Liu6JCPo_LS_-0yJqA1zyAN6diQRyqEduU7qp6Lo0MhciuTj0SlAxzV2WDaIgubS_pd9x8gqfsCa6rEJ2y8tVyD-m_k1TJmcrUQlpsuRMnRfsM2BFggApYZ8TGTC5y-breO3IlnMsxKMa9-g6jt3IBVHE3BZ8mMcdTdp1A0En7_fkeZvpUM7BKTtwVu9Y4fc-9G5aeDRp6D8RseAN-rEng9S6lA_g91EhGqaaw33vZt4S0HQMbMqVeCoVCrdGtpevIUrEfjSrv7RjSUVC8WQzRmwAc4R4KDIqC_DQ_tGf5dBpY9HMihJvhP-twAdRTPWsDUDlrirpdL19bWimHg
q: EhAgAQZ8JmAEJQABAAAAAAmCGLqqo_MFIhkA8aeDS6ASm_qRFdynMgfJqm_jtxy0t4GDMgFy
continue: https://www.google.com/search?q=test
The best way to know if your request is correct or incorrect is to use a proxy like mitmproxy and compare your request with a request made in a browser.
EDIT: Here is an example code from one of my project: https://github.com/unixfox/proxy-sorry-google-recaptcha/blob/master/anticaptcha.js#L53. I hope this will help you.
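As a rough illustration of what an application/x-www-form-urlencoded body for these three fields looks like, here is a small sketch using only the JDK. The class and method names are hypothetical, and the field values in main are placeholders rather than real tokens.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class CaptchaFormBody {
    /** Encodes fields as application/x-www-form-urlencoded, keeping map order. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "="
                    + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Placeholder values; the real ones come from the captcha page and the solver.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("g-recaptcha-response", "TOKEN_FROM_SOLVER");
        fields.put("q", "VALUE_OF_Q_FROM_PAGE");
        fields.put("continue", "https://www.google.com/search?q=test");
        System.out.println(encode(fields));
    }
}
```

Comparing the printed string with the body captured from a browser (for example through mitmproxy) is a quick way to spot a missing or wrongly encoded field.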
Yes, I do POST with okhttp3.
FormBody.Builder formBodyBuilder = new FormBody.Builder();
for (Map.Entry<String, String> entry : mCaptchaInputs.entrySet()) {
    formBodyBuilder.add(entry.getKey(), entry.getValue());
}
okhttp3.Request request = new okhttp3.Request.Builder()
        .url(mCaptchaPostUrl)
        .addHeader("User-Agent", USER_AGENT)
        .addHeader("Accept-Language", "en-GB, en;q=0.9")
        .addHeader("Content-Type", "application/x-www-form-urlencoded")
        .addHeader("X-YouTube-Client-Name", "1")
        .addHeader("X-YouTube-Client-Version", "2.20200214.04.00")
        .post(formBodyBuilder.build())
        .build();
okhttp3.Response response = client.newCall(request).execute();
I can confirm that the "q" value I post is the same as the one in the page.
I see omarroth closes the previous connection just before the POST. I parse the page, close the connection, wait for the captcha task and then POST. Could that be the reason?
LOL. In case anybody needs this: it was OkHttp's automatic redirect handling. The response carried a cookie and a redirect to the original URL. A regular browser would set the cookie and then follow the redirect, but my client followed the redirect without setting the cookie, so I was redirected again to a new captcha page.
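One way to avoid this, sketched here with the JDK's own java.net.http client rather than OkHttp, is to not follow redirects for the captcha POST and to read the Set-Cookie headers from the redirect response directly. OkHttp's client builder has a followRedirects(false) option that should serve the same purpose. The class name, the helper names and the example.invalid URL below are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

final class NoRedirectPost {
    /** Client that never follows redirects (made explicit here for clarity). */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
    }

    /** Builds a form POST; encodedBody must already be URL-encoded. */
    static HttpRequest formPost(String url, String encodedBody) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodedBody))
                .build();
    }

    /** Raw Set-Cookie values from the un-followed response, e.g. the 3xx itself. */
    static List<String> setCookies(HttpResponse<?> response) {
        return response.headers().allValues("Set-Cookie");
    }

    public static void main(String[] args) {
        // Only builds the request; sending it would be
        // newClient().send(request, HttpResponse.BodyHandlers.discarding()).
        HttpRequest request = formPost("https://example.invalid/captcha", "q=abc");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

With redirects disabled, the caller sees the redirect response itself, can store its cookies, and can then request the original URL again with the cookie attached.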
I had the same issue with got; that's why I had to set methodRewriting to false here: https://github.com/unixfox/proxy-sorry-google-recaptcha/blob/master/anticaptcha.js#L68.
Oh. I wish I understood JS well. In any case, thank you for the help!
Now that using QUIC will be banned too, I think we should go back to using HTTP, as it means fewer dependencies and is less fragile (on Alpine at least). Or offer it as a build option?