google / recaptcha

PHP client library for reCAPTCHA, a free service to protect your website from spam and abuse.
http://www.google.com/recaptcha/
BSD 3-Clause "New" or "Revised" License

Create proxy handlers for China. #87

Closed pauldotknopf closed 6 years ago

pauldotknopf commented 8 years ago

I don't know where to start this discussion so I will start it here.

China has blocked reCAPTCHA. The service is awesome, so I'd like to attempt to work around this.

My idea would be this.

  1. All network requests are proxied through a server handler. "http://mydomain.com/recaptcha/http://google.com/recaptcha/image.jpeg" and "http://mydomain.com/recaptcha/http://google.com/recaptcha/script.js" would be proxied and served as if the content came from "mydomain.com".
  2. There may need to be some "live" modification of the scripts and styles returned from the proxy so that all URLs and domains have "http://mydomain.com/recaptcha" prepended to them.

What do you guys think? I haven't tried it yet, but does anyone know if there would be any hangups with this approach?
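
To make point 1 concrete, here is a rough Python sketch of the URL mapping only, not a working proxy. The prefix, the function names and the host allow-list are assumptions for illustration; the allow-list is there so the handler cannot be abused as an open proxy.

```python
from urllib.parse import urlsplit

# Hypothetical prefix under which the proxy handler is mounted.
PROXY_PREFIX = "http://mydomain.com/recaptcha/"

# Hosts the handler is willing to fetch from (assumed list); anything
# else is refused so the endpoint cannot be used as an open proxy.
ALLOWED_HOSTS = {"www.google.com", "www.gstatic.com"}


def to_proxy_url(upstream_url: str) -> str:
    """Wrap an upstream URL so the browser requests it through the proxy."""
    return PROXY_PREFIX + upstream_url


def to_upstream_url(proxy_url: str) -> str:
    """Recover the upstream URL from a proxied one, rejecting unknown hosts."""
    if not proxy_url.startswith(PROXY_PREFIX):
        raise ValueError("not a proxied URL")
    upstream = proxy_url[len(PROXY_PREFIX):]
    host = urlsplit(upstream).hostname
    if host not in ALLOWED_HOSTS:
        raise ValueError(f"host not allowed: {host}")
    return upstream
```

Point 2 would then amount to calling something like to_proxy_url on every URL found in the returned scripts and styles.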

paragonie-scott commented 8 years ago

#93 should help here.

zypA13510 commented 7 years ago

Hello, for anyone who is still interested, I have made an Apache configuration that sets up a reverse proxy for reCAPTCHA using your own server under your own domain, yourdomain.com.

yourdomain.com/recaptcha -> www.google.com/recaptcha
static.yourdomain.com -> www.gstatic.com

Edit: moved to gist for easier maintenance https://gist.github.com/zypA13510/fc3669a4c6957f3593c6ebed76d1d433

tabjy commented 7 years ago

@zypA13510

What did you do to https://www.google.com/recaptcha/api.js and https://www.gstatic.com/recaptcha/api2/r20170206171236/recaptcha__en_gb.js?

I tried to modify the hostnames in those files, but the browser ends up sending requests to https://www.google.com/recaptcha/api2/userverify all the time...

Thanks in advance

zypA13510 commented 7 years ago

@Tabjy Short answer: AddOutputFilterByType and Substitute. For details, refer to the Apache documentation. Make sure you have enabled the related modules in your Apache configuration file. If the setup is successful, you shouldn't see any requests to www.google.com or www.gstatic.com in your network requests. Feel free to ask if you need more help.

barene commented 7 years ago

unsubscribe


tabjy commented 7 years ago

@zypA13510 I'm not using Apache here but node.js. I did something similar to substitute all of Google's domains with mine. It turns out I forgot to proxy /recaptcha/api2/anchor. However, after fixing this issue, I got the following error from the browser:

Uncaught DOMException: Failed to construct 'Worker': Script at 'https://www.google.com/recaptcha/api2/webworker.js?hl=en&v=r20170213115309' cannot be accessed from origin 'http://localhost:3000'.

I'm absolutely sure that I substituted all domains in /recaptcha/api2/anchor. So I did some research and found out that "Google implemented a whole VM in JavaScript with a specific bytecode language" according to neuroradiology. Maybe the domain is embedded in that bytecode? Since this is way beyond my knowledge, I'll have to give up on this. Anyway, thanks for your help.

zypA13510 commented 7 years ago

@Tabjy

  1. Sorry, there're some minor issues in the original code I provided. I have updated the code now (tested and working). Please update your code accordingly and test again.
  2. The URL in the bytecode stream is not critical according to my research. The recaptcha still seems to work even if one of the requests (to www.google.com) failed.
  3. Make sure the recaptcha__<language_code>.js served is properly filtered. Take a look at the response in your browser, you should have replaced all www.google.com with yourdomain.com and www.gstatic.com with static.yourdomain.com. Searching those two strings should yield no result.
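
The check in point 3 can also be scripted. Below is a minimal Python sketch, assuming you have saved the response body of the filtered script as text; the function name is made up for illustration.

```python
# Upstream hostnames that should no longer appear after filtering.
UPSTREAM_HOSTS = ("www.google.com", "www.gstatic.com")


def leftover_hosts(body: str) -> dict:
    """Return a count for each upstream hostname still present in body."""
    counts = {host: body.count(host) for host in UPSTREAM_HOSTS}
    # Keep only the hostnames that were actually found.
    return {host: n for host, n in counts.items() if n > 0}
```

An empty dict means the plain-text substitution caught every literal occurrence; it says nothing about URLs that the script assembles at runtime.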

javier-reguillo commented 7 years ago

Hello, How do I do this in IIS? Thanks

DarwinSilva commented 7 years ago

Hi, I'm still trying to get this working on my Apache server, but the reCAPTCHA is not working. Can you point to an example where this is running? Thanks

zypA13510 commented 7 years ago

@DarwinSilva I use this on my Wordpress server with a Recaptcha plugin. And it seems to work in China without being blocked. (One request still points to google.com and failed of course, but Recaptcha works nevertheless)

joinso commented 7 years ago

Hi @zypA13510 !

I followed your steps but it doesn't work. The first call to the page where Google reCAPTCHA is present performs the "SUBSTITUTE" without problems. So

changes to

That's ok.

However, in the call "https://mydomain.com/recaptcha/api.js?hl=es", the SUBSTITUTE doesn't work. I still see "www.gstatic.com" inside.

So it seems that SUBSTITUTE doesn't work on ProxyPass...

Any idea?

Regards, JOINSO

joinso commented 7 years ago

Hi! Solved: misconfiguration on Apache.

Augustin-FL commented 7 years ago

Hello,

I tried to apply @zypA13510's idea; however, I don't have full control over the virtual hosts. I do have mod_rewrite enabled on my Apache, so I wrote a small PHP wrapper and a .htaccess:

.htaccess file

RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /recaptcha/index.php [L] 
#change to RewriteRule . /index.php [L] for static.yourwebsite.com

index.php file

<?php
/*
    simple reverse proxy for google reCaptcha in PHP :
    - if visitor is on "website.com/recaptcha/xxxxx" , then this script get "google.com/recaptcha/xxxx"
    - if visitor is on "static.website.com/xxxx", then this script get "gstatic.com/xxxx"

*/

$proxy="";// set this if the server you are on needs a proxy to reach the internet

// -- STEP 1 : decide if the visitor is on static.website.com or website.com/recaptcha. Also, get the domain name and the uri entered.

$uri=$_SERVER['REQUEST_URI'];
$host=$_SERVER["HTTP_HOST"];

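if(strpos($host,'static.')===0) 
{
    $host=substr($host,strlen('static.'));
    $domain='https://www.gstatic.com'; // no trailing slash: $uri already starts with "/"
}
else
{
    $domain='https://www.google.com'; // no trailing slash: $uri already starts with "/"
}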

// ---- Step 2 : We make the request (with curl)

$curl=curl_init($domain.$uri);
curl_setopt($curl, CURLOPT_PROXY, $proxy);// IMPORTANT : we need to enter proxy
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($curl, CURLOPT_TIMEOUT, 10); 
if(!empty($_SERVER['HTTP_COOKIE'])) curl_setopt($curl, CURLOPT_HTTPHEADER, array("Cookie: ".$_SERVER['HTTP_COOKIE']));
curl_setopt($curl, CURLOPT_HEADERFUNCTION, "curlResponseHeaderCallback");

if(!empty($_POST))// if we received any POST data, we send it with the request
{   
    // http_build_query() URL-encodes keys and values, so "&" or "=" inside the data cannot corrupt the body
    $post = http_build_query($_POST);

    curl_setopt($curl, CURLOPT_POST, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS,$post);
}

$response = curl_exec($curl);

// Step 3 : we get the request and replace all references to google.com by website.com
// and all references to gstatic.com by static.website.com

if($response!==false) // strict check: an empty body is still a valid response
{   
    $response = str_replace("www.google.com", $host,$response);
    $response = str_replace("www.gstatic.com", "static.".$host,$response);
    $response = str_replace("https://", "http://",$response);

    echo $response;
}
else
{
    echo "/*error : could not get recaptcha*/";
}

//step 3 (bis) : we display the content type returned by the request
//and we also replace the cookies
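function curlResponseHeaderCallback($curl, $headerLine)
 {  
    global $host; // $host is set at file scope and is not visible inside the function otherwise
    // header names are case-insensitive (and lowercase over HTTP/2), hence stripos
    if (stripos($headerLine,'Set-Cookie:') === 0)
    {
        $headerLine = str_replace("www.google.com", $host,$headerLine);
        $headerLine = str_replace("www.gstatic.com", "static.".$host,$headerLine);
        header($headerLine, false); // false: keep earlier Set-Cookie headers instead of replacing them
    }
    else if (stripos($headerLine,'Content-Type:') === 0)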
    {
        header($headerLine);
    }

    return strlen($headerLine); // Needed by curl
}

?>

Here is how to use it:

(Yes, you need to copy each file to two locations. You will have 4 files at the end.)

Just to be clear: this is NOT the best way to proxy reCAPTCHA (creating a reverse proxy at the Apache level is clearly better), but sometimes there is no other choice than doing this.

stath715 commented 6 years ago

@zypA13510

It seems that a lot of Google domains are also accessed: www.google.com, www.gstatic.com, support.google.com, developers.google.com, fonts.gstatic.com

I am stuck with https://www.google.com/js/bg/d--b7FVIhvCFHkmSrkgO9rhjbdCimjBfDEqJIwYWYPc.js, initiated by recaptcha__en.js, in which I can't see any "www.google.com" reference.

It's as if that URL was built by a JS function.

Do you have any update about your reverse proxy solution?

Tkx

zypA13510 commented 6 years ago

@stath715

Yes, it is as you said. Requests built from sources other than plain text (e.g. from a binary stream) cannot be detected by SUBSTITUTE. But I tested my solution on a client computer that had never visited Google or used any VPN. Although a few requests failed, it worked nevertheless and I was able to click the right images to get past reCAPTCHA (at least at the time of my previous post). However, I have not tested it recently, so I'm afraid I can't help you further. Sorry.
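
To illustrate why such requests slip through, here is a small Python sketch. The two JavaScript snippets are made-up examples, not taken from the real reCAPTCHA scripts, and the replace call only stands in for what a text substitution filter does.

```python
def substitute(body: str) -> str:
    """Plain text replacement, standing in for a Substitute-style filter."""
    return body.replace("www.google.com", "yourdomain.com")


# Hypothetical script that contains the hostname as a literal string.
literal = 'var u = "https://www.google.com/js/bg/x.js";'

# Hypothetical script that assembles the same hostname at runtime.
assembled = 'var u = "https://" + ["www", "google", "com"].join(".") + "/js/bg/x.js";'

rewritten_literal = substitute(literal)      # hostname is replaced
rewritten_assembled = substitute(assembled)  # nothing matches, text is unchanged
```

The second script still requests www.google.com in the browser, because the hostname never appears in the text the filter sees.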

One thing about the Chinese Great Firewall is, it is ever-changing and evolving. And of course, its behavior is not, and will never be documented. In other words, it will never be easy, trying to grant access to a website that is not meant to be accessible. :wink:

If you find a better solution, you can share it to help more (or I don't mind updating my comment). Good luck.

joinso commented 6 years ago

Hi!

Here is my solution that works.

Create two conf files for Apache and replace WWW.YOURDOMAIN.COM and YOURDOMAIN.COM with your domain.

1) /etc/httpd/conf.d/mysite.conf:

  <VirtualHost *:80>
  ServerName WWW.YOURDOMAIN.COM:80
  DocumentRoot /var/www/html
  ProxyRequests Off
  SSLProxyEngine On
  SSLProxyVerify none
  SSLProxyCheckPeerCN off
  SSLProxyCheckPeerName off
  SSLProxyCheckPeerExpire off
  ProxyVia On
  ProxyPreserveHost Off
  GeoIPEnable On
  GeoIPScanProxyHeaders On
  GeoIPDBFile /usr/share/GeoIP/GeoIP.dat
  GeoIPDBFile /usr/share/GeoIP/GeoIPCity.dat
  GeoIPDBFile /usr/share/GeoIP/GeoIPASNum.dat
  <Proxy *>
    Order deny,allow
    Allow from all
  </Proxy>
  ProxyPass "/recaptcha" "https://www.google.com/recaptcha"
  ProxyPassReverse "/recaptcha" "https://www.google.com/recaptcha"
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    <If "%{ENV:GEOIP_COUNTRY_CODE} in { 'CN' }">
      RequestHeader unset Accept-Encoding
      FilterChain CUSTOMFILTER
      Substitute "s/www.google.com/WWW.YOURDOMAIN.COM/ni"
      Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.COM/ni"
    </If>
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "WWW.YOURDOMAIN.COM"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.COM"
  </VirtualHost>

2) /etc/httpd/conf.d/gstatic.conf:

  <VirtualHost *:80>
  ServerName gstatic.YOURDOMAIN.COM:80  
  SSLProxyEngine On
  ProxyVia On
  ProxyRequests Off
  ProxyPreserveHost Off
  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>      
  ProxyPass "/" "https://www.gstatic.com/"
  ProxyPassReverse "/" "https://www.gstatic.com/"      
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    RequestHeader unset Accept-Encoding
    FilterChain CUSTOMFILTER
    Substitute "s/www.google.com/www.YOURDOMAIN.COM/ni"
    Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.COM/ni"
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "WWW.YOURDOMAIN.COM"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.COM"
  </VirtualHost>

ykuz commented 6 years ago

Hello,

Does anyone have a solution for nginx?

rehfeldchris commented 6 years ago

@zypA13510

I had trouble with the config you posted on jan 30 until I took a clue from @joinso - it's important to add RequestHeader unset Accept-Encoding otherwise the Substitute ... was not working for me. I was using apache 2.4.29, in case the substitute behavior has changed at some point. My guess is that the substitute didn't work because the response was gzip'd, and stripping the header avoids that scenario.
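
That guess is easy to reproduce outside Apache. The following Python sketch is only an illustration of the effect, with a made-up script body and a plain byte replacement standing in for the Substitute filter: on gzip-compressed bytes the hostname does not appear as literal text, so there is nothing to match.

```python
import gzip


def substitute(data: bytes) -> bytes:
    """Plain byte replacement, standing in for the Substitute filter."""
    return data.replace(b"www.google.com", b"yourdomain.com")


# Hypothetical script body, repeated so that gzip actually compresses it.
body = b'var base = "https://www.google.com/recaptcha/api2/";\n' * 20
compressed = gzip.compress(body)

# On the compressed bytes the replacement finds nothing to change.
unchanged = substitute(compressed) == compressed

# On the uncompressed body every occurrence is rewritten.
rewritten = substitute(gzip.decompress(compressed))
```

Removing Accept-Encoding from the request asks the upstream server for an uncompressed response, which corresponds to the second case.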

rehfeldchris commented 6 years ago

So, now that I implemented this clever hack, I'd like to warn others that it doesn't work very well.

1) If I click the button to request an audio version of the captcha, it refuses, telling me "Your computer or network may be sending automated queries. To protect our users, we can't process your request right now. For more details visit our help page".

2) When I solve the captcha the normal way (clicking images with street signs etc.), it makes me click much more than usual. It feels like about 15-25 image clicks before it is satisfied.

3) The links for "Privacy" and "Terms" get rewritten to your domain, but the ProxyPass "/recaptcha" config ensures only URLs that start with /recaptcha are actually proxied to google.com, so these links fail. This could be fixed easily, but the previous two problems are probably very difficult or simply not solvable, making it a moot point.

Equim-chan commented 6 years ago

You may try recaptcha.net. It's official and accessible from China. Just change

https://www.google.com/recaptcha/api.js?render=explicit

to

https://recaptcha.net/recaptcha/api.js?render=explicit

in front end, and

https://www.google.com/recaptcha/api/siteverify

to

https://recaptcha.net/recaptcha/api/siteverify

in back end, and it should work as expected.
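
If the URLs live in configuration rather than being hard-coded, the swap can be done in one place. Here is a minimal Python sketch of that idea; the function name is an assumption, and it only touches www.google.com URLs under /recaptcha/.

```python
from urllib.parse import urlsplit, urlunsplit


def use_recaptcha_net(url: str) -> str:
    """Point a www.google.com/recaptcha/ URL at recaptcha.net instead."""
    parts = urlsplit(url)
    if parts.hostname != "www.google.com" or not parts.path.startswith("/recaptcha/"):
        # Leave every other URL untouched.
        return url
    return urlunsplit(parts._replace(netloc="recaptcha.net"))
```

The same helper covers both the front-end script URL and the back-end siteverify URL.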

rehfeldchris commented 6 years ago

So, I tried using recaptcha.net, but I see it still loads 1 asset from google.com

eg, the url that starts with: https://www.google.com/recaptcha/api2/anchor...

I tried putting this url into the "test url" tab of https://en.greatfire.org/analyzer but it said it failed to load

Is there any official comment from google on this?

sdemjanenko commented 6 years ago

I was able to get this to work.

In the page I added:

 <script>
  window['__recaptcha_api'] = "https://my.recaptcha.proxy.com/recaptcha_proxy/google_com/";
 </script>
 <script type="text/javascript" src="https://my.recaptcha.proxy.com/recaptcha_proxy/google_com/api.js"></script>

In Nginx I added (note: this is ERB which compiles to Nginx config):

location /recaptcha_proxy/#{proxy_path}/api.js {
  proxy_redirect off;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header Host "www.#{domain}";
  proxy_set_header User-Agent $http_user_agent;
  proxy_set_header Referer $http_referer;
  proxy_cookie_domain "#{domain}" $host;
  proxy_set_header Accept-Encoding "";
  sub_filter_once off;
  sub_filter_types text/css text/html text/javascript;
  sub_filter "www.gstatic.com/recaptcha" "$http_host/gstatic_proxy/recaptcha";
  sub_filter "fonts.gstatic.com/" "$http_host/gstatic_fonts_proxy/";
  sub_filter "recaptcha.anchor.Main.init(" "window.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nrecaptcha.anchor.Main.init(";
  sub_filter "recaptcha.anchor.ErrorMain.init(" "window.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nrecaptcha.anchor.ErrorMain.init(";
  sub_filter "recaptcha.frame.Main.init(" "window.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nrecaptcha.frame.Main.init(";
  sub_filter "recaptcha.frame.ErrorMain.init(" "window.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nrecaptcha.frame.ErrorMain.init(";
  sub_filter "importScripts(" "this.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nimportScripts(";
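  # api.js lives directly under /recaptcha/; this prefix location has no capture group, so $1 would be empty here
  proxy_pass https://www.#{domain}/recaptcha/api.js$is_args$args;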
}

location ~* ^/recaptcha_proxy/#{proxy_path}/api2/(.+)$ {
  ... same as above
  proxy_pass https://www.#{domain}/recaptcha/api2/$1$is_args$args;
}

location ~* ^/gstatic_proxy/recaptcha/(.+)$ {
  proxy_pass https://www.gstatic.com/recaptcha/$1$is_args$args;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header User-Agent $http_user_agent;
  proxy_set_header Referer $http_referer;
  proxy_set_header Accept-Encoding "";
  sub_filter_once off;
  sub_filter_types text/css text/html text/javascript;
  sub_filter "www.gstatic.com/recaptcha" "$http_host/gstatic_proxy/recaptcha";
  sub_filter "fonts.gstatic.com/" "$http_host/gstatic_fonts_proxy/";
  sub_filter "www.gstatic.c..?" "$http_host\\/gstatic_proxy";
  sub_filter "/recaptcha/api2/" "";
}

location ~* ^/gstatic_fonts_proxy/(.+)$ {
  proxy_pass https://fonts.gstatic.com/$1$is_args$args;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header User-Agent $http_user_agent;
  proxy_set_header Referer $http_referer;
}

msegzda commented 6 years ago

Is this workaround officially "supported"?

Augustin-FL commented 6 years ago

@msegzda "officially"? There is no such concept in China.

You need to understand something: when the Chinese government decides to ban websites, officials just tell Chinese ISPs "please ban Google", without any clear explanation of what "Google" is.

Because of this, Chinese ISPs ban websites according to their own interpretation, which is sometimes subjective. The two main ISPs in China (China Unicom and China Telecom) don't use the same ban list and/or ban depending on the area you are in... so basically, a website can be accessible at some times and not at others, in a chaos that is so representative of China.

msegzda commented 6 years ago

@Augustin-FL thanks for the reply. Let me clarify. I don't care about China specifically, and this is probably the wrong place, but what I'm asking is: what is Google's official stance on the community using these reverse-proxy workarounds? As you can see, a number of complex substitutions happen in Nginx or Apache filters in order to replace Google's servers with one's own domains and endpoints. If the reCAPTCHA sources change in those parts, all of these implementations blow up! So what I want to know is: does Google support or back the community doing this workaround, and are they (and the community) careful about changing code in ways that could break all the zillions of websites in China using reCAPTCHA?

SimonVillage commented 6 years ago

@sdemjanenko would you mind updating your nginx snippet? It seems like api2 is no longer available?

rowan-m commented 6 years ago

I'm updating the client on the v1.2 branch to allow you to set an arbitrary URL for the siteverify call which may help in testing environments or other situations.

rowan-m commented 6 years ago

Reliability of results and user experience if you're going through a proxy is really out of scope for this repo. I've updated the code to allow for setting of an explicit URL and I'm happy to take PRs that add RequestMethods for better working within a proxy.
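
For readers who want to see what an overridable siteverify URL amounts to, here is a language-neutral illustration in Python. It is not this library's API: the function name and defaults are assumptions, and only the secret, response and remoteip form fields come from the siteverify endpoint itself.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

DEFAULT_SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_siteverify_request(
    secret: str,
    token: str,
    url: str = DEFAULT_SITEVERIFY_URL,
    remote_ip: Optional[str] = None,
) -> Request:
    """Build (but do not send) the form POST for a siteverify call."""
    params = {"secret": secret, "response": token}
    if remote_ip is not None:
        params["remoteip"] = remote_ip
    # urlencode() escapes the values for a form-encoded request body.
    data = urlencode(params).encode("ascii")
    return Request(url, data=data, method="POST")
```

Passing url="https://recaptcha.net/recaptcha/api/siteverify", or the address of a test double, changes where the request goes without touching the rest of the code. The request could then be sent with urllib.request.urlopen and the JSON reply checked for its success field.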

eyeinsky commented 6 years ago

@zypA13510 do you do the yourdomain.com/static.yourdomain.com separation simply to avoid filename clashes? I.e if one is feeling lucky and there are no name clashes then one could serve a bunch of domains from within yourdomain.com/recaptcha?

zypA13510 commented 6 years ago

@eyeinsky

do you do the yourdomain.com/static.yourdomain.com separation simply to avoid filename clashes?

Another reason is that rewriting the path (the part after the hostname) in a reverse proxy is very troublesome and tends to produce undesired results.

if one is feeling lucky and there are no name clashes then one could serve a bunch of domains from within yourdomain.com/recaptcha?

Good luck with that. But personally, I don't think it's the right way to go (unless you are really limited to one domain and have no other choice).

rwat090 commented 6 years ago

Hi Everyone,

I have written a reverse proxy solution for our China customers, but the solution is unstable. For example, we have the following issue:

It takes a user about 8 to 9 verify requests before the user's response is accepted within the reCAPTCHA interface.

I receive the following response for the POST to "/recaptcha/api2/userverify?k=xxxx"

["uvresp","xxxxx",,0,null,null,null,null,["rresp","03xxxxx,null,120,["pmeta",["/m/01bjv",null,3,3,3,null,"bus",[] ] ,null,[1,3000] ]"dynamic",null,["bgdata","xxxx"]]

If I use the Google domain, I receive a successful reCAPTCHA response within 3 attempts

["uvresp","03xxxxx",1,120]

Basic Curl POST to Google reCAPTCHA Domain (Bypassing Proxy)

["uvresp",null,null,null,1]

I'm trying to understand the issue with the response. I'm wondering if it's related to the session. Does anyone have any tips or advice on how to decode the response to confirm the issue?

batou-mtcapthca commented 5 years ago

One can also consider using another captcha service as a fallback when reCAPTCHA fails to load. Here is an example by MTCaptcha: https://www.mtcaptcha.com/faq-recaptcha-fallback-mtcaptcha. MTCaptcha is not free, though it does have a relatively cheaper plan if you only need to support traffic in China.

Full transparency: I work for MTCaptcha, and it's an awesome service :-)

joinso commented 4 years ago

Hi!

I am posting a new version of my solution. It does not work, but perhaps someone like @rwat090 can help us.

In /etc/httpd/conf.d/yourdomain.conf:

  <VirtualHost *:80>        
  ServerName www.YOURDOMAIN.com:80  
  DocumentRoot /var/www/html
  ProxyRequests Off      
  SSLProxyEngine On
  SSLProxyVerify none 
  SSLProxyCheckPeerCN off
  SSLProxyCheckPeerName off
  SSLProxyCheckPeerExpire off      
  ProxyVia On
  ProxyPreserveHost Off      
  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>
  ProxyPass /s3fs-css/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPassReverse /s3fs-css/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPass /s3fs-js/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPassReverse /s3fs-js/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPass /s3fs-images/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPassReverse /s3fs-images/ https://static.YOURDOMAIN.com/s3fs-public/

  ProxyPass "/recaptcha" "https://www.google.com/recaptcha"
  ProxyPassReverse "/recaptcha" "https://www.google.com/recaptcha"
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    <If "%{ENV:COUNTRY_CODE} in { 'CN','HK' }">
      RequestHeader unset Accept-Encoding
      FilterChain CUSTOMFILTER
      Substitute "s/www.google.com/www.YOURDOMAIN.com/ni"
      Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.com/ni"
    </If>
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "www.YOURDOMAIN.com"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.com"
  </VirtualHost>

In ssl.conf:

  <VirtualHost *:443>
  ServerName www.YOURDOMAIN.com
  ProxyRequests Off      
  SSLProxyEngine On
  SSLProxyVerify none 
  SSLProxyCheckPeerCN off
  SSLProxyCheckPeerName off
  SSLProxyCheckPeerExpire off      
  ProxyVia On
  ProxyPreserveHost Off
  MaxMindDBEnable On   

  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>

  ProxyPass "/recaptcha" "https://www.google.com/recaptcha"
  ProxyPassReverse "/recaptcha" "https://www.google.com/recaptcha"
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    <If "%{ENV:COUNTRY_CODE} in { 'CN','HK' }">
      RequestHeader unset Accept-Encoding
      FilterChain CUSTOMFILTER
      Substitute "s/www.google.com/www.YOURDOMAIN.com/ni"
      Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.com/ni"
    </If>
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "www.YOURDOMAIN.com"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.com"              
  </VirtualHost>

  <VirtualHost *:443>
  ServerName gstatic.YOURDOMAIN.com
  ErrorLog logs/gstatic.com.error_log
  TransferLog logs/gstatic.com.access_log
  SSLProxyEngine On
  ProxyVia On
  ProxyRequests Off
  ProxyPreserveHost Off
  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>      
  ProxyPass "/" "https://www.gstatic.com/"
  ProxyPassReverse "/" "https://www.gstatic.com/"      
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    RequestHeader unset Accept-Encoding
    FilterChain CUSTOMFILTER
    Substitute "s/www.google.com/www.YOURDOMAIN.com/ni"
    Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.com/ni"
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "www.YOURDOMAIN.com"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.com"
  </VirtualHost>

In /etc/httpd/conf.d/maxmind_geolite2.conf:

  LoadModule maxminddb_module /usr/lib64/httpd/modules/mod_maxminddb.so    
  <IfModule mod_maxminddb.c>
    MaxMindDBEnable On
    MaxMindDBFile ASN_DB /usr/share/GeoIP/GeoLite2-ASN.mmdb
    MaxMindDBEnv MM_ASN ASN_DB/autonomous_system_number
    MaxMindDBEnv MM_ASORG ASN_DB/autonomous_system_organization

    MaxMindDBFile CITY_DB /usr/share/GeoIP/GeoLite2-City.mmdb
    MaxMindDBEnv MM_COUNTRY_CODE CITY_DB/country/iso_code
    MaxMindDBEnv MM_COUNTRY_NAME CITY_DB/country/names/en
    MaxMindDBEnv MM_CITY_NAME CITY_DB/city/names/en
    MaxMindDBEnv MM_LONGITUDE CITY_DB/location/longitude
    MaxMindDBEnv MM_LATITUDE CITY_DB/location/latitude    
    MaxMindDBEnv MM_REGION_CODE  CITY_DB/subdivisions/0/iso_code        

    MaxMindDBFile COUNTRY_DB /usr/share/GeoIP/GeoLite2-Country.mmdb
    MaxMindDBEnv COUNTRY_CODE COUNTRY_DB/country/iso_code  
  </IfModule>

In /etc/httpd/conf.d/gstatic.conf:

  <VirtualHost *:80>
  ServerName gstatic.YOURDOMAIN.com:80  
  SSLProxyEngine On
  ProxyVia On
  ProxyRequests Off
  ProxyPreserveHost Off
  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>      
  ProxyPass "/" "https://www.gstatic.com/"
  ProxyPassReverse "/" "https://www.gstatic.com/"      
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    RequestHeader unset Accept-Encoding
    FilterChain CUSTOMFILTER
    Substitute "s/www.google.com/www.YOURDOMAIN.com/ni"
    Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.com/ni"
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "www.YOURDOMAIN.com"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.com"
  </VirtualHost>

If you are behind a proxy, in /etc/httpd/conf.d/remoteip.conf:

  LoadModule remoteip_module modules/mod_remoteip.so
  RemoteIPHeader X-Forwarded-For

As @rwat090 says, the reCAPTCHA works and loads all images, but it takes about 10 tests to pass it. However, on submit, it says that the reCAPTCHA is wrong.

Regards, JOINSO

blinkybill commented 4 years ago

@JOINSO Maybe it's working fine, and by the tenth pass it's simply failing due to a timeout? (From memory, the reCAPTCHA client response code needs to be validated server side within 2 minutes or something.)

@rehfeldchris the original post is a couple of years old, but I'd expect that Google does this automatic switching of dependent JS files out of the box depending on where the request comes from. Otherwise the "recaptcha.net" domain they've offered as an alternative for "International" use cases would be pointless. E.g. for supporting someone in China, I don't think Google engineers would be silly enough to ask us on their official site to load the first script via "recaptcha.net" and simply have all subsequent dependent files still loading from "google.com".

somireddysathiDB commented 4 years ago

Hi,

We are using an ALB as the client-facing endpoint and have an Apache reverse proxy between the ALB and the application server. I implemented the same solution in the Apache reverse proxy without any virtual host. I get the JS file if I access it from the browser as "http:///recaptcha/api.js", but I get a 404 error when I try "https:///recaptcha/api.js". I checked the Apache reverse proxy logs: the connection to Google is established, but somehow Google returns a 404. Can you please share your thoughts on what is going wrong?

somireddysathiDB commented 4 years ago

Hi,

Can someone shed light on the above issue? I'd appreciate your input.

swetalina-orangescrum commented 1 year ago

Hello, for anyone who is still interested, I have made an Apache configuration that sets up a reverse proxy for reCAPTCHA using your own server under your own domain, yourdomain.com.

yourdomain.com/recaptcha -> www.google.com/recaptcha
static.yourdomain.com -> www.gstatic.com

Edit: moved to gist for easier maintenance https://gist.github.com/zypA13510/fc3669a4c6957f3593c6ebed76d1d433


Could anyone explain why https://www.gstatic.com does not open after setting the proxy URL on the server?