iamamac / autoproxy2pac

Bypass GFW's blockade on almost every browser
http://autoproxy2pac.appspot.com/

Set up some mirror sites, so that the online PAC url may be harder to block #6

Closed iamamac closed 14 years ago

iamamac commented 15 years ago

https://autoproxy2pac.appspot.com is blocked in some areas, which makes the auto-updating version of the PAC unusable. One way to solve this problem is to set up some mirror sites and hope that GFW won't block them all, or GAE itself.

UPDATE on 2010-1-18

Some new thoughts on making the auto-updating PAC URL accessible:

  1. Anyone can put up a web-proxy-like mirror on his/her overseas web host.

    A mirror (e.g. http://pac.autoproxy.org/tor ) will forward the user's request (especially the User-Agent, If-Modified-Since and If-None-Match HTTP headers) to http://autoproxy2pac.appspot.com/pac/tor and return the response.

  2. Make use of some online web proxies to generate additional PAC links besides the appspot one. For example, http://proxysurfatwork.info/.i/851697Oi8vYXV0b3Byb3h5MnBhYy5hcHBzcG90LmNvbS9wYWMvdG9y

    The downside is that the URL is not permanent, because the proxy may be blocked and we would have to switch to a new one. Furthermore, a web proxy may not be cache-friendly, so our outgoing bandwidth quota would be consumed quickly.

  3. Automatically upload the PAC file to Dropbox whenever gfwlist is updated.

    The downside is that it is difficult to maintain a different version of the PAC for each proxy tool, especially when users can customize the proxy URL and port, and even the rules themselves! Also, the Dropbox team hasn't provided an API yet.

  4. [Just an idea] Try to make use of the local HTTP proxy (the one actually used when visiting blocked sites).

    This is the request sent to an HTTP proxy (some HTTP request headers omitted):

        GET http://autoproxy2pac.appspot.com/pac/tor HTTP/1.1
        Host: autoproxy2pac.appspot.com

    Maybe there is a way to trick the web browser into mimicking this request, so that it retrieves the PAC file through the proxy instead of over a direct connection. This is what I have achieved so far:

    Add "127.0.0.1 autoproxy2pac.appspot.com" to the hosts file, then visit http://autoproxy2pac.appspot.com:8118/pac/tor (supposing the proxy port is 8118). The HTTP request will be:

        GET /pac/tor HTTP/1.1
        Host: autoproxy2pac.appspot.com

    It does not work yet.
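
Idea 1 above could be sketched roughly as follows. This is only a minimal Python sketch, not code from the project: the names `upstream_url`, `forwarded_headers` and `fetch_from_upstream` are hypothetical, and the path mapping (`/tor` on the mirror to `/pac/tor` upstream) is assumed from the example URLs in item 1.

```python
import urllib.error
import urllib.request

# Upstream PAC prefix taken from the example in item 1.
UPSTREAM_PREFIX = "http://autoproxy2pac.appspot.com/pac/"

# Request headers the mirror passes on, as listed in item 1.
FORWARDED_HEADERS = ("User-Agent", "If-Modified-Since", "If-None-Match")


def upstream_url(mirror_path):
    """Map a mirror path such as '/tor' to the upstream PAC URL (assumed convention)."""
    return UPSTREAM_PREFIX + mirror_path.strip("/")


def forwarded_headers(request_headers):
    """Select the headers to forward, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return {
        name: lowered[name.lower()]
        for name in FORWARDED_HEADERS
        if name.lower() in lowered
    }


def fetch_from_upstream(mirror_path, request_headers, timeout=10):
    """Forward one request upstream and return (status, headers, body)."""
    request = urllib.request.Request(
        upstream_url(mirror_path), headers=forwarded_headers(request_headers)
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, dict(response.headers), response.read()
    except urllib.error.HTTPError as error:
        # urllib reports non-2xx statuses, including 304 Not Modified, as
        # HTTPError; relay the status and headers unchanged so that the
        # client's cache validation still works.
        return error.code, dict(error.headers), error.read()
```

Relaying the 304 status instead of always fetching and returning the full file is what would keep such a mirror cache-friendly and save the upstream bandwidth quota.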

UniIsland commented 14 years ago

do you still want them to be on appspot.com, or do you prefer different domain names?

lovelywcm commented 14 years ago

Would you like to be mirrored by autoproxy.org?

iamamac commented 14 years ago

@lovelywcm The code is written for Google App Engine, which means it is not easy to host the whole site on other architectures, like LAMP. However, it is possible to mirror the core function of this site, i.e. the auto-updating PAC file, in the fashion of a web proxy.

UniIsland commented 14 years ago

actually, i think it's not necessary to host the whole thing on the mirror sites. merely setting up a portal to the PAC files is good enough. moreover, no keywords that trigger the gfw should appear on the appspot page. using a disposable mirror site is the recommended way to display potential gfw keywords, such as 自由门 (Freegate), Tor, etc. thus the main site would be accessible only thru an API once the mirror sites are created. if i have time, i'll make a simple mirror with the basic functions on one of my domains, within a week. today is the first day of my winter break, so i'm still quite busy.

iamamac commented 14 years ago

@UniIsland You cannot access AutoProxy2PAC because the whole of appspot.com is blocked, not because of a few sensitive keywords on the page. (Actually there is no content filtering for appspot.com yet.) Moreover, keywords that help users choose the appropriate PAC file will do no harm since the site is already blocked.

In my opinion, the scenario of mirror site should look like this:

  1. A user gets his/her PAC URL from autoproxy2pac.appspot.com, maybe through a proxy (which also ensures that he/she has proxy software installed). He/she will be able to choose the proxy software and add some custom proxy rules (available in the future).
  2. The AutoProxy2PAC site will return some mirror URLs in addition to the appspot one. Or the user can easily build a mirror URL by changing the domain (if the mirror follows the same path convention).

In this way, mirrors will keep a low profile and remain useful for longer.
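
Building a mirror URL by changing only the domain, as in step 2, could look like the sketch below. It assumes the mirror follows the same path convention as the appspot site; `mirror_url` is a hypothetical helper, and the host in the usage note is just one of the domains mentioned in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def mirror_url(pac_url, mirror_host):
    """Return pac_url with only its host replaced by mirror_host.

    The scheme, path and query string are kept, which is valid only if the
    mirror serves PAC files under the same paths (an assumption).
    """
    parts = urlsplit(pac_url)
    return urlunsplit(parts._replace(netloc=mirror_host))
```

For example, `mirror_url("http://autoproxy2pac.appspot.com/pac/tor", "pac.autoproxy.org")` keeps `/pac/tor` and swaps in the mirror's host.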

BTW, when you implement the mirror, please keep in mind to make it cache-friendly. Thanks for your contribution in advance :)

UniIsland commented 14 years ago

@iamamac, nope, subdomains of appspot.com are not blocked. try visiting autoproxy2pac.appspot.com without a proxy yourself.

iamamac commented 14 years ago

@UniIsland At least in TUNET (Tsinghua CERNET), all appspot sites are DNS hijacked. I have to set a hosts entry to use the auto-updating PAC myself.

UniIsland commented 14 years ago

@iamamac, i see. i'm not blocked either in beida (cernet) or at home (歌华). this is a minor issue, which we can discuss later.

UniIsland commented 14 years ago

i wrote a simple mirror page. check it out. http://xuejian.info/autoproxy.html it's small and easy to implement.

iamamac commented 14 years ago

@UniIsland Great work, but it seems not to be cache-friendly yet? Moreover, we just found an easy solution: sign up for Google Apps and bind autoproxy2pac under your domain. Check this out: http://www.gfwlist.tk and http://pac.autoproxy.org The only drawback is HTTPS is not supported for now, but that's not a big concern.

UniIsland commented 14 years ago

i modified the header part to make it cache-friendly. and it supports https natively if the server can be accessed thru https. what else do we need? please post to the issue page of the project. http://github.com/UniIsland/autoproxy2pac-portal/issues

UniIsland commented 14 years ago

using google apps is a good workaround. it's just a dns cname record hack, right? there's still the risk that appspot gets blocked by its ip address, though that's very unlikely to happen in the near future.

iamamac commented 14 years ago

@UniIsland Just setting the Cache-Control HTTP header is not enough: If-Modified-Since, If-None-Match, and the 304 response must be handled as well. Besides the Google Apps solution, we can do URL rewriting in apache/nginx or set up a simple forward-all mirror. (In case you haven't read the discussion on the gfwlist mailing list, refer to http://code.google.com/p/php-dynamic-mirror/ ) Perhaps there is no need to write a portal for autoproxy2pac specifically.
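
To illustrate the point about conditional requests, here is a minimal Python sketch of how a server or portal might answer If-None-Match and If-Modified-Since with a 304. It is not the autoproxy2pac implementation: the helper names, the SHA-1 based ETag, the `max-age` value, and the assumption that request headers arrive in a dict with canonical names are all illustrative choices.

```python
import hashlib
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_etag(body):
    """Derive a quoted ETag from the content (illustrative choice of hash)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def is_not_modified(request_headers, etag, last_modified):
    """Return True when the client's cached copy is still valid."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        tags = [tag.strip() for tag in if_none_match.split(",")]
        # Weak comparison: ignore a leading "W/" on the client's tags.
        tags = [tag[2:] if tag.startswith("W/") else tag for tag in tags]
        return "*" in tags or etag in tags
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: fall back to a full response
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution.
        return last_modified.replace(microsecond=0) <= since
    return False


def respond(request_headers, body, last_modified):
    """Return (status, headers, payload) for a PAC request.

    last_modified must be a timezone-aware datetime in UTC.
    """
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Cache-Control": "public, max-age=3600",  # assumed lifetime
    }
    if is_not_modified(request_headers, etag, last_modified):
        return 304, headers, b""  # validators only, no body
    return 200, headers, body
```

A client that sends back the `ETag` or `Last-Modified` value from an earlier response then receives an empty 304 instead of the whole PAC file.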

UniIsland commented 14 years ago

that's good. so i can stop. i know the php-dynamic-mirror project. it's surely a convenient way to do this. the portal wasn't much work for me. and we can just leave it there.

iamamac commented 14 years ago

@UniIsland Sorry to hear that. But you are welcome to join the development of autoproxy2pac, especially a redesign of the web interface.