net4people / bbs

Forum for discussing Internet censorship circumvention

Active probing in Iran (Investigation) #183

Open legitYosal opened 1 year ago

legitYosal commented 1 year ago

By opening this issue I intend to achieve the following goals:

  1. Gathering credible, textual evidence of the use of active probing methods by the Islamic Republic
  2. Finding a general solution (tools, scripts, procedures) for monitoring and gathering this evidence from volunteers
  3. Exposing the active prober's general algorithm and analysing its methods and procedures
  4. Finding the best ways to camouflage servers against, and evade, the Iranian active prober

I have heard that the Islamic Republic of Iran (IRI) is using active probing and that IPs are being blocked, but I have not actually seen it myself, so I decided to open this issue.
IPs may already have been blacklisted. For example, a proxy server created on IP 1.2.3.0 will not work, while a different IP in the same datacenter will. One way to find out whether an IP is blacklisted is to run an HTTPS server on the designated port and try to connect to it from within Iran. (Not confirmed.)

The first step is to find out whether this is happening at all. I think we can find out from access logs with repeated IPs: for example, IP1=1.2.3.0 attempting SSH logins or Nginx requests on servers that are not owned by one person, or on entirely unrelated servers. To start gathering facts about active probing, we can simply post a list named SUS_IPS as a comment on this issue.
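The cross-server comparison described here could be sketched roughly as follows. The report format (one list of observed IPs per contributing server), the server labels, and the `min_servers` threshold are assumptions made for illustration; nothing in this thread fixes them.

```python
from collections import Counter


def suspicious_ips(reports, min_servers=2):
    """Return IPs seen in the logs of at least `min_servers` different servers.

    `reports` maps a server label to the IPs found in that server's logs.
    """
    counts = Counter()
    for ips in reports.values():
        counts.update(set(ips))  # count each IP at most once per server
    return sorted(ip for ip, n in counts.items() if n >= min_servers)


# Hypothetical reports from three unrelated volunteers (documentation-range IPs).
reports = {
    "server-a": ["192.0.2.10", "198.51.100.7", "192.0.2.10"],
    "server-b": ["192.0.2.10", "203.0.113.5"],
    "server-c": ["203.0.113.5", "192.0.2.10"],
}
SUS_IPS = suspicious_ips(reports)
print(SUS_IPS)
```

An IP that shows up on many servers is not necessarily a censor's prober, since ordinary Internet-wide scanners reach every public address, so such a list is only a starting point for closer inspection.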

We can use a simple script that runs on a server, collects the connecting IPs from different sources, and then lets us pick the suspicious IPs from its output. For example, on Ubuntu you can read IPs from the sshd logs using Python:

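import re
import whois  # python3 -m pip install python-whois

IPV4_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

if __name__ == '__main__':
    # Collect each distinct IPv4 address once, in first-seen order.
    ALL_IPS = []
    seen = set()
    # 't.log' is a copy of the sshd log; the same loop also works for
    # '/var/log/nginx/access.log'.
    with open('t.log', 'r') as logs:
        for line in logs:
            m = IPV4_RE.search(line)
            if m and m.group(0) not in seen:
                seen.add(m.group(0))
                ALL_IPS.append(m.group(0))

    for ip in ALL_IPS:
        print('*** IP: ', ip)
        try:
            who = whois.whois(ip)
        except Exception as exc:  # lookups can fail, e.g. no reverse DNS
            print('whois lookup failed:', exc)
            continue
        emails = who.get('emails') or []
        if isinstance(emails, str):  # a single address may come back as a string
            emails = [emails]
        print(who.get('organization'), ' '.join(emails))

arandomgstring commented 1 year ago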

@usefss

Let's assume the worst and consider a scenario where they are using active probes, similar to China's. Do you think that, in theory, there is any method better than running nginx/caddy/haproxy, etc. on the proxy server, which is what people in China currently do? That method exposes your server to everyone, so the log file will contain many IPs and is not well suited to a test as such, but any active prober will just see a simple website running on that server. Denying active probes doesn't look like a good idea either: they might, for example, block any IP that denies their probes.

My point is that we already have the best method to defeat active probes, so is there any need to invest time in gathering such logs?

legitYosal commented 1 year ago

@arandomgstring Yep, that concludes it, but it is never a bad idea to collect knowledge and data about what is unknown. Having a Nginx infront does a little actually what if it bypasses when having persian in that website? Who knows, right? Or: the blocking of VMess and VLESS seen in Iran never happened in China, so are the probers the same in both countries? Is there any data showing that this Nginx solution actually works? It would also be interesting to establish the mere fact that Iran uses active probing and has the manpower for it, and to try to locate the organizations and locations involved. Finally, it is only a matter of one or two years before the prober evolves and gets more intelligent, so by understanding it more transparently we could have more efficient solutions. So I will close this issue and hope that in the future some real researchers will work on this matter.

wkrp commented 1 year ago

@legitYosal this is a valuable topic to discuss. As you say, the idea behind testing for active probing is easy: set up a server on a fresh IP address, connect to it yourself, and see if any other IP addresses then connect to the server. It's important to be systematic, keep good records, and only change one variable at a time.

Here are some source code repositories from past projects on active probing.

sh4run commented 1 year ago

Please allow me to chime in. My understanding is that active probes are likely an indication that this IP (server) is being watched. The key to getting through a national firewall is not to attract its attention in the first place, rather than how to react to or handle active probes properly.

How do you avoid getting the firewall's attention? The firewall's detection methods that I can think of might include:

  1. Packet length pattern matching. This should be the easiest method to enforce, as it doesn't require DPI.
  2. Byte pattern matching. This requires DPI to scan the entire packet content. DPI is expensive, and deploying a new DPI rule is difficult and slow.
  3. Traffic measurement. This is more complex. I guess the scale at which one firewall can deploy such monitoring might be limited.
  4. Some kind of AI or HI (human intelligence) analysis. I guess this might be the most expensive way.

The more expensive a technology is to detect, the better its chance of surviving.

My 0.02.

arandomgstring commented 1 year ago

@legitYosal

@arandomgstring Yep, that concludes it, but it is never a bad idea to collect knowledge and data about what is unknown,

Collecting data is always a good idea, but without enough motivation I don't think many people will participate in such data collection; that's what I meant. What is the motivation for such data collection? However,

having a Nginx infront does a little actually what if it bypasses when having persian in that website? who knows right

I didn't understand that sentence. Can you rephrase it?

Finally, it is only a matter of one or two years before the prober evolves and gets more intelligent, so by understanding it more transparently we could have more efficient solutions.

If we are talking about Iran, I really "hope" so. The way I see it, unlike China, the government doesn't care much about the economy, people's needs, etc. Therefore, I predict that in the near future, instead of dealing with active probing, they will use a whitelist of certain IPs, or make the Internet accessible only to certain people. That would be much better for them cost-wise.

@sh4run

From what I know, "active probes" are packets generated by the firewall and sent to an IP/domain to see whether that IP/domain reacts like a proxy server. The other methods you mentioned are usually separate.

wkrp commented 1 year ago

From what I know, "active probes" are packets generated by the firewall and sent to an IP/domain to see whether that IP/domain reacts like a proxy server. The other methods you mentioned are usually separate.

Even though they work differently, it still makes sense to think of passive detection and active probing together. This is because (if past experience in China is a guide), they are used as two elements in a pipeline or chain. First, passive detection is used to identify servers that may be proxy servers, possibly with only a low degree of confidence. Second, active probes are sent to the server to confirm the guess.

What is the reason for a two-step process? My interpretation has always been that it is a way to reduce the false positives of passive detection alone; or, alternatively, to avoid the high cost of active detection alone. The passive detection algorithm is cheap but may be imprecise: it may have too many false positives to be usable as a blocking filter directly. Active probing, on the other hand, is highly precise, but expensive. Therefore passive techniques are used first, to reduce the total number of candidate proxy servers; then those servers are active-probed, and the active probing results are used to make IP blocking rules.

My understanding is that active probes are likely an indication that this IP (server) is being watched. The key to getting through a national firewall is not to attract its attention in the first place, rather than how to react to or handle active probes properly.

I think it's both. It's an AND relation. A server gets blocked only when (1) its traffic is passively detected AND (2) it responds in a certain way to active probing. Breaking either link in the chain is sufficient to avoid blocking (or at least that's what we have seen in China). You can either evade passive detection, so that no active probes are even sent; or, if you do get passively detected, you can respond to active probes in a way that doesn't cause the firewall to think you are a proxy server.
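The AND relation and the cost argument can be illustrated with a toy model. This is only a sketch of the two-stage idea as described in this thread, not a model of any real firewall; the server records and field names are hypothetical.

```python
def run_pipeline(servers):
    """Toy two-stage censor: cheap passive filter first, costly probe second.

    Each server is a dict with the hypothetical keys 'name',
    'passively_flagged' and 'answers_like_proxy'.
    Returns (names of blocked servers, number of probes spent).
    """
    blocked = []
    probes_sent = 0
    for server in servers:
        if not server["passively_flagged"]:
            continue  # stage 1 rejected it, so no probe is ever sent
        probes_sent += 1  # stage 2: the expensive active probe
        if server["answers_like_proxy"]:
            blocked.append(server["name"])
    return blocked, probes_sent


servers = [
    {"name": "plain-proxy", "passively_flagged": True, "answers_like_proxy": True},
    {"name": "probe-resistant", "passively_flagged": True, "answers_like_proxy": False},
    {"name": "well-disguised", "passively_flagged": False, "answers_like_proxy": True},
    {"name": "ordinary-site", "passively_flagged": False, "answers_like_proxy": False},
]
blocked, probes_sent = run_pipeline(servers)
print(blocked, probes_sent)
```

In this model only the server that fails both stages is blocked, and probes are spent only on the two servers that the passive stage flagged.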

Some background is "A practical guide to defend against the GFW's latest active probing". The firewall in China was detecting Shadowsocks servers using a combination of passive detection and active probing. Servers that adopted the hardening recommendations continued to be active-probed, but no longer got blocked.

@arandomgstring has a good point that with the government of Iran currently in a crisis, its tolerance for collateral damage may be different from usual, or different from that of the government of China.

I hasten to add that all this discussion is still speculative in the case of Iran. So far, I have seen no evidence that active probing is used in Iran. Because the active probing experiment is so easy to perform, I would guess that if active probing were happening, it would have been noticed by now. But it would be good to have it actually documented, yes or no.

sh4run commented 1 year ago

I started from this article. Then I came up with the idea of filtering incoming connections based on IP geolocation. I added it to Shadowsocks and brought it online (https://github.com/sh4run/sss#ip-geolocation-based-filtering). The new feature worked as expected: all active probes were identified and filtered (the connection was accepted, but subsequent incoming data was silently dropped). Unfortunately, after half a month my testbed was still blocked (ip-block) by GFW. Therefore I reached my conclusion that the key point is to not get GFW's attention.
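As a rough illustration of source-based filtering (this is not the code from the sss repository, whose implementation is not shown in this thread), the decision can be reduced to a membership check. A hard-coded list of documentation-range networks stands in here for a real geolocation lookup, and all names are hypothetical.

```python
import ipaddress

# Assumption: networks the legitimate clients are expected to connect from.
# A real deployment would consult a geolocation database instead.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def should_serve(peer_ip: str) -> bool:
    """Return True if the peer address falls inside an allowed network.

    A server following the approach described above would still accept the
    TCP connection from other peers, but silently discard whatever they send.
    """
    addr = ipaddress.ip_address(peer_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


print(should_serve("203.0.113.77"), should_serve("192.0.2.1"))
```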

Some background is "A practical guide to defend against the GFW's latest active probing". The firewall in China was detecting Shadowsocks servers using a combination of passive detection and active probing. Servers that adopted the hardening recommendations continued to be active-probed, but no longer got blocked.

My tests also show that when a new Shadowsocks server comes online, GFW probes are received immediately after the first connection is initiated. This means the GFW can identify Shadowsocks precisely, not through traffic measurement or some mysterious big-data analysis, but through some characteristic of Shadowsocks itself.

I am not sure what happened in Iran. But I guess before sending out active probes, the firewall must have done some passive scanning on the traffic.

kyumath commented 1 year ago

We have some ideas for bypassing the GFW's active probing. (Strong evidence shows that Iran uses exactly the same filtering system as China, i.e. the GFW.)

Idea 1: Active probes come from thousands of IPs (see the references), so you should block all IPs except your clients' IPs. The client opens a link such as "domain.com/?pass=x" to send its IP; you verify the client's password and then allow this IP in the Linux firewall. Remove the rule when the client no longer uses the proxy. All other IPs are blocked, which blocks active probing effectively.

Idea 2: Block Iranian sites and domains on your server with v2ray's geoip:ir.

Idea 3: Run a good-looking website on ports 80 and 443 to pretend that you are a normal web/FTP server, as the GFW uses machine learning to identify VPNs.

Idea 4: Limit your server's traffic, or change ports and switch between TCP and UDP from time to time. The GFW triggers blocking if traffic stays up for some time (about 4 hours or so), so if your traffic switches between servers or between TCP and UDP every hour, it prevents passive detection.

Does it work? To some degree, yes! We admit that idea 1 postpones the blocking of the server, but the GFW uses all available techniques and we need to do the same. Our experiment is still in progress.

Some references to read more:

https://github.com/groundcat/Block-GFW-Active-Detection
https://gfw.report/blog/gfw_shadowsocks/
https://www.opentech.fund/news/exposing-the-great-firewalls-dynamic-blocking-of-fully-encrypted-traffic/

Detecting Probe-resistant Proxies: https://pdfs.semanticscholar.org/0097/e2c7d8b3bab7db3e5d799c14b5e0b6c64fd5.pdf

The Great Firewall’s active probing circumvention technique with port knocking and SDN: https://aaltodoc.aalto.fi/bitstream/handle/123456789/102539/master_Liubinskii_Pavel_2021.pdf

arandomgstring commented 1 year ago

@kyumath

so you should block all ips except your client ips

That's impractical, to say the least, because clients' IPs are dynamic; they change at random times. We cannot add a firewall rule every single time an IP changes. Moreover, a man in the middle such as an ISP could use the client's IP for active probes too (in theory, at least). In other words, the ISP sends packets with the client's IP to the server, and your firewall will let them through. The best way is idea 3; the problem is that if censors inspect the web server and don't find reasonable traffic from it, your IP will again be blocked.

kyumath commented 1 year ago

@arandomgstring

I deployed idea 1 and it is the most effective of them. (It is called port knocking; see the last two references for more information.) Yes, we did it in a simple way:
- Give each client a unique link containing a password, e.g. domain.com/?pass=12345.
- When the client opens the link, you simply get its IP and add it to ufw or iptables.
- The censor does not have a valid password to register its IP.
- Clients' IPs are dynamic, so every time a user wants to connect, they need to open the given link first.
- The client's IP is allowed for a single port only and is removed after 3 minutes when the user has no connection, so the GFW can't use that IP for active probing of all ports.
Also, the GFW would need sophisticated hardware/software and can't simply mount a man-in-the-middle attack. I tested it and confirm that it kills GFW active probing. But the GFW's passive traffic analysis is still an issue, and we are researching it.
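The bookkeeping behind this scheme might look like the sketch below. It only tracks which IPs are currently allowed; actually adding and removing ufw or iptables rules is left out, and the class name, method names, and injected clock are hypothetical choices made so the logic can be exercised without a real firewall.

```python
import hmac
import time


class KnockAllowlist:
    """Track client IPs that presented the right password recently."""

    def __init__(self, password, idle_timeout=180.0, clock=time.monotonic):
        self._password = password.encode()
        self._idle_timeout = idle_timeout  # 3 minutes, as in the description
        self._clock = clock
        self._last_seen = {}  # ip -> time of last knock or proxy traffic

    def knock(self, ip, supplied_password):
        """Register `ip` if the password from the link is correct."""
        ok = hmac.compare_digest(self._password, supplied_password.encode())
        if ok:
            self._last_seen[ip] = self._clock()
        return ok

    def touch(self, ip):
        """Reset the idle timer when an allowed IP sends proxy traffic."""
        if ip in self._last_seen:
            self._last_seen[ip] = self._clock()

    def is_allowed(self, ip):
        """Return True while `ip` is registered and not idle for too long."""
        seen = self._last_seen.get(ip)
        if seen is None:
            return False
        if self._clock() - seen > self._idle_timeout:
            del self._last_seen[ip]  # here the firewall rule would be removed
            return False
        return True


allowlist = KnockAllowlist("12345")
allowlist.knock("203.0.113.5", "12345")
print(allowlist.is_allowed("203.0.113.5"), allowlist.is_allowed("192.0.2.99"))
```

Comparing the password with `hmac.compare_digest` avoids leaking information through comparison timing; whether that matters in practice depends on how the link endpoint is exposed.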

arandomgstring commented 1 year ago

@kyumath

Interesting. In that case, the whole process can be automated (i.e. the user doesn't have to click on a link; a simple piece of code does that for them every time they try to connect to the server).

And yes, to my knowledge, the current GFW uses its own IPs to probe servers. However, that does not rule out a man-in-the-middle attack. As you can see in my previous post, I said that "in theory" they could use the client's own IP in their active probes. It is a bit hard, since it can disrupt the user's normal traffic unless they get the timing right, which is difficult in networking. Nevertheless, I think it is possible.

And another point. What if they decide to block a domain that active probes don't have access to, but a client has?

wkrp commented 1 year ago

Unfortunately, after half a month my testbed was still blocked (ip-block) by GFW. Therefore I reached my conclusion that the key point is to not get GFW's attention.

I expect that this is actually a different, newer phenomenon. Besides active probing Shadowsocks servers, the GFW started (November 2021) dynamically detecting and blocking connections whose bytes are highly random. There is not much yet to read about it, but it is still happening. The randomness detection does not require active probing.

@kyumath linked to the best available report so far: https://www.opentech.fund/news/exposing-the-great-firewalls-dynamic-blocking-of-fully-encrypted-traffic/ https://geneva.cs.umd.edu/posts/fully-encrypted-traffic/en/

There's also a thread here: #136.

We worked with other researchers to discover that the current GFW utilizes a number of different rules to identify fully encrypted protocols like Shadowsocks, VMess, and Obfs4. One of these rules takes advantage of the fact that the ratio of 0 bit to 1 bit in these encrypted flows is close to 1:1. Therefore, if we add more 0s or 1s to the encrypted traffic and then rearrange the bit sequence, we can achieve the goal of changing the original ratio feature to bypass detection and blocking.

The randomness-based detection is another thing to have to think about, but it is separate from active probing.
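The bit-ratio feature mentioned in the quoted report can be measured with a few lines of Python. This sketch only computes the fraction of 1 bits in a payload; the function name and sample payloads are made up for illustration, and no threshold used by any real firewall is implied.

```python
import os


def ones_fraction(data: bytes) -> float:
    """Return the fraction of bits in `data` that are set to 1."""
    if not data:
        raise ValueError("empty payload")
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (8 * len(data))


# An ASCII HTTP request versus random bytes standing in for ciphertext.
plaintext = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
random_like = os.urandom(4096)
print(ones_fraction(plaintext), ones_fraction(random_like))
```

Random-looking data lands very close to 0.5, while ASCII text sits noticeably lower because the top bit of every byte is 0.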

rezarms commented 1 year ago

@legitYosal From my experience: I have tried different providers (DigitalOcean, Kamatera, Linode, Hetzner, IBM, Oracle, etc.). Most protocols didn't work from the beginning, and some got blocked after working for 10 minutes. I checked that the IP and port weren't blocked, because I could run a website on that port and open it from Iran. I tested those servers with Shadowsocks, OpenVPN, V2Ray, VMess, VLESS, Trojan, Outline, WireGuard, OpenConnect, SoftEther, etc., and unfortunately none of them worked. It seems active probing is active for some providers and some regions because, for example, I ran a server in one AWS region (it still didn't work in other regions) and was able to make OpenVPN, Outline, Shadowsocks, and V2Ray work on it; it has been running for 2-3 weeks now with no issues as I write this (knock on wood). Another issue: I had a server in Oracle Cloud that worked on one mobile provider, Hamrah-e Aval, but not on another, Irancell.

What is weird to me is that I had servers running at most of the providers, and for most of them I couldn't connect from the beginning, so it wasn't active probing, since nothing was running yet and the IP and ports weren't blocked. So if DPI detects the protocol, why does it work with another provider, in my case a specific AWS region? Unless they apply DPI and active probing only to specific providers and regions!

wkrp commented 1 year ago

It seems active probing is active for some providers and some regions because

A server being blocked is not itself evidence of active probing. There are many other reasons why a server might get blocked: dynamic protocol detection, TLS fingerprinting, destination IP range, connection volume and lifetime, etc.

The government of Iran is known to use many blocking techniques. This thread is about investigating whether or not active probing is among those techniques. So far, I have not seen any evidence that active probing is used in Iran. The way to test it is to run a packet capture on the server, connect to the server as a client, and then check if the packet capture shows any other IP addresses (other than the known client) connecting to the server.
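One possible way to do the last step is to scan the text output of `tcpdump -n` for connection attempts that do not come from the known client. The sketch below handles IPv4 only; the sample capture lines, addresses, port 8388, and function name are all invented for illustration.

```python
import re

# Matches an initial SYN (Flags [S]) in `tcpdump -n` text output, e.g.
# "12:00:00.000001 IP 203.0.113.5.51234 > 198.51.100.7.8388: Flags [S], ..."
SYN_LINE = re.compile(
    r"IP (?P<src>\d{1,3}(?:\.\d{1,3}){3})\.\d+ > "
    r"(?P<dst>\d{1,3}(?:\.\d{1,3}){3})\.(?P<dport>\d+): Flags \[S\]"
)


def unexpected_sources(capture_lines, server_ip, server_port, known_clients):
    """List source IPs that opened connections to the server but are not ours."""
    found = []
    for line in capture_lines:
        m = SYN_LINE.search(line)
        if not m:
            continue  # not a connection attempt (SYN-ACK, data, other traffic)
        if m["dst"] != server_ip or int(m["dport"]) != server_port:
            continue
        if m["src"] not in known_clients and m["src"] not in found:
            found.append(m["src"])
    return found


capture = [
    "12:00:00.000001 IP 203.0.113.5.51234 > 198.51.100.7.8388: Flags [S], seq 1, win 64240, length 0",
    "12:00:00.000050 IP 198.51.100.7.8388 > 203.0.113.5.51234: Flags [S.], seq 2, ack 2, win 65160, length 0",
    "12:00:03.100000 IP 192.0.2.99.40022 > 198.51.100.7.8388: Flags [S], seq 9, win 29200, length 0",
    "12:00:03.200000 IP 192.0.2.99.40022 > 198.51.100.7.8388: Flags [P.], seq 10:60, ack 3, win 229, length 50",
]
print(unexpected_sources(capture, "198.51.100.7", 8388, {"203.0.113.5"}))
```

Any address this reports still needs manual inspection of what it sent and when, because background scanning of public IPs would show up in the same way.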

free-the-internet commented 1 year ago

It seems active probing is active for some providers and some regions because

A server being blocked is not itself evidence of active probing. There are many other reasons why a server might get blocked: dynamic protocol detection, TLS fingerprinting, destination IP range, connection volume and lifetime, etc.

The government of Iran is known to use many blocking techniques. This thread is about investigating whether or not active probing is among those techniques. So far, I have not seen any evidence that active probing is used in Iran. The way to test it is to run a packet capture on the server, connect to the server as a client, and then check if the packet capture shows any other IP addresses (other than the known client) connecting to the server.

Personally, I saw other IPs from China, Russia, and a few from Amazon servers appear in the logs of a v2ray server. But I cannot confirm their activity. As you said, a capture is needed.

kyumath commented 1 year ago

To summarize my experience so far:

Filtering operates regionally: each city gets its connectivity over one route, and each has a separate GFW. The intensity and type of blocking are tuned per operator. Irancell has practically become a national network; it is as if all protocols and IPs are blocked except for a whitelist. There is both passive analysis and active probing. An ordinary server IP is blocked as soon as it carries traffic, and the whole IP is blocked; it makes no difference whether you use TCP or UDP, a Cloudflare domain or the direct IP, or which port. The rule is: you have traffic + you are hosted somewhere well known = block. Because people keep changing IPs, many IPs at Hetzner and similar providers are already filtered; you may have to buy 10 IPs to find one that is open.

- One way to fight back is to use a server inside the country and tunnel abroad.
- One way is to hide an xray websocket behind an nginx proxy, which prolongs the server's life (we are testing this; I will post the code next week).
- One way is to block all IPs except your customers' with ufw, which also extends the server's life (traffic still matters, because passive blocking happens too).
- One way is Elon Musk.

And when international Internet access is cut off completely in the coming weeks, the street is the only option they leave to the oppressed people.

rezarms commented 1 year ago

@kyumath I had the idea of an nginx proxy but couldn't make it work. It would be great if you could share the code.

kyumath commented 1 year ago

@kyumath I had the idea of an nginx proxy but couldn't make it work. It would be great if you could share the code.

https://github.com/GFW-knocker/gfw_resist_http_proxy

We are working on that. We tested it a few days ago; it's very promising, but lots of work remains.

rezarms commented 1 year ago

@kyumath I had the idea of an nginx proxy but couldn't make it work. It would be great if you could share the code.

https://github.com/GFW-knocker/gfw_resist_http_proxy

We are working on that. We tested it a few days ago; it's very promising, but lots of work remains.

Thanks

rezarms commented 1 year ago

@kyumath

After finishing your test, could you also share your configuration for Nginx and v2ray? I'm keen to test it with the Irancell provider.

GibMeMyPacket commented 1 year ago

@kyumath I had the idea of an nginx proxy but couldn't make it work. It would be great if you could share the code.

https://github.com/GFW-knocker/gfw_resist_http_proxy

We are working on that. We tested it a few days ago; it's very promising, but lots of work remains.

I don't have a server and so can't test this, but isn't that what HAProxy can already do? See "Path-based Routing with HAProxy" (HAProxy Technologies).

rezarms commented 1 year ago

@kyumath Part of the script is not clear to me, such as the line below:

b'GET /pub/firefox/releases/latest/win64/en-US/Firefox-Setup.exe/

I set up VMess over TCP with nginx, but it seems the proxy only redirects to nginx.

kyumath commented 1 year ago

@kyumath Part of the script is not clear to me, such as the line below:

b'GET /pub/firefox/releases/latest/win64/en-US/Firefox-Setup.exe/

I set up VMess over TCP with nginx, but it seems the proxy only redirects to nginx.

You need to set up tcp+http camouflage with exactly the same path as above for the request to be detected as an xray request.

The client config looks like this:

vmess://ew0KICAidiI6ICIyIiwNCiAgInBzIjogIm5naW54IHRlc3QiLA0KICAiYWRkIjogIjEyMi4xOTIuMTExLjE0MCIsDQogICJwb3J0IjogIjgwIiwNCiAgImlkIjogIjg2NWFlNGFkLTdiNTEtNDlmMi1iMWExLTg0ZDlmMTExOWU1ZiIsDQogICJhaWQiOiAiMCIsDQogICJzY3kiOiAiYXV0byIsDQogICJuZXQiOiAidGNwIiwNCiAgInR5cGUiOiAiaHR0cCIsDQogICJob3N0IjogImZ0cC5tb3ppbGxhLm9yZyIsDQogICJwYXRoIjogIi9wdWIvZmlyZWZveC9yZWxlYXNlcy9sYXRlc3Qvd2luNjQvZW4tVVMvRmlyZWZveC1TZXR1cC5leGUvIiwNCiAgInRscyI6ICIiLA0KICAic25pIjogIiIsDQogICJhbHBuIjogImh0dHAvMS4xIg0KfQ==
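A share link of this form is base64-encoded JSON after the `vmess://` prefix, so it can be inspected with a short helper. The function name is hypothetical, and the sample link below is built from made-up values rather than copied from the link above.

```python
import base64
import json


def decode_vmess_link(link: str) -> dict:
    """Decode a vmess:// share link into its JSON settings."""
    prefix = "vmess://"
    if not link.startswith(prefix):
        raise ValueError("not a vmess:// link")
    payload = link[len(prefix):].strip()
    payload += "=" * (-len(payload) % 4)  # restore padding if it was stripped
    return json.loads(base64.b64decode(payload).decode("utf-8"))


# Hypothetical settings, encoded and decoded again to show the round trip.
sample = {"add": "192.0.2.1", "port": "80", "net": "tcp", "type": "http",
          "path": "/example/path/"}
link = "vmess://" + base64.b64encode(json.dumps(sample).encode()).decode()
print(decode_vmess_link(link)["path"])
```

When decoding the link posted above, the fields to check are `net`, `type`, and `path`, which need to match the tcp+http camouflage settings just described.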

Soon I will update the page with some help and usage instructions. We are working hard on that.

Phoenix-999 commented 11 months ago

Hey @legitYosal, FYI: there's an issue we're currently tackling over at https://github.com/XTLS/Xray-core/issues/2778.

We've been working on it for the past three weeks, collaborating with the @gfw-report Team, @irgfw, and several other capable and motivated individuals. Hopefully, we'll get to the bottom of this soon enough.