net4people / bbs

Forum for discussing Internet censorship circumvention

trustpositif.kominfo.go.id – Indonesia blocklist query tool #401

Open wkrp opened 19 hours ago

wkrp commented 19 hours ago

The site https://trustpositif.kominfo.go.id/ appears to let you check whether a domain is on the Indonesian TrustPositif blocklist. However, access to the site has apparently been restricted to Indonesian IP addresses since 2023.

A Wayback Machine archive of 2023-10-07 has the text:

Isilah Domain/URL/Keyword yang ingin Anda cari pada kolom isian di bawah, cukup 1 bagian kata saja, misalkan: ‘Domain’. Kemudian klik ‘CARI DATA’ untuk melakukan pencarian. Anda tidak perlu menyertakan ‘http://’ pada awal kata pencarian ataupun trailing slash ‘/’ pada akhir kata pencarian.

[Cari data pemblokiran trustpositif]

Fill in the Domain/URL/Keyword that you want to search in the field below, just 1 part of the word, for example: ‘Domain’. Then click ‘SEARCH DATA’ to search. You do not need to include ‘http://’ at the beginning of the search word or trailing slash ‘/’ at the end of the search word.

[Search trustpositif blocking data]

I found out about this query tool from an issue on the Tor bug tracker about the blocking of Tor relay IP addresses in Indonesia.

Volunteers on OONI slack reported that some Tor relays in Indonesia were blocked by Kominfo in September 2024.

How to check

"To test, you have to use Indonesian IP because Kominfo restricted it to non Indonesian IP in 2023. There are currently 2 Tor relays that got blocked by Indonesian government as of September 22 2024."

https://trustpositif.kominfo.go.id

wkrp commented 7 hours ago

At the same 2023-10-07 Wayback Machine archive, I followed the "Download Blacklist TrustPositif" link (https://trustpositif.kominfo.go.id/assets/db/domains) and found an archive of that file too. Here's a compressed copy:

trustpositif.kominfo.go.id-domains-20230921193408.gz

It's a text file with 2,031,242 lines. (Compare to https://github.com/net4people/bbs/issues/316#issuecomment-1859043366: "This slide claims 2,501,070 domains and subdomains were blocked as of 2023-12-01.")

Each line of the file has a domain name. Judging by the looks of things, most of them are porn sites. The leftmost component of each domain string is censored with 4, 7, or 10 * characters:

p**********nisindonesia.wordpress.com
r****ondibrahim.com
b****idansaksi.com
m**********firun.forumotion.net
m****anmuslim.com
i**********neinstitute.org
i****vestama.com
i****ackmarket.com
j**********ckmarket.com
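For anyone who wants to tabulate the masking, here is a minimal Python sketch. It assumes, based only on the sample lines above, that each entry has some visible leading characters, a single run of `*`, and then the rest of the name. The names `MASKED`, `parse_masked`, and `mask_length_histogram` are made up for illustration. It makes no assumption about how many real characters each run of `*` hides.

```python
import re
from collections import Counter

# Assumed shape (inferred from the samples, not from any documentation):
# visible prefix, one run of '*', then the remainder of the domain.
MASKED = re.compile(r"^(?P<prefix>[^*]*)(?P<mask>\*+)(?P<rest>[^*]*)$")

def parse_masked(line):
    """Return (prefix, mask_length, rest), or None if the line does not fit the assumed shape."""
    m = MASKED.match(line.strip())
    if m is None:
        return None
    return m.group("prefix"), len(m.group("mask")), m.group("rest")

def mask_length_histogram(lines):
    """Count entries per mask length; lines that do not fit the assumed shape go under 0."""
    hist = Counter()
    for line in lines:
        parsed = parse_masked(line)
        hist[parsed[1] if parsed else 0] += 1
    return hist
```

Running `mask_length_histogram` over the whole file would show whether 4, 7, and 10 are really the only mask lengths in use.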

With the censoring left in place, there are 1,667,555 distinct lines in the file. Some of the duplicates would likely become distinct if the characters under the **** were to be revealed, for example:

a**********00.blogspot.com
a**********00.blogspot.com
a**********00.blogspot.com
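A sketch of how the total and distinct line counts could be reproduced from the compressed copy, using only the Python standard library. The function name `line_counts` is hypothetical, and the choice to strip only the trailing newline is an assumption. Different whitespace handling could shift the numbers slightly.

```python
import gzip
from collections import Counter

def line_counts(path):
    """Return (total_lines, distinct_lines, per-line Counter) for a gzipped text file."""
    # errors="replace" is a defensive assumption; the file is expected to be plain text.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        counts = Counter(line.rstrip("\n") for line in f)
    return sum(counts.values()), len(counts), counts
```

Calling `line_counts("trustpositif.kominfo.go.id-domains-20230921193408.gz")` and then `counts.most_common(10)` on the returned counter would list the masked strings that repeat most often.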

The Wayback Machine has other versions of the "domains" file: https://web.archive.org/web/20230921000000*/https://trustpositif.kominfo.go.id/assets/db/domains
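The other captures can also be listed programmatically through the Wayback Machine's CDX API. The following is a rough sketch rather than a tested crawler. The function names are made up, the choice of fields and filters is an assumption about what is useful here, and `list_captures` needs network access.

```python
import json
import urllib.parse
import urllib.request

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"
TARGET = "https://trustpositif.kominfo.go.id/assets/db/domains"

def cdx_query_url(target=TARGET):
    """Build a CDX query for successful captures, collapsing adjacent identical digests."""
    params = {
        "url": target,
        "output": "json",
        "fl": "timestamp,digest,length",
        "filter": "statuscode:200",
        "collapse": "digest",
    }
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params)

def parse_cdx_rows(rows):
    """Turn CDX JSON output (first row is the header) into a list of dicts."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]

def raw_capture_url(timestamp, target=TARGET):
    """URL of the archived bytes; the 'id_' suffix requests the unrewritten original."""
    return f"https://web.archive.org/web/{timestamp}id_/{target}"

def list_captures(target=TARGET):
    """Fetch and parse the capture list (performs a network request)."""
    with urllib.request.urlopen(cdx_query_url(target), timeout=60) as resp:
        return parse_cdx_rows(json.load(resp))
```

Each dict returned by `list_captures()` has a `timestamp` that can be passed to `raw_capture_url` to download that version of the file.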

It would make a good FOCI short paper, for example, to analyze the historical versions of this file and set up periodic monitoring to track changes in it. It's also worth checking if there's anything else of interest under https://trustpositif.kominfo.go.id/assets/.
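As a starting point for tracking changes, two snapshots could be compared as multisets rather than sets, since the masking makes distinct domains collapse into identical lines. This is only a sketch under that assumption, and `read_lines` and `diff_snapshots` are hypothetical names.

```python
import gzip
from collections import Counter

def read_lines(path):
    """Yield non-empty, stripped lines from a gzipped snapshot."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            line = line.strip()
            if line:
                yield line

def diff_snapshots(old_lines, new_lines):
    """Return (added, removed) as Counters of masked strings.

    Counter subtraction keeps only positive counts, so an entry that goes
    from 3 copies to 5 shows up in `added` with a count of 2.
    """
    old, new = Counter(old_lines), Counter(new_lines)
    return new - old, old - new
```

Because of the masking, a line that appears in both `added` and `removed` cannot occur, but a real domain swap hidden behind the same masked string would be invisible to this comparison.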