anudeepND / blacklist

Curated and well-maintained hostfile to block ads, tracking, cryptomining, and more! Updated regularly. ⚡🔒
https://hosts.anudeep.me/mirror/adservers.txt
MIT License
1.09k stars 112 forks

Website Broken-timesofindia.indiatimes.com #181

Closed blackice19 closed 2 years ago

blackice19 commented 2 years ago

Issue Submit Form

Provide the following info properly, which will help me to resolve your issue quickly.

Issue(s):

Type x in between [ ] and make sure there isn't any space between brackets. Example; for Your Selected Issue(s), type like this - [x] You can select more than one category of issues if you need to!


Domain(s):

If you are submitting this issue for whitelist/blacklist issue, send me the domain(s) for whitelisting/blacklisting here. Kindly use the Code Tag to prevent tracking.


Details:

You can attach any screen shot or log of the issue or advert, this will help to highlight it.

When I open the timesofindia.indiatimes.com website (tried on Safari and Firefox), it reaches the initial landing page but does not load any further (see the attached PDF). Some of the domains pertaining to indiatimes are blocked. I unblocked the first three, but I suspect the ibeat.xxx subdomain may be the main culprit, as the .js for it is not loaded. However, this domain may be trackware (?).

TOI.pdf

spirillen commented 2 years ago

I have no issues and no calls to ibeat.xxx. According to your screenshot, it could be that you have not accepted their spyware...

indiatimes.com##.consent-popup
indiatimes.com##+js(addEventListener-defuser, copy)

May I suggest that, in addition to this hosts file, you also install uBlock Origin to help protect yourself?

blackice19 commented 2 years ago

Thank you for responding to my request.

I had tried to put in the subdomains of initial interest, but it did not go in. I did some further digging and am sharing some results I found:-

  1. With the default settings of the filter, the website did not open, both with and without clicking the consent pop-up. So I started, via trial and error, to identify the subdomains it was calling, to see which blocked domain was causing the issue.

  2. I initially identified 4 subdomains (agi-static.indiatimes.com, jssocdn.indiatimes.com, jsso.indiatimes.com, api.ibeat-analytics.com) as potential candidates. The screenshot I sent was from the session after I removed the blocks for the first 3 domains. The implementation is at the DNS level, and information on blocking is available to me as a log. I am attaching the DNS log from that run, taken when trying to open the site (Current_Config.pdf).

  3. My domain identification was wrong. You are correct in saying that blocking ibeat.xx is not the cause. It turned out to be geoapi.indiatimes.com. This has been an old Achilles heel, if I remember correctly. Allowing this domain solved the issue. There was no need to allow the previous 4 domains.

  4. I understand where you are coming from vis-a-vis implementing it in something like uBlock. However, I am enforcing DNS-level filtering, and there are additional benefits to using it, especially at the system level, so I do not wish to slow down my browsers with additional add-ons. (No uBlock for Safari!)

  5. I experimented with uBlock on FF. I blocked geoapi.indiatimes.com there, and lo and behold, it didn't load. (FF_ublock_geoapi.pdf)

  6. It looks to me that the geoapi thing is a .js implementation of TOI. I agree it needs to be blocked, but as it is breaking the website, a higher level blocking (firewall) may be required instead. I also did some digging on it and think it is an alias! (nslook.png)
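
The trial and error in steps 1 to 3 comes down to checking which of the hostnames seen in the DNS log are present in the blocklist. A minimal Python sketch of that check, assuming a hosts-format list (an IP address followed by a hostname on each line). The function name and the sample entries are illustrative only and are not taken from the actual list:

```python
def blocked_candidates(hosts_text, candidates):
    """Return the candidate hostnames that appear in a hosts-format blocklist."""
    blocked = set()
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # A hosts entry is "<ip> <hostname> [aliases...]"; skip the IP field.
        for name in fields[1:]:
            blocked.add(name.lower())
    return [c for c in candidates if c.lower() in blocked]


# Illustrative sample only; not the contents of the real list.
sample_hosts = """
# example entries
0.0.0.0 geoapi.indiatimes.com
0.0.0.0 jsso.indiatimes.com
"""

# Hostnames as they might appear in a DNS query log (assumed for the example).
seen_in_dns_log = [
    "timesofindia.indiatimes.com",
    "geoapi.indiatimes.com",
    "jsso.indiatimes.com",
]

print(blocked_candidates(sample_hosts, seen_in_dns_log))
```

Unblocking the reported hostnames one at a time and reloading the page then narrows down which one actually breaks the site.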

Nevertheless, your filter list is very effective in combating trackware and is part of the composite filter list I have compiled for DNS-level filtering. Thank you!!

Current_Config.pdf
FF_ublock_geoapi.pdf
nslookup

spirillen commented 2 years ago

Just a little "setting things straight"

  4. I understand where you are coming from vis-a-vis implementing it in something like uBlock. However, I am enforcing DNS-level filtering, and there are additional benefits to using it, especially at the system level, so I do not wish to slow down my browsers with additional add-ons. (No uBlock for Safari!)

My heart is in the DNS RPZ firewall, not in uBlock. uBlock is an add-on to this, as you can do things in a more fine-grained way there. Nonetheless, I fully agree with you on using it as a firewall.

  2. I initially identified 4 subdomains (agi-static.indiatimes.com, jssocdn.indiatimes.com, jsso.indiatimes.com, api.ibeat-analytics.com) as potential candidates

You can see what was blacklisted at my end here: https://mypdns.org/my-privacy-dns/matrix/-/issues/3592 This includes the blacklisting of geoapi.indiatimes.com:

geoapi.indiatimes.com CNAME . ; Tracking

drill geoapi.indiatimes.com
;; ->>HEADER<<- opcode: QUERY, rcode: NXDOMAIN, id: 2578
;; flags: qr rd ra ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 
;; QUESTION SECTION:
;; geoapi.indiatimes.com.       IN      A

;; ANSWER SECTION:

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 1 msec
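
In an RPZ zone, a `CNAME .` record makes the resolver answer NXDOMAIN, which matches the rcode in the drill output above. A small Python sketch that pulls the rcode out of drill-style output, assuming the header line has the format shown here. The regular expression and the function name are illustrative and not part of any tool:

```python
import re

# Matches the ";; ->>HEADER<<- opcode: ..., rcode: ..., id: ..." line.
HEADER_RE = re.compile(
    r"->>HEADER<<-\s+opcode:\s*(?P<opcode>\w+),"
    r"\s*rcode:\s*(?P<rcode>\w+),\s*id:\s*(?P<id>\d+)"
)


def parse_rcode(drill_output):
    """Return the rcode from drill-style output, or None if no header is found."""
    match = HEADER_RE.search(drill_output)
    return match.group("rcode") if match else None


# Header lines copied from the output above.
output = """;; ->>HEADER<<- opcode: QUERY, rcode: NXDOMAIN, id: 2578
;; flags: qr rd ra ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
"""

rcode = parse_rcode(output)
if rcode == "NXDOMAIN":
    print("resolver answered NXDOMAIN (blocked or nonexistent)")
else:
    print("resolver answered:", rcode)
```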

For your firewall response table, I noticed you are using 'REJECT' rather than 'DROP'. For blacklisted items you should always use the 'DROP' statement, as a lot of the spy(track)ware is programmed to use other gateways if it receives a reject.

Conclusion

It seems geoapi.indiatimes.com in particular still causes a lot of issues and tends to be a thorn. I am starting to think this is because they do another location lookup via the IP address, and if your IP is registered to country X, they require this subdomain to be loaded to do something, for legal reasons or with bad intentions, who knows. https://mypdns.org/my-privacy-dns/matrix/-/issues/3592#note-geoapiindiatimescom

I leave it to @anudeepND, as: 1. it is his list; 2. I would personally leave it as is in my own project; 3. I would recommend that individual users whitelist it in case it breaks things for them.

anudeepND commented 2 years ago

@blackice19 Thanks for reporting, TOI is broken due to:

jssocdn.indiatimes.com
jsso.indiatimes.com
geoapi.indiatimes.com

https://jsso.indiatimes.com/sso/identity/login?channel=et is a login page and should not be blocked. geoapi.indiatimes.com is some kind of GeoIP detection I guess. Unblocking these 3 domains will solve the issue. @blackice19 can you please confirm?
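
For anyone maintaining a local copy, unblocking the three domains can be sketched as filtering them out of a hosts-format list. This is a minimal Python sketch under that assumption, not a description of how the list itself was changed. The function name and the sample entries are illustrative:

```python
# The three domains named in the comment above.
ALLOWLIST = {
    "jssocdn.indiatimes.com",
    "jsso.indiatimes.com",
    "geoapi.indiatimes.com",
}


def apply_allowlist(hosts_text, allowlist=ALLOWLIST):
    """Drop hosts-format entries whose hostname is on the allowlist."""
    kept = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        # Skip "<ip> <hostname>" entries for allowlisted hosts;
        # comments, blank lines, and all other entries are kept.
        if len(fields) >= 2 and fields[1].lower() in allowlist:
            continue
        kept.append(line)
    return "\n".join(kept)


# Illustrative sample only; not the contents of the real list.
sample = "0.0.0.0 geoapi.indiatimes.com\n0.0.0.0 agi-static.indiatimes.com"
print(apply_allowlist(sample))
```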

However, while investigating I was able to detect 2 new trackers:

https://agi-static.indiatimes.com/cms-common/ibeat.min.js 
https://ingestion.contentinsights.com (which is owned by https://smartocto.com/)
blackice19 commented 2 years ago

@blackice19 Thanks for reporting, TOI is broken due to:

jssocdn.indiatimes.com
jsso.indiatimes.com
geoapi.indiatimes.com

https://jsso.indiatimes.com/sso/identity/login?channel=et is a login page and should not be blocked. geoapi.indiatimes.com is some kind of GeoIP detection I guess. Unblocking these 3 domains will solve the issue. @blackice19 can you please confirm?

---> For the jsso subdomains, I am not sure. It could well be the case. I do not log in to comment, so I have kept them blocked in my filter list and the site loads. If this is the case, then they need to be unblocked for users who log in. A note to make them aware would be fine, in my opinion.

However, geoapi.indiatimes.com is a show stopper. If blocked, either at firewall level or at the browser level, the website does not load the homepage. I recommend this to be whitelisted then.

Note:- geoapi.indiatimes.com reads the IP to determine location. What the html script returns is: { "CountryCode":"xx","region_code":"xxx","city":"xxx", "Continent":"xxx" }. My understanding is that this may be used to configure the home page, such as for US customers. However, this is an alias and is hosted in the US.
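
The response shape quoted above can be read with a few lines of Python. This is only a sketch of parsing that JSON: the values are the redacted placeholders from the note, the function name is illustrative, and how the site actually uses these fields is not known:

```python
import json

# Response shape as quoted in the note; values are the redacted placeholders.
body = '{ "CountryCode":"xx","region_code":"xxx","city":"xxx", "Continent":"xxx" }'


def parse_geo(body):
    """Return (country, region, city, continent) from a geoapi-style JSON body."""
    data = json.loads(body)
    return (
        data.get("CountryCode"),
        data.get("region_code"),
        data.get("city"),
        data.get("Continent"),
    )


print(parse_geo(body))
```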

From what little I understand of the html code used in timesindia.com, the subdomains I have found so far that are called by it are implemented both in scripts and in links (please see the 2 attached PDFs). I block the URLs and IPs at the DNS level for system-wide blocking. The scripts can be blocked via uBlock in compatible browsers.

However, while investigating I was able to detect 2 new trackers:


https://agi-static.indiatimes.com/cms-common/ibeat.min.js 

--> agi-static.indiatimes.com was blocked previously, and the TOI website loads with it blocked. This is JavaScript and could be blocked via uBlock. I have blocked the domain in my DNS-level firewall, so the domain is never allowed in on any browser or in the TOI app (I have seen this domain called by the app as well sometimes on iOS).

https://ingestion.contentinsights.com (which is owned by https://smartocto.com/) --> I could not find this on the TOI site, but the domain is already blocked in my filter list.


[ToI01.pdf](https://github.com/anudeepND/blacklist/files/7740444/ToI01.pdf)
[ToI02.pdf](https://github.com/anudeepND/blacklist/files/7740445/ToI02.pdf)
anudeepND commented 2 years ago

Fixed in 56bee9d7ca1fdc3e01078b00ea74945595967486