andersju / webbkoll

An online tool that checks how a website is doing with regards to privacy
MIT License
266 stars 28 forks

Server address behind CDN not revealed. #22

Open nihilus opened 4 years ago

nihilus commented 4 years ago

I.e., Cloudflare injects content before doing the redirect, so one needs to look at every level of redirect (e.g. HTTP 301) before drawing any conclusions about the location.

andersju commented 4 years ago

Currently, for the estimated server location, the scanner gets the IP address of the "main resource response" returned by Puppeteer's page.goto(): "In case of multiple redirects, the navigation will resolve with the response of the last redirect."

Maybe list the IP address + estimated location of each redirect?
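To make the per-redirect idea concrete, here is a minimal sketch in Python (not the scanner's actual Puppeteer code). It fetches one hop at a time without following redirects and records the peer IP of each hop. The names `fetch_once` and `walk_redirects` are hypothetical, and geolocating each IP is left out.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url, timeout=10.0):
    """Issue one GET without following redirects.

    Returns (status, Location header or None, peer IP address).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        # Connect explicitly and read the peer address first: the socket
        # may already be closed once the response has been received.
        conn.connect()
        peer_ip = conn.sock.getpeername()[0]
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location"), peer_ip
    finally:
        conn.close()


def walk_redirects(url, fetch=fetch_once, max_hops=10):
    """Follow redirects manually and return one record per hop."""
    hops = []
    for _ in range(max_hops):
        status, location, ip = fetch(url)
        hops.append({"url": url, "status": status, "ip": ip})
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Each entry in the returned list could then be geolocated and shown separately, instead of only the last one. The `fetch` parameter exists so the walk can be exercised with a stub instead of real network requests.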

(Somehow this also reminds me that we should also try to deal with CNAME cloaking...)

nihilus commented 4 years ago

Hmm, the scanner is not even wrong in this case: 'Serverplats: USA — 104.24.126.87' for a domain which is hosted by Loopia. However, it might be that Cloudflare is proxying the requests, which makes it virtually impossible to determine the final location.

nihilus commented 4 years ago

Now I see I get an 'x-loopia-node' header with the actual IP.

andersju commented 4 years ago

Hmm. Two things I might try then:

1) Look for any headers containing IP addresses and show them separately

2) Detect use of at least Cloudflare, and maybe similar services, and have something about that under "Server location". (Possible things to check: HTTP headers (including cookies), IP block owner, TLS cert info, ..?)
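A rough sketch of both ideas, assuming the response headers are already available as a plain dict. The function names are hypothetical and the CDN hints are illustrative, not exhaustive: `cf-ray` and `Server: cloudflare` for Cloudflare, `x-amz-cf-id` for Amazon CloudFront. Only IPv4 is handled, and version strings that happen to look like addresses will produce false positives.

```python
import ipaddress
import re

# Candidate IPv4 literals; each match is validated with ipaddress below.
_IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def ips_in_headers(headers):
    """Return {lowercased header name: [IPv4 addresses found in its value]}."""
    found = {}
    for name, value in headers.items():
        ips = []
        for candidate in _IPV4_CANDIDATE.findall(value):
            try:
                ips.append(str(ipaddress.ip_address(candidate)))
            except ValueError:
                continue  # e.g. 999.1.1.1 is not a valid address
        if ips:
            found[name.lower()] = ips
    return found


def detect_cdn(headers):
    """Return a sorted list of CDN names hinted at by the headers."""
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    hits = set()
    if "cf-ray" in lowered or lowered.get("server") == "cloudflare":
        hits.add("Cloudflare")
    if "x-amz-cf-id" in lowered:
        hits.add("Amazon CloudFront")
    return sorted(hits)


# Example input: 192.0.2.10 is a documentation placeholder, not the
# address reported in this thread.
example = {
    "Server": "cloudflare",
    "CF-RAY": "0000000000000000-ARN",
    "x-loopia-node": "192.0.2.10",
}
header_ips = ips_in_headers(example)
cdns = detect_cdn(example)
```

With the example above, `header_ips` maps `x-loopia-node` to the placeholder address and `cdns` contains only Cloudflare. Cookies, IP block ownership, and TLS certificate details are not covered here.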

nihilus commented 4 years ago

@andersju Yes, detecting the use of a CDN provider would be neat. For Cloudflare it is easy to check whether the domain's name servers (NS records) are in *.ns.cloudflare.com.
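The name-server check could look roughly like this, assuming the NS records have already been resolved by some other means (the DNS lookup itself needs a resolver library and is not shown). The function name is hypothetical. A match is only a hint: a domain can use Cloudflare's DNS without its proxy, and the reverse is possible too.

```python
CLOUDFLARE_NS_SUFFIX = ".ns.cloudflare.com"


def uses_cloudflare_dns(ns_hosts):
    """Return True if any name server is under ns.cloudflare.com.

    ns_hosts is an iterable of NS record targets, e.g. as returned by a
    DNS resolver; trailing dots and letter case are normalised first.
    """
    for host in ns_hosts:
        normalised = host.strip().rstrip(".").lower()
        if normalised.endswith(CLOUDFLARE_NS_SUFFIX):
            return True
    return False
```

For example, `uses_cloudflare_dns(["ada.ns.cloudflare.com."])` is true, while a look-alike such as `ns.cloudflare.com.evil.example` does not match because only the suffix is compared.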

Besides that, I find webbkoll.dataskydd.net working as expected (albeit with some slight layout issues with Firefox's Dark Reader extension).