18F / pulse

How the federal .gov domain space is doing at best practices and policies.

Create a one-time DAP report at the subdomain level #682

gbinal closed 7 years ago

konklone commented 7 years ago

Note that excluding redirects will be more important here - and you may want to also exclude sites which return a non-200 status code altogether.

gbinal commented 7 years ago

Steps:
1) Download the 2016 end of term export (direct download), 12-31-16 censys export (direct download), 12-31-16 Digital Analytics Program export (direct download), and 12-31-16 HTTPS report export (direct download).
2) Combine them all. (93430 results)
3) Remove the inactive domains. (39908 results)
4) Remove the redirecting domains. (34961 results)
5) Dedup the list. (25377 results)
6) Add Agency and Branch columns. (25377 results)
7) Remove other branches. (22053 results)
8) Compare against the DAP participants list. (22053 results)
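
For anyone repeating this, steps 2 through 8 could be sketched roughly as below. This is only an illustration, not the script that produced the report: the column names (`domain`, `live`, `redirect`), the `base_domain` helper, the shape of `agency_lookup`, and the choice to keep only the "Executive" branch in step 7 are all assumptions, since the actual export formats are not described in this thread.

```python
def base_domain(hostname):
    """Naive parent-domain guess: keep the last two labels (assumes plain .gov names)."""
    return ".".join(hostname.lower().split(".")[-2:])


def build_report(exports, agency_lookup, dap_participants, keep_branch="Executive"):
    """Sketch of steps 2-8. Every column name here is hypothetical.

    exports          -- list of exports, each a list of dicts with
                        'domain', 'live' and 'redirect' keys
    agency_lookup    -- dict: base domain -> (agency, branch)
    dap_participants -- set of lowercase hostnames already reporting to DAP
    """
    # 2) Combine all exports into one list.
    rows = [row for export in exports for row in export]

    # 3) Remove the inactive domains.
    rows = [r for r in rows if r["live"]]

    # 4) Remove the redirecting domains.
    rows = [r for r in rows if not r["redirect"]]

    # 5) Dedup on the lowercased hostname, keeping the first occurrence.
    unique = {}
    for r in rows:
        unique.setdefault(r["domain"].lower(), r)

    report = []
    for host in unique:
        # 6) Add Agency and Branch columns from the parent domain.
        agency, branch = agency_lookup.get(base_domain(host), ("Unknown", "Unknown"))

        # 7) Remove other branches (assumed here to mean: keep one branch only).
        if branch != keep_branch:
            continue

        # 8) Compare against the DAP participants list.
        report.append({
            "domain": host,
            "agency": agency,
            "branch": branch,
            "in_dap": host in dap_participants,
        })
    return report
```

Per the note above about non-200 responses, the `live` flag would ideally also be false for hosts that answer with an error status, not just for hosts that fail to respond.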

gbinal commented 7 years ago

Done - https://github.com/18F/g-analytics/tree/18f-pages/projects/dap-subdomain-report