Webdevdata / webdevdata.org

Website for reports, etc.

Collect more deep pages, not just front pages #11

Open zcorpan opened 11 years ago

zcorpan commented 11 years ago

Biasing the data towards front pages makes it less useful than it could be, because front pages often differ from deep pages.

I realize this would balloon the size of the data, but it has been done before (e.g. http://dotnetdotcom.org/).

stevefaulkner commented 11 years ago

We have Yoav's script now; it just takes someone to do the work of gathering the data and making it available.

marcoscaceres commented 11 years ago

I think if we are going to go to this level, we need actual infrastructure to support it (i.e., we should talk to the W3C about funding, hosting, etc.).

marcoscaceres commented 11 years ago

Proposed it to the W3C: http://lists.w3.org/Archives/Public/public-closingthegap/2013May/0050.html

yoavweiss commented 11 years ago

Awesome!!!

On May 17, 2013 11:23 AM, "Marcos Caceres" wrote:

Proposed it to the W3C: http://lists.w3.org/Archives/Public/public-closingthegap/2013May/0050.html
