Using the compare tool, will create a dummy based on the websites that are present in a large number of sites. The size will depend on the number of collections: so far, with 38 collections, it looks like URLs with 16+ cross-links will produce about 100 or so website URLs.
These URLs will be content-analyzed for the following categories:
Government (e.g. gc.ca)
Social media (twitter.com)
Technology sites (adobe.com, bit.ly)
Organizations (w3.org)
Media (wsj.com)
Together with the dummy for the common-website collection, that makes 6 dummy categories.
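A minimal sketch of how the threshold and the six dummies could be coded, assuming the compare tool's output can be loaded as a mapping from collection name to the domains it links to. The function names, the category lookup, and the input shape are all hypothetical and are not part of the compare tool; the lookup holds only the example domains listed above.

```python
from collections import Counter

# Assumed threshold, taken from the note above (16+ cross-links).
MIN_COLLECTIONS = 16

# Hypothetical hand-coded lookup; only the example domains from the list.
CATEGORIES = {
    "gc.ca": "government",
    "twitter.com": "social_media",
    "adobe.com": "technology",
    "bit.ly": "technology",
    "w3.org": "organization",
    "wsj.com": "media",
}


def common_domains(collections, min_collections=MIN_COLLECTIONS):
    """Return the domains linked from at least min_collections collections."""
    counts = Counter()
    for domains in collections.values():
        counts.update(set(domains))  # count each domain once per collection
    return {d for d, n in counts.items() if n >= min_collections}


def dummy_row(domain, common):
    """Build 0/1 dummies: one for the common set, plus one per category."""
    row = {"common": int(domain in common)}
    category = CATEGORIES.get(domain)  # None if the domain is not coded
    for name in sorted(set(CATEGORIES.values())):
        row[name] = int(category == name)
    return row
```

With 38 collections, `common_domains(collections)` would return the roughly 100 widely linked domains, and `dummy_row` gives each domain one indicator for that set plus five category indicators, six in total.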