web-archive-group / WALK

Web Archives for Longitudinal Knowledge

Create dummy collections for testing #44

Closed greebie closed 8 years ago

greebie commented 8 years ago

Using the compare tool, I will create a dummy collection based on the websites that are present in a large number of collections. The size will depend on the number of collections: so far, with 38 collections, it looks like URLs with 16+ cross-links will produce about 100 website URLs.
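A minimal sketch of that selection step, assuming "cross-links" means the number of collections a URL appears in. The function name `select_shared_urls` and the dict-of-lists input shape are hypothetical, chosen for illustration; this is not the compare tool's actual interface.

```python
from collections import Counter


def select_shared_urls(collections, min_collections=16):
    """Return URLs that appear in at least `min_collections` collections.

    `collections` maps a collection name to an iterable of URLs.
    Both the name and the input shape are assumptions for this sketch.
    """
    counts = Counter()
    for urls in collections.values():
        # Count each URL once per collection, even if it is listed repeatedly.
        counts.update(set(urls))
    return sorted(url for url, n in counts.items() if n >= min_collections)


# Toy data: three collections, threshold lowered to 2 for the example.
toy = {
    "a": ["x.ca", "y.ca"],
    "b": ["x.ca"],
    "c": ["x.ca", "y.ca", "y.ca"],
}
dummy = select_shared_urls(toy, min_collections=2)
```

Raising or lowering `min_collections` is what would control the size of the dummy collection, so the threshold would need retuning as more collections are added.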

These URLs will be content-analyzed for the following categories

With the dummy collection, that's 6 dummy categories.

ianmilligan1 commented 8 years ago

I think we've developed this – if so, @greebie do you mind closing the issue (or providing an update if you want to keep it open!).