commonsearch / cosr-participation

Common Search sub-project to make contributing easy and help people get involved.

Need a web-annotation framework #12

Open Sentimentron opened 8 years ago

Sentimentron commented 8 years ago

Many of the issues in cosr-back require some degree of automation, but it's difficult to tell whether what's implemented will work on real-life pages. I propose creating some kind of GUI application that reads WARC files from Common Crawl and lets us mark up the pages, with the results going to a shared repository. Does anyone know of something that does this already?

sylvinus commented 8 years ago

Can you give some examples of use cases for this? What kind of annotations would we make?

Sentimentron commented 8 years ago

Things like publication dates, authors and advertising would be the primary candidates.

On 7 April 2016 at 19:14, Sylvain Zimmer wrote:

> Can you give some examples of use cases for this? What kind of annotations would we make?


sylvinus commented 8 years ago

Oh, OK, understood. So these would be things like CSS selectors that help us parse the pages?
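
To make the idea concrete, here is a minimal sketch of what one annotation record might look like if it pairs a page URL with an annotation kind and a CSS selector. Everything in it is an assumption for illustration: the `Annotation` class, its field names, the `KINDS` set (taken from the examples mentioned in this thread) and the JSON Lines storage format are hypothetical, not something that exists in cosr-back.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical annotation kinds, based on the examples given in this thread.
KINDS = {"publication_date", "author", "advertising"}


@dataclass
class Annotation:
    url: str         # page the annotation applies to
    kind: str        # one of KINDS
    selector: str    # CSS selector pointing at the annotated element
    value: str = ""  # text the annotator extracted, if any

    def __post_init__(self):
        # Reject kinds we have not agreed on, so the shared data stays consistent.
        if self.kind not in KINDS:
            raise ValueError(f"unknown annotation kind: {self.kind!r}")


def to_jsonl(annotations):
    """Serialize annotations as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(a), sort_keys=True) for a in annotations)


def from_jsonl(text):
    """Parse JSON Lines back into Annotation objects."""
    return [Annotation(**json.loads(line)) for line in text.splitlines() if line.strip()]
```

A line-per-record text format like this would probably diff and merge reasonably well in a shared Git repository, though any real schema would need to be agreed on first.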