opendataam / opendatam-tasks

Public tasks for volunteers, hackathons and contests
Creative Commons Zero v1.0 Universal

[EN] Armenian Folklore from USC Digital Folklore Archives #27

Open vvbabayan opened 1 year ago

vvbabayan commented 1 year ago

Goal

The goal is to create a dataset with all Armenian-related subjects in the USC Digital Folklore Archives.

Tasks

You should collect entries from the http://folklore.usc.edu website, including the author, date, and tags for each entry, preferably with categories indicated in some way (for example, as subheadings). Please save the collected data in a machine-readable format such as JSON or CSV. Upload the files to any temporary public storage and provide a link so that they can be transferred to permanent storage.
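As a rough illustration of the output side of this task, here is a minimal sketch of one way to hold a collected entry and serialize it to JSON and CSV. The `FolkloreEntry` class, its field names, and the `to_json` / `to_csv` helpers are assumptions made for the example; they are not a schema published by the archive or required by this task.

```python
import csv
import io
import json
from dataclasses import dataclass, asdict, field


@dataclass
class FolkloreEntry:
    # Hypothetical record layout: field names are assumptions, not a site schema.
    url: str
    title: str
    author: str
    date: str
    tags: list = field(default_factory=list)
    categories: list = field(default_factory=list)


def to_json(entries):
    """Serialize entries to a JSON array, keeping non-ASCII text readable."""
    return json.dumps([asdict(e) for e in entries], ensure_ascii=False, indent=2)


def to_csv(entries):
    """Serialize entries to CSV; list fields are joined with '; '."""
    buf = io.StringIO()
    fieldnames = ["url", "title", "author", "date", "tags", "categories"]
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    for e in entries:
        row = asdict(e)
        row["tags"] = "; ".join(e.tags)
        row["categories"] = "; ".join(e.categories)
        writer.writerow(row)
    return buf.getvalue()
```

JSON keeps tags and categories as real lists, while CSV has to flatten them into one cell, so JSON is probably the safer primary format if both are not needed.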

Context

USC Digital Folklore Archives is a database of folklore performances. Armenian-related topics can be found at http://folklore.usc.edu/search_gcse/?q=armenian.

Requirements

Wishes

Please write your code so that it is reusable and can be run by someone else, since we may need to update this dataset later.
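One possible way to make a collection script easy to rerun is to expose its settings as command-line options instead of hard-coding them. The sketch below only shows that entry point; the option names (`--query`, `--output`, `--format`) and the default file name are hypothetical, and the actual collection step is deliberately left out.

```python
import argparse


def build_parser():
    """Build the command-line interface; all option names are illustrative."""
    parser = argparse.ArgumentParser(
        description="Collect archive entries matching a search query into a dataset file."
    )
    parser.add_argument("--query", default="armenian", help="search term to collect")
    parser.add_argument("--output", default="armenian_folklore.json", help="output file path")
    parser.add_argument("--format", choices=["json", "csv"], default="json", help="output format")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # The collection and saving steps would go here; this sketch only reports the plan.
    print(f"Would collect '{args.query}' into {args.output} as {args.format}")
    return args
```

For example, `main(["--format", "csv", "--output", "out.csv"])` would describe a CSV run, and a later update of the dataset would only need the same command again.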

Resources

Prepared by

This task was prepared by the Open Data Armenia team.

MunGell commented 1 year ago

A word of warning: not all articles on this page are related to Armenia (example).

The website uses Google to search its content, and the results sometimes include unrelated articles because of how the pages were rendered for the Google bot.
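A simple first pass at catching such false positives could be a keyword check on each collected entry. This is only a heuristic sketch: the `is_armenian_related` and `split_by_relevance` functions and the dictionary keys (`title`, `text`, `tags`) are assumptions for illustration, and entries that fail the check are set aside for manual review rather than discarded, since a relevant entry might never use the word itself.

```python
import re

# Matches "Armenia", "Armenian", "Armenians" as whole words, case-insensitively.
ARMENIA_PATTERN = re.compile(r"\barmenia(n|ns)?\b", re.IGNORECASE)


def is_armenian_related(entry):
    """Return True if the title, text, or any tag mentions Armenia.

    `entry` is a dict with optional 'title', 'text', and 'tags' keys
    (a hypothetical layout, not the archive's own format).
    """
    haystack = [entry.get("title", ""), entry.get("text", "")]
    haystack += list(entry.get("tags", []))
    return any(ARMENIA_PATTERN.search(s) for s in haystack)


def split_by_relevance(entries):
    """Split entries into (likely relevant, needs manual review)."""
    keep, review = [], []
    for e in entries:
        (keep if is_armenian_related(e) else review).append(e)
    return keep, review
```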

vvbabayan commented 1 year ago

@MunGell Thanks for speaking up! We make sure to validate the generated data, but things like this are always worth keeping in mind.