outbreak-info / outbreak.info-resources

A curated repository of metadata of resources on COVID-19 and SARS-CoV-2
MIT License

[PUBLICATION, ANALYSIS] Write Imperial College Parser #98

Closed: flaneuse closed this issue 3 years ago

flaneuse commented 4 years ago
  1. Crawl Imperial College website to get all COVID-related resources.
  2. Classify each resource as a @type: Publication, Dataset, Analysis, Protocol, SoftwareSourceCode, ImageObject, MediaObject
| Imperial Category | URL | outbreak @type |
| --- | --- | --- |
| Reports | http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-reports/ | Publication |
| Publications | http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-publications/ | DO NOT ADD (covered in "Reports") |
| Planning Tools | http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-planning-tools/ | Analysis |
| "Code & Data" | http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-scientific-resources/ | Dataset, SoftwareSourceCode (2 entries) |
| "Data" | http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-scientific-resources/ | Dataset |
| "Population survey" | http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-scientific-resources/ | Protocol |
| Video updates | http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-video-updates/ | DO NOT ADD |
  3. Map each property to the outbreak schema.

  4. Tag each record with curatedBy:

    curatedBy: {'@type': 'Organization',
    'identifier': 'imperialcollege',
    'url': 'http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-23-united-states/',  // link to original metadata page on site
    'name': 'Imperial College London',
    'updatedDate': <date accessed, in YYYY-MM-DD>}
  5. If possible, add a topicCategory for each resource.

  6. Deploy to Biothings Hub to add to api.outbreak.info/resources.
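
A rough sketch of how the classification and curatedBy steps might fit together in the parser. This is only an illustration, not the actual implementation: the names `CATEGORY_TYPES`, `curated_by` and `build_records` are hypothetical, and the top-level `url` field on each record is an assumption about the final schema mapping. Categories marked "DO NOT ADD" map to an empty list so they produce no records.

```python
from datetime import date

# Imperial category -> outbreak @type values, copied from the category table.
# "DO NOT ADD" categories map to an empty list so they are skipped.
CATEGORY_TYPES = {
    "Reports": ["Publication"],
    "Publications": [],  # covered in "Reports"
    "Planning Tools": ["Analysis"],
    "Code & Data": ["Dataset", "SoftwareSourceCode"],  # 2 entries
    "Data": ["Dataset"],
    "Population survey": ["Protocol"],
    "Video updates": [],
}


def curated_by(page_url, accessed=None):
    """Build the curatedBy block for one resource page (hypothetical helper)."""
    accessed = accessed or date.today()
    return {
        "@type": "Organization",
        "identifier": "imperialcollege",
        "url": page_url,  # link to original metadata page on site
        "name": "Imperial College London",
        "updatedDate": accessed.isoformat(),  # date accessed, YYYY-MM-DD
    }


def build_records(category, page_url, accessed=None):
    """Return one skeleton record per @type for a crawled page.

    Unknown categories are treated like "DO NOT ADD" and yield no records.
    The top-level "url" field is an assumption, not a confirmed schema field.
    """
    return [
        {"@type": rtype, "url": page_url, "curatedBy": curated_by(page_url, accessed)}
        for rtype in CATEGORY_TYPES.get(category, [])
    ]
```

For example, `build_records("Code & Data", page_url)` would give two skeleton records for the same page (one Dataset, one SoftwareSourceCode), each still needing the remaining properties mapped and a topicCategory added where possible.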

gtsueng commented 4 years ago

https://github.com/gtsueng/covid_imperial_college