cobalt-uoft / uoft-scrapers

Public web scraping scripts for the University of Toronto.
https://pypi.python.org/pypi/uoftscrapers
MIT License

Events scraper shows `start_date` from 2015 as 2016 #73

Closed: qasim closed this issue 8 years ago

qasim commented 8 years ago

Looking at cobalt-uoft/datasets/events.json#L9:

{
  "id":"12052",
  "title":"S.I.S.T.E.R.S. Inaugural Discussion Session on Modesty",
  "start_date":"2016-10-28",
  "end_date":"2016-04-28",
  "start_time":66600,
  "end_time":70200,
  "duration":3600,
  "url":"https://www.facebook.com/events/1482694365393455/",
  "description":"The Multi-Faith Centre for Spiritual Study and Practice and the Centre for Women and Trans People are proud to announce the beginning of a collaborative project called \"S.I.S.T.E.R.S.: Sisters in Spirit Together Engaging in Religious Support\". This collaboration will provide a safe, positive, inclusive and anti-oppressive space for self-identified women to discuss their religious and spiritual experiences. Please note we welcome people from all faiths and spiritual experiences (e.g. Humanists, atheists, and more). S.I.S.T.E.R.S. will meet weekly on Wednesdays from 6:30-7:30pm. The first session will occur on October 28th at 6:30pm. Join us in the Meditation Room in the Multifaith Centre (room 215 in the Koffler House) for free snacks and a fascinating discussion on \"Modesty\". Please note the snacks will be kosher.  If you have any dietary restrictions or accessibility requirements please let us know at multi.faith@utoronto.ca.",
  "admission_price":"Please contact the event contact person.",
  "campus":"UTSG",
  "location":"St. George  Multi-Faith Centre  569 Spadina Avenue",
  "audiences":[
    "Prospective Students",
    "Current Students",
    "Faculty  &  Staff",
    "Alumni/Friends",
    "Undergraduates",
    "Graduate Students",
    "Orientation",
    "First Year Students",
    "Community",
    "Graduating Students"
  ]
}

It shows the start_date as 2016-10-28 when it should be 2015-10-28. The scraper should capture the year along with the rest of the date.

kashav commented 8 years ago

Just took a look at the event page (it might be a good idea to add an event_url key with this URL). It looks like they don't even provide the year.

qasim commented 8 years ago

@kshvmdn interesting. They happen to provide it on the index but not the event page itself:

[Screenshot, 2016-04-28 at 12:20:09 AM]

qasim commented 8 years ago

Darn, this may be harder than I thought. The scraper only grabs links from the main page (which is reasonable; UofT should really provide the full dates on the event pages...). Fixing this would require a sizeable rewrite.
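
One possible shape for that rewrite, as a rough sketch only: record the year shown next to each link on the index, then pass it along when parsing the year-less date on the event page. The function names, the `(url, year)` entry shape, the placeholder URL, and the "October 28" text format are all assumptions for illustration, not the scraper's actual code or the site's actual markup.

```python
from datetime import datetime


def build_year_lookup(index_entries):
    """Map each event URL to the year listed beside it on the index page.

    `index_entries` is a hypothetical list of (url, year) pairs that the
    index-scraping step would have to produce.
    """
    return {url: year for url, year in index_entries}


def parse_event_date(month_day_text, year):
    """Parse a year-less date such as 'October 28' using an explicit year."""
    # The year is included in the parsed string so that dates like
    # February 29 are validated against the correct year.
    parsed = datetime.strptime(f"{month_day_text} {year}", "%B %d %Y")
    return parsed.date().isoformat()


# Hypothetical data standing in for what the index page would yield.
event_url = "https://example.invalid/events/12052"
years = build_year_lookup([(event_url, 2015)])

print(parse_event_date("October 28", years[event_url]))  # 2015-10-28
```

The point of the sketch is only that the year becomes an explicit input taken from the index, rather than something filled in at parse time.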

g3wanghc commented 8 years ago

Oops, good catch. I just assumed the year to be the current year.
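
For reference, a minimal sketch of how that assumption reproduces the value in the issue. `parse_with_assumed_year` is a hypothetical stand-in, not the scraper's real function, and both the "October 28" input format and the 2016-04-28 scrape date are assumptions.

```python
from datetime import date, datetime


def parse_with_assumed_year(month_day_text, today):
    """Parse a year-less date by assuming it falls in today's year."""
    parsed = datetime.strptime(f"{month_day_text} {today.year}", "%B %d %Y")
    return parsed.date().isoformat()


# An event held in October 2015, scraped in April 2016 (assumed scrape date).
print(parse_with_assumed_year("October 28", date(2016, 4, 28)))  # 2016-10-28
```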