cobalt-uoft / uoft-scrapers

Public web scraping scripts for the University of Toronto.
https://pypi.python.org/pypi/uoftscrapers
MIT License

UofT Scrapers

This is a library of scrapers for various University of Toronto websites.

Requirements

Installation

pip install uoftscrapers

Usage

import uoftscrapers

# Example: scrape http://map.utoronto.ca building data to ./some/path
uoftscrapers.Buildings.scrape('./some/path')

# Example: scrape http://coursefinder.utoronto.ca to current working directory
uoftscrapers.Courses.scrape()
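
Once a scraper has finished, its output can be read back with the standard library. The sketch below assumes that each record is written as its own .json file in the target directory; this README does not state the file layout, so treat that as an assumption. `load_records` is a hypothetical helper and not part of uoftscrapers.

```python
import json
from pathlib import Path


def load_records(directory):
    """Load every *.json file in `directory` into a list of dicts.

    Hypothetical helper: assumes one JSON document per file.
    """
    records = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            records.append(json.load(handle))
    return records
```

For example, `load_records('./some/path')` would return the building records scraped above, provided the layout assumption holds.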

Library Reference

Courses

Class name
uoftscrapers.Courses
Scraper source

http://coursefinder.utoronto.ca

Output format

{
  "id": String,
  "code": String,
  "name": String,
  "description": String,
  "division": String,
  "department": String,
  "prerequisites": String,
  "exclusions": String,
  "level": Number,
  "campus": String,
  "term": String,
  "breadths": [Number],
  "meeting_sections": [{
    "code": String,
    "instructors": [String],
    "times": [{
      "day": String,
      "start": Number,
      "end": Number,
      "duration": Number,
      "location": String
    }],
    "size": Number,
    "enrolment": Number
  }]
}
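
As a rough illustration of working with this format, the sketch below lists the meeting sections of a course that still have room. It assumes `size` is the section capacity and `enrolment` the current head count, which the format suggests but does not state. `open_sections` is a hypothetical helper, and the sample values are invented.

```python
def open_sections(course):
    """Return the codes of meeting sections whose enrolment is below size."""
    return [
        section["code"]
        for section in course.get("meeting_sections", [])
        if section["enrolment"] < section["size"]
    ]


# Invented sample record; only the keys follow the output format.
course = {
    "code": "ABC100H1",
    "meeting_sections": [
        {"code": "L0101", "instructors": [], "times": [], "size": 100, "enrolment": 100},
        {"code": "T0101", "instructors": [], "times": [], "size": 30, "enrolment": 12},
    ],
}

print(open_sections(course))  # → ['T0101']
```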

Buildings

Class name
uoftscrapers.Buildings
Scraper source

http://map.utoronto.ca

Output format
{
  "id": String,
  "code": String,
  "name": String,
  "short_name": String,
  "campus": String,
  "address": {
    "street": String,
    "city": String,
    "province": String,
    "country": String,
    "postal": String
  },
  "lat": Number,
  "lng": Number,
  "polygon": [
    [Number, Number]
  ]
}
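
The `lat` and `lng` fields make simple proximity queries possible. The sketch below finds the building closest to a point using the haversine formula. It assumes both fields are decimal degrees, and it uses a mean Earth radius of 6371 km. `haversine_km` and `nearest_building` are hypothetical helpers, not part of uoftscrapers.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_building(buildings, lat, lng):
    """Return the building record closest to (lat, lng)."""
    return min(
        buildings,
        key=lambda b: haversine_km(lat, lng, b["lat"], b["lng"]),
    )
```

A straight-line distance is enough for ranking nearby buildings; it says nothing about walking routes.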

Textbooks

Class name
uoftscrapers.Textbooks
Scraper source

http://uoftbookstore.com

Output format
{
  "id": String,
  "isbn": String,
  "title": String,
  "edition": Number,
  "author": String,
  "image": String,
  "price": Number,
  "url": String,
  "courses":[{
    "id": String,
    "code": String,
    "requirement": String,
    "meeting_sections":[{
      "code": String,
      "instructors": [String]
    }]
  }]
}
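
Because each textbook lists the courses that use it, a reverse index from course code to titles is easy to build. The following is a minimal sketch; `textbooks_by_course` is a hypothetical helper, and it relies only on the `title`, `courses` and `code` keys shown in the format above.

```python
from collections import defaultdict


def textbooks_by_course(textbooks):
    """Map each course code to the titles of the textbooks that list it."""
    index = defaultdict(list)
    for book in textbooks:
        for course in book.get("courses", []):
            index[course["code"]].append(book["title"])
    return dict(index)
```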

Food

Class name
uoftscrapers.Food
Scraper source

http://map.utoronto.ca

Output format
{
  "id": String,
  "building_id": String,
  "name": String,
  "short_name": String,
  "description": String,
  "url": String,
  "tags": [String],
  "image": String,
  "campus": String,
  "lat": Number,
  "lng": Number,
  "address": String,
  "hours": {
    "sunday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "monday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "tuesday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "wednesday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "thursday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "friday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "saturday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    }
  }
}
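
The `hours` object can answer "is this place open right now?". The sketch below assumes that `open` and `close` are seconds since midnight and that `close` is later than `open` on the same day. This README only says the values are Numbers, so both points are assumptions. `is_open` is a hypothetical helper.

```python
def is_open(hours, day, seconds_since_midnight):
    """Return True if the location is open on `day` at the given time.

    `day` is a lowercase weekday name such as "monday".
    Assumes `open`/`close` are seconds since midnight (not confirmed).
    """
    entry = hours[day]
    if entry["closed"]:
        return False
    return entry["open"] <= seconds_since_midnight < entry["close"]
```

Under these assumptions, 09:00 is 32400, so `is_open(record["hours"], "monday", 32400)` checks a Monday morning.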

Calendar

Class name
uoftscrapers.Calendar
Scraper source
Output format

Not implemented.


UTSG Calendar

Class name
uoftscrapers.UTSGCalendar
Scraper source

http://www.artsandscience.utoronto.ca/ofr/calendar/

Output format

Refer to Calendar


UTM Calendar

Class name
uoftscrapers.UTMCalendar
Scraper source

https://student.utm.utoronto.ca/calendar/calendar.pl

Output format

Refer to Calendar


UTSC Calendar

Class name
uoftscrapers.UTSCCalendar
Scraper source

http://www.utsc.utoronto.ca/~registrar/calendars/calendar/index.html

Output format

Refer to Calendar


Timetable

Class name
uoftscrapers.Timetable
Scraper source
Output format
{
  "id": String,
  "code": String,
  "name": String,
  "description": String,
  "division": String,
  "department": String,
  "prerequisites": String,
  "exclusions": String,
  "level": Number,
  "campus": String,
  "term": String,
  "breadths": [Number],
  "meeting_sections": [{
    "code": String,
    "instructors": [String],
    "times": [{
      "day": String,
      "start": Number,
      "end": Number,
      "duration": Number,
      "location": String
    }],
    "size": Number,
    "enrolment": Number
  }]
}

UTSG Timetable

Class name
uoftscrapers.UTSGTimetable
Scraper source

https://timetable.iit.artsci.utoronto.ca

Output format

Refer to Timetable


UTM Timetable

Class name
uoftscrapers.UTMTimetable
Scraper source

https://student.utm.utoronto.ca/timetable

Output format

Refer to Timetable


UTSC Timetable

Class name
uoftscrapers.UTSCTimetable
Scraper source

http://www.utsc.utoronto.ca/~registrar/scheduling/timetable

Output format

Refer to Timetable


Exams

Class name
uoftscrapers.Exams
Scraper source
Output format
{
  "id": String,
  "course_id": String,
  "course_code": String,
  "period": String,
  "date": String,
  "start_time": Number,
  "end_time": Number,
  "duration": Number,
  "sections": [{
    "lecture_code": String,
    "exam_section": String,
    "location": String
  }]
}

UTSG Exams

Class name
uoftscrapers.UTSGExams
Scraper source

http://www.artsci.utoronto.ca/current/exams

Output format

Refer to Exams


UTM Exams

Class name
uoftscrapers.UTMExams
Scraper source

https://student.utm.utoronto.ca/examschedule/finalexams.php

Output format

Refer to Exams


UTSC Exams

Class name
uoftscrapers.UTSCExams
Scraper source

http://www.utsc.utoronto.ca/registrar/examination-schedule

Output format

Refer to Exams


Athletics

Class name
uoftscrapers.Athletics
Scraper source
Output format
{
  "date": String,
  "events":[{
    "title": String,
    "campus": String,
    "location": String,
    "building_id": String,
    "start_time": Number,
    "end_time": Number,
    "duration": Number
  }]
}

UTSG Athletics

Class name
uoftscrapers.UTSGAthletics
Scraper source

Not yet implemented

Output format

Refer to Athletics


UTM Athletics

Class name
uoftscrapers.UTMAthletics
Scraper source

http://www.utm.utoronto.ca/athletics/schedule/month/

Output format

Refer to Athletics


UTSC Athletics

Class name
uoftscrapers.UTSCAthletics
Scraper source

http://www.utsc.utoronto.ca/athletics/calendar-node-field-date-time/month/

Output format

Refer to Athletics


Parking

Class name
uoftscrapers.Parking
Scraper source

http://map.utoronto.ca

Output format
{
  "id": String,
  "title": String,
  "building_id": String,
  "campus": String,
  "type": String,
  "description": String,
  "lat": Number,
  "lng": Number,
  "address": String
}

Shuttles

Class name
uoftscrapers.Shuttles
Scraper source

https://m.utm.utoronto.ca/shuttle.php

Output format
{
  "date": String,
  "routes": [{
    "id": String,
    "name": String,
    "stops": [{
      "location": String,
      "building_id": String,
      "times": [{
        "time": Number,
        "rush_hour": Boolean,
        "no_overload": Boolean
      }]
    }]
  }]
}
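
The shuttle format nests times inside stops inside routes, so a small traversal helps when only one stop matters. The sketch below collects every departure for a given stop location across all routes of one day. `departures_from` is a hypothetical helper, and the unit of `time` is left uninterpreted because this README does not define it.

```python
def departures_from(day, location):
    """Return sorted (time, route name) pairs for one stop location."""
    found = []
    for route in day["routes"]:
        for stop in route["stops"]:
            if stop["location"] == location:
                for entry in stop["times"]:
                    found.append((entry["time"], route["name"]))
    return sorted(found)
```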

Events

Class name
uoftscrapers.Events
Scraper source

https://www.events.utoronto.ca/

Output format
{
  "id": String,
  "title": String,
  "start_date": String,
  "end_date": String,
  "start_time": Number,
  "end_time": Number,
  "duration": Number,
  "url": String,
  "description": String,
  "admission_price": String,
  "campus": String,
  "location": String,
  "audiences": [String]
}

Libraries

Class name
uoftscrapers.Libraries
Scraper source

https://onesearch.library.utoronto.ca/

Output format
{
  "id": String,
  "name": String,
  "image": String,
  "website": String,
  "address": String,
  "phone": String,
  "about": String,
  "collection_strengths": String,
  "access": String,
  "hours": {
    "sunday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "monday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "tuesday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "wednesday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "thursday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "friday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    },
    "saturday": {
      "closed": Boolean,
      "open": Number,
      "close": Number
    }
  }
}

Dates

Class name
uoftscrapers.Dates
Scraper source
Output format
{
  "date": String,
  "events": [{
    "end_date": String,
    "session": String,
    "campus": String,
    "description": String
  }]
}

UTSG Dates

Class name
uoftscrapers.UTSGDates
Scraper source

http://www.artsci.utoronto.ca/current/course/timetable/
http://www.undergrad.engineering.utoronto.ca/About/Dates_Deadlines.htm

Output format

Refer to Dates


UTM Dates

Class name
uoftscrapers.UTMDates
Scraper source

http://m.utm.utoronto.ca/importantDates.php

Output format

Refer to Dates