Nimbus scraping
A collection of utilities for scraping data for Cal Poly's Nimbus voice assistant.
Installation
Install pipenv with pip and run
pipenv install
in the project folder.
Usage
import sustainer
# returns a JSON string of CSV data from all modules
json = sustainer.scrape_all()
# For any module "scraping_module",
from scraping_module import ScrapingModule
s = ScrapingModule()
# returns a CSV string of scraped data
csv = s.scrape()
Supported Data
schedules_scraper.py
- Professor contact information (office, phone, email)
- Class section times (start / end)
- Which professor is teaching which class
- Class locations
- Class types (lab / lec)
club_scraper.py
- Contact information (phone, email, box, contact person)
- Advisor info (name, phone, email)
- Affiliations and types
- Descriptions
course_scraper.py
- Course requirements (coreq, prereq, concurrent, and recommended)
- Units
- Department
- Terms typically offered
calendar_scraper.py
- Academic calendar events and dates
faculty_scraper.py
- Faculty contact information (office, phone, email)
- Professor research interests
Architecture
TODO
- [ ] Add reporting when website format changes are detected
- [x] Add GE areas to course_scraper
- [ ] Create degree program scraper (catalog.calpoly.edu/programsaz/)
- [ ] Integrate more calendars into calendar_scraper
- [x] Add logging
Contributing
Feel free to make pull requests or report issues. For major changes, please open an issue first to discuss. Also, check out onboarding.txt (coming soon)
License
MIT