icssc / peterportal-api-next

API that provides easy access to public data from UC Irvine. Developed for Anteaters, by Anteaters.
https://docs.icssc.club/anteaterapi
MIT License

feat: implement WebSoc scraping and RDS-backed endpoint #21

Closed. ecxyzzy closed 1 year ago

ecxyzzy commented 1 year ago

Summary

Switch to using our Amazon RDS (managed relational database) instance for the WebSoc cache. This will allow us to cache all of WebSoc and serve arbitrary requests with it, rather than the limited set of queries we currently support.
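To illustrate what "serving arbitrary requests" from a full cache could look like, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the `CachedSection` fields, the `SectionQuery` shape, and the `querySections` helper are hypothetical and are not taken from this PR's schema or code. In the RDS-backed version the same filtering would presumably be expressed as a database query rather than an array filter.

```typescript
// Hypothetical shape of one cached WebSoc section row.
// Field names are illustrative assumptions, not the project's actual schema.
interface CachedSection {
  year: string;
  quarter: string;
  department: string;
  courseNumber: string;
  sectionCode: string;
  instructors: string[];
}

// Any subset of filters may be supplied; an omitted filter matches every row.
interface SectionQuery {
  year?: string;
  quarter?: string;
  department?: string;
  courseNumber?: string;
  sectionCode?: string;
  instructor?: string;
}

// Fields that are compared by exact string equality.
const EXACT_KEYS = ["year", "quarter", "department", "courseNumber", "sectionCode"] as const;

// Return every cached row that satisfies all supplied filters.
function querySections(cache: CachedSection[], query: SectionQuery): CachedSection[] {
  return cache.filter((row) => {
    const exactMatch = EXACT_KEYS.every(
      (key) => query[key] === undefined || row[key] === query[key],
    );
    const instructorMatch =
      query.instructor === undefined ||
      row.instructors.some((name) => name === query.instructor);
    return exactMatch && instructorMatch;
  });
}
```

Because the whole data set is cached, any combination of filters can be answered from the cache, instead of only the combinations that were cached ahead of time.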

Special thanks to @MinhxNguyen7 for helping with brainstorming and testing for this feature.

TODO:

Issues

Closes #11.

Future Followup

The scraper can serve as a basis for getting enrollment data for that endpoint.

ecxyzzy commented 1 year ago

@bevm0 Blazingly fast :rocket: [image]

Filtering still seems to be mildly broken. I also still need to combine the scraping and processing logic into a single coherent script, and I need to find some way to deploy this...

MinhxNguyen7 commented 1 year ago

How much faster is that?

ecxyzzy commented 1 year ago

@MinhxNguyen7 this is approximately on par with the DynamoDB-backed partial cache (and anywhere from 2-10× faster than querying WebSoc directly), whether or not Lambda cold start times are counted. However, we can now cache everything, which should lower the expected response time overall. IMO this is strictly an improvement over the old solution.
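The "expected response time" argument can be made concrete with a small weighted-average sketch. It assumes that a request the cache cannot answer falls back to querying WebSoc directly. The function name, the hit rates, and the millisecond figures below are hypothetical and were not measured in this thread; only the 2-10× ratio comes from the comment above.

```typescript
// Expected response time as a weighted average of cache hits and misses.
// hitRate is the fraction of requests answered from the cache (0 to 1).
function expectedLatencyMs(hitRate: number, hitMs: number, missMs: number): number {
  if (hitRate < 0 || hitRate > 1) {
    throw new RangeError("hitRate must be between 0 and 1");
  }
  return hitRate * hitMs + (1 - hitRate) * missMs;
}

// Illustrative numbers only: a cache hit takes 100 ms and a direct WebSoc
// query takes 500 ms (5x slower, inside the quoted 2-10x range).
const partialCacheMs = expectedLatencyMs(0.5, 100, 500); // half of requests hit
const fullCacheMs = expectedLatencyMs(1, 100, 500); // every request hits
```

With these made-up inputs the partial cache averages 300 ms and the full cache averages 100 ms: per-hit speed is unchanged, but the average drops because no request pays the miss cost.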

ecxyzzy commented 1 year ago

Scraper is up and running, and the cache should be fully repopulated in around 30 minutes.