Open cameron-toy opened 4 years ago
Had a conversation with the data team yesterday and we will be storing the data in the database through SQLAlchemy object mapper classes. We ran through an example for storing an `AudioSampleMetaData` object which can be seen here: https://github.com/calpoly-csai/api/pull/35
Our use case is very similar to what was done in this PR, so I imagine we will be building JSON representations of each scraped object (`Course`, `Club`, ...) so it can be mapped to its respective SQLAlchemy entity (`Courses` source code).
The logging solution proposed in this issue would integrate well with how we are going to store the data. We could build the wrapped object as follows, then save the `List[Course]` in bulk through one API call:
CoursesData {
errors: List[Error]
timestamp: Timestamp
data: List[Course]
}
Currently, each module outputs a CSV string with just the data. Making that data one field of a JSON string, with errors, timestamps, and other metadata in the other fields, would allow for better logging and error handling.
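As a rough sketch of what that wrapping could look like, here is one way to turn a module's existing CSV output into the `CoursesData` shape above. Everything here is an assumption for illustration: `ScrapeResult`, `wrap_csv`, the `required` column list, and the use of plain strings for errors (the proposal uses `List[Error]`) are hypothetical names and choices, not part of the existing scraper or API code.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class ScrapeResult:
    """Hypothetical wrapper mirroring the CoursesData shape (errors, timestamp, data)."""

    data: List[Dict[str, str]] = field(default_factory=list)
    # Assumption: errors are plain strings here; a real Error type could replace them.
    errors: List[str] = field(default_factory=list)
    # UTC timestamp in ISO 8601 format, set when the wrapper is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize the whole wrapper (data plus metadata) to one JSON string."""
        return json.dumps(asdict(self))


def wrap_csv(csv_text: str, required: List[str]) -> ScrapeResult:
    """Parse a module's CSV output, collecting bad rows as errors instead of failing."""
    result = ScrapeResult()
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # A column counts as missing if it is absent or empty in this row.
        missing = [name for name in required if not row.get(name)]
        if missing:
            result.errors.append(
                f"line {reader.line_num}: missing {', '.join(missing)}"
            )
        else:
            result.data.append(dict(row))
    return result
```

With something like this, a scraper module could call `wrap_csv(output, ["name", "units"]).to_json()` and send the result in a single request, and the receiving side could log `errors` and `timestamp` before mapping each entry in `data` to its SQLAlchemy entity.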