Closed by panchr 6 years ago
I think we should document it, just not now, because the interface may change quickly as we start developing actual datasets that implement it. Specifically, the interface is relatively abstract, so we won't know whether it is suitable until something actually implements it.
Once the courses dataset is implemented, we'll have a better idea of how adequate the interface and the current resources are for developing datasets. After that, we can add documentation.
The dataset interface defines how the main entrypoint into a dataset should look. Right now, this is in the form of a defined list of tasks, which are functions that are executed on a schedule by Celery.
Generally, the main task will just be a simple function that updates the existing data. However, the notion of general tasks allows for flexibility in the event that different types of data require different periodic maintenance. We allow the dataset itself to declare the schedule on which its tasks should be executed so that varying needs can be met; for example, courses may only need to be updated daily, but per-course enrollments may need to be updated more frequently.
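To make the idea concrete, here is a rough sketch of what a dataset declaring its own tasks and schedules could look like. It is not the actual interface: `Task`, `CoursesDataset`, `build_schedule`, the task names, and the 15-minute interval are all hypothetical, and the sketch uses only the standard library rather than Celery itself. The dictionary it builds only mirrors the general shape of a Celery beat schedule entry (a `task` name plus a `schedule`).

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Task:
    """A periodic job declared by a dataset (hypothetical type)."""
    name: str                 # unique task name
    run: Callable[[], None]   # function executed on each tick
    every: timedelta          # how often the task should run


class CoursesDataset:
    """Hypothetical dataset; the real courses dataset may look different."""

    def __init__(self) -> None:
        self.log: List[str] = []  # records which updates have run

    def update_courses(self) -> None:
        self.log.append("courses")

    def update_enrollments(self) -> None:
        self.log.append("enrollments")

    def tasks(self) -> List[Task]:
        # The dataset itself declares what runs and how often.
        return [
            Task("courses.update", self.update_courses, timedelta(days=1)),
            Task("courses.enrollments", self.update_enrollments,
                 timedelta(minutes=15)),  # assumed interval, for illustration
        ]


def build_schedule(tasks: List[Task]) -> Dict[str, dict]:
    """Collect declared tasks into a mapping keyed by task name."""
    return {t.name: {"task": t.name, "schedule": t.every} for t in tasks}
```

With something like this, the scheduler side only needs to call `build_schedule(dataset.tasks())` for each dataset, and a dataset with different maintenance needs just declares more tasks with different intervals.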