appsembler / figures

Reporting and data retrieval app for Open edX
MIT License
44 stars 37 forks source link

Enable daily metrics models backfill of historical data in platform, but not progress #452

Closed johnbaldwin closed 2 years ago

johnbaldwin commented 2 years ago

https://appsembler.atlassian.net/browse/BLACK-2375

Porting ad-hoc scripts into Figures to fill approximate metrics for historical values

There should be one more commit, modify the existing Django management command, backfill_figures_daily_metrics and add test coverage for that command

Commit 8aeced479a0949eef179c24ed17b5db2707e8d46

Add method - figures.course.Course.first_enrollment_timestamp

This exists to help identify when we should include a course in performing backfill to CourseDailyMetrics for a given date in the past.

Getting the CourseEnrollment.created value for the first enrollment provides a fast "close enough best guess" when a course was first created, as we cannot rely on CourseOverview timestamps and this avoids diving into MongoDB or the filesystem to dredge up timestamps for the given course

Commit 7146aa49153ee7ecf69be1eba09ea6246ba296f1

Commit message has the important details, but here's a brief summary:

This one fixes a latent bug in an existing test case that appeared because the test case did not explicitly set course enrollment 'created' dates for the CourseEnrollmentFactory instantiations. After we added new CourseEnrollmentFactory created instances, the created date got bumped past the date check.

Commit 8aeced479a0949eef179c24ed17b5db2707e8d46

New functions to call backfill for a site and date and track the backfill to a log file

Commit 0fa88808f6e166f10016a61959aaf66270898cc4

Add new test module, remove test class from old test module

johnbaldwin commented 2 years ago

Interesting. Apparently the new test, test_first_enrollment_timestamp_with_enrollments has a side effect causing the following test failure in a different test module, and only when at least both test modules are executed.

The new code in this PR appears to cause the following to fail in multisite environments.

tests/pipeline/test_course_daily_metrics.py ......F............s

 =================================== FAILURES ===================================
___ TestCourseDailyMetricsPipelineFunctions.test_get_num_learners_completed ____

self = <tests.pipeline.test_course_daily_metrics.TestCourseDailyMetricsPipelineFunctions object at 0x7f443a6d7e90>

    def test_get_num_learners_completed(self):
        actual = pipeline_cdm.get_num_learners_completed(
            course_id=self.course_overview.id,
            date_for=self.today)
>       assert actual == len(self.generated_certificates)
E       assert 2 == 3
johnbaldwin commented 2 years ago

@OmarIthawi @melvinsoft @shadinaif @estherjsuh @bryanlandia This is ready for review

johnbaldwin commented 2 years ago

@OmarIthawi @melvinsoft @shadinaif @estherjsuh @bryanlandia Following up, would someone please review?

johnbaldwin commented 2 years ago

Thanks for reviewing @bryanlandia!