WikiEducationFoundation / WikiEduDashboard

Wiki Education Foundation's Wikipedia course dashboard system
https://dashboard.wikiedu.org
MIT License

[Data rearchitecture] Implement `new_article` metric based on article course timeslices #6009

Closed gabina closed 3 weeks ago

gabina commented 3 weeks ago

What this PR does

This PR implements the existing new_article metric using timeslice data instead of the revisions table. To do that, it adds a new_article field to the article_course_timeslices table. The new field is populated when the article course timeslice caches are updated from revisions held in RAM. The existing new_article field on the article course is then populated from the new_article field of its timeslices.
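As a rough illustration of the last step, here is a minimal plain-Ruby sketch. The `Timeslice` and `ArticleCourse` structs and the `update_new_article!` method are hypothetical stand-ins, not the dashboard's actual models. The sketch only assumes that an article course counts as new if it is already flagged or if any of its timeslices is flagged.

```ruby
# Hypothetical stand-ins for article_course_timeslices and articles_courses
# records; the real models are ActiveRecord classes with more fields.
Timeslice = Struct.new(:start, :new_article)

ArticleCourse = Struct.new(:new_article, :timeslices) do
  # Assumption: once an article course is known to be new, it stays new;
  # otherwise it becomes new if any of its timeslices is flagged.
  def update_new_article!
    self.new_article = new_article || timeslices.any?(&:new_article)
  end
end

timeslices = [
  Timeslice.new('2024-01-01', false),
  Timeslice.new('2024-01-02', true),
  Timeslice.new('2024-01-03', false)
]

article_course = ArticleCourse.new(false, timeslices)
article_course.update_new_article!
puts article_course.new_article
```

In the actual PR the aggregation presumably runs as a database query over the timeslices rather than over an in-memory array, but the boolean logic would be the same.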

Screenshots

Before: No new articles (or items)
After: image

Open questions and concerns

When using the revisions table, the new_article field is populated as follows:

    # We use the 'all_revisions' scope so that the dashboard system edits that
    # create sandboxes are not excluded, since those often wind up being the
    # first edit of a mainspace article's revision history
    self.new_article = new_article || # If it's already known to be new, that won't change
                       all_revisions.exists?(new_article: true) || # First edit was by a student
                       # First edit was done automatically by the Dashboard during the course
                       article_revisions.exists?(new_article: true, system: true)

Note 1: It looks like we take into account even revisions with deleted set to true. Is that OK?

Note 2: Wouldn't article_revisions.exists?(new_article: true) be enough on its own?
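To make the question in Note 1 concrete, here is a small plain-Ruby sketch. The `Revision` struct and both helper methods are hypothetical and work on in-memory arrays, not on the dashboard's ActiveRecord scopes. It only shows how the result can differ depending on whether deleted revisions are filtered out.

```ruby
# Hypothetical in-memory revision record; the real Revision model is an
# ActiveRecord class with more fields.
Revision = Struct.new(:new_article, :system, :deleted, keyword_init: true)

# Simplified form of the check with no filter on `deleted`.
def new_article_including_deleted?(revisions)
  revisions.any?(&:new_article)
end

# Alternative raised in Note 1: ignore deleted revisions first.
def new_article_excluding_deleted?(revisions)
  revisions.reject(&:deleted).any?(&:new_article)
end

revisions = [
  Revision.new(new_article: true, system: false, deleted: true),
  Revision.new(new_article: false, system: false, deleted: false)
]

puts new_article_including_deleted?(revisions)
puts new_article_excluding_deleted?(revisions)
```

With this sample data the two checks disagree, because the only revision flagged as new_article is also deleted.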