cosmicds / hubbleds

Hubble's law data story
MIT License
Filter All Classes data #546

Open patudom opened 1 month ago

patudom commented 1 month ago

We need to check which classes include students whose galaxies were selected using the "fill data points" button and filter those students out, because their data will bias our class ages toward too-high values. Those students should also be filtered out of the ALL STUDENTS data.

After those students are removed, we should merge the small test classes we've created to avoid small-number issues.
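
As a rough illustration of the two steps above (drop the flagged students, then pool classes that end up too small), here is a minimal Python sketch. The `classes` mapping, the `min_size` threshold, and the `"merged_small_classes"` key are all hypothetical and not taken from the cosmicds code. It also pools every small class, whereas the request above is only about the test classes we created, so treat it as a starting point rather than the actual procedure.

```python
def filter_and_merge(classes, ignored_students, min_size=5):
    """Drop ignored students, then pool classes smaller than min_size.

    classes: dict mapping class_id -> list of student_ids (assumed layout).
    ignored_students: set of student_ids to exclude.
    min_size: hypothetical threshold below which a class counts as "small".
    """
    kept = {}
    merged = []
    for class_id, students in classes.items():
        remaining = [s for s in students if s not in ignored_students]
        if not remaining:
            # Assumption: a class with no remaining students is dropped.
            continue
        if len(remaining) < min_size:
            merged.extend(remaining)
        else:
            kept[class_id] = remaining
    if merged:
        kept["merged_small_classes"] = merged
    return kept
```

For example, with `min_size=5`, a class of six students that loses one flagged student is kept as-is, while two classes of one and two remaining students are pooled into a single three-student group.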

Carifio24 commented 1 month ago

I've filtered these out on the backend side by finding the students that are using the dummy data and adding them to the "ignored students" table.

In case it's useful for anyone (including myself) in the future, I was able to identify the students with this data using the following query:

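USE cosmicds_db;

-- Build one row per student with comma-separated lists of their galaxy IDs,
-- distances, and velocities, each ordered by galaxy_id.
SELECT *
FROM (
    SELECT
        student_id,
        GROUP_CONCAT(DISTINCT galaxy_id ORDER BY galaxy_id) AS galaxies,
        GROUP_CONCAT(est_dist_value ORDER BY galaxy_id) AS distances,
        GROUP_CONCAT(velocity_value ORDER BY galaxy_id) AS velocities
    FROM HubbleMeasurements
    GROUP BY student_id
) AS t
-- Keep only the students whose lists match the dummy data.
-- Single-quoted literals also work when ANSI_QUOTES mode is enabled.
WHERE galaxies = '227,294,355,1400,1639'
    AND distances IN ('158,59,59,98,185', '158,59,60,98,185')
    AND velocities = '6032,5027,7631,10053,9139';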
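
If the measurements have already been exported from the database, the same check can be done outside SQL. The following is a minimal Python sketch, not the backend's code: the function name and the row layout `(student_id, galaxy_id, est_dist_value, velocity_value)` are assumptions, and only the galaxy IDs, distances, and velocities are copied from the query above. It compares integers, whereas the query compares concatenated strings, so it assumes the stored values are whole numbers.

```python
from collections import defaultdict

# Values copied from the query above; everything else here is an assumption.
DUMMY_GALAXIES = (227, 294, 355, 1400, 1639)
DUMMY_DISTANCES = {(158, 59, 59, 98, 185), (158, 59, 60, 98, 185)}
DUMMY_VELOCITIES = (6032, 5027, 7631, 10053, 9139)


def find_dummy_students(rows):
    """Return the sorted IDs of students whose measurements match the dummy data.

    rows: iterable of (student_id, galaxy_id, est_dist_value, velocity_value).
    """
    by_student = defaultdict(list)
    for student_id, galaxy_id, distance, velocity in rows:
        by_student[student_id].append((galaxy_id, distance, velocity))

    matches = []
    for student_id, measurements in by_student.items():
        # Sort by galaxy_id first, mirroring ORDER BY galaxy_id in the query.
        measurements.sort()
        galaxies = tuple(m[0] for m in measurements)
        distances = tuple(m[1] for m in measurements)
        velocities = tuple(m[2] for m in measurements)
        if (galaxies == DUMMY_GALAXIES
                and distances in DUMMY_DISTANCES
                and velocities == DUMMY_VELOCITIES):
            matches.append(student_id)
    return sorted(matches)
```

A student is reported only if all three lists match, so a student who happens to share the five galaxies but measured different distances or velocities is left alone.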