Open patudom opened 1 month ago
I've filtered these out on the backend side by finding the students that are using the dummy data and adding them to the "ignored students" table.
In case it's useful for anyone (including myself) in the future, I was able to identify the students with this data using the following query:
USE cosmicds_db;

SELECT
    *
FROM
    (SELECT
        student_id,
        GROUP_CONCAT(DISTINCT galaxy_id ORDER BY galaxy_id) AS galaxies,
        GROUP_CONCAT(est_dist_value ORDER BY galaxy_id) AS distances,
        GROUP_CONCAT(velocity_value ORDER BY galaxy_id) AS velocities
    FROM
        HubbleMeasurements
    GROUP BY student_id) AS t
WHERE
    galaxies = '227,294,355,1400,1639'
    AND distances IN ('158,59,59,98,185', '158,59,60,98,185')
    AND velocities = '6032,5027,7631,10053,9139';
We need to check which classes include students whose galaxies were selected using the "fill data points" button and filter those students out, because their data will bias our class ages towards too-high values. Those students should also be filtered out of the ALL STUDENTS data.
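For anyone who would rather do this check outside the database, here is a rough Python sketch of the same idea: fingerprint each student's measurements, compare against the dummy values from the query in this thread, then list the classes that contain a flagged student. The function names and the `class_members` mapping (class ID to student IDs) are hypothetical; how class membership is actually stored isn't covered in this thread, so that part is an assumption.

```python
from collections import defaultdict

# Dummy-data fingerprint, copied from the SQL query in this thread.
# Values are listed in order of increasing galaxy_id.
DUMMY_GALAXIES = (227, 294, 355, 1400, 1639)
DUMMY_DISTANCES = {(158, 59, 59, 98, 185), (158, 59, 60, 98, 185)}
DUMMY_VELOCITIES = (6032, 5027, 7631, 10053, 9139)


def find_dummy_students(measurements):
    """Return the set of student IDs whose measurements match the dummy data.

    `measurements` is an iterable of
    (student_id, galaxy_id, est_dist_value, velocity_value) tuples,
    i.e. the columns the SQL query reads from HubbleMeasurements.
    """
    by_student = defaultdict(list)
    for student_id, galaxy_id, dist, vel in measurements:
        by_student[student_id].append((galaxy_id, dist, vel))

    flagged = set()
    for student_id, rows in by_student.items():
        # Mirror the query's ORDER BY galaxy_id.
        rows.sort(key=lambda row: row[0])
        galaxies = tuple(row[0] for row in rows)
        distances = tuple(row[1] for row in rows)
        velocities = tuple(row[2] for row in rows)
        if (
            galaxies == DUMMY_GALAXIES
            and distances in DUMMY_DISTANCES
            and velocities == DUMMY_VELOCITIES
        ):
            flagged.add(student_id)
    return flagged


def classes_with_dummy_students(flagged, class_members):
    """Map each affected class ID to its sorted list of flagged student IDs.

    `class_members` is a hypothetical {class_id: [student_id, ...]} mapping.
    """
    affected = {}
    for class_id, students in class_members.items():
        hits = sorted(set(students) & flagged)
        if hits:
            affected[class_id] = hits
    return affected
```

One difference from the query: this sketch does not de-duplicate galaxy IDs, so a student with repeated rows for the same galaxy would not be flagged here.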
After those students are removed, we should merge the small test classes we've created to avoid small-number issues.