openedx / openedx-aspects

Aspects - Analytics for Open edX
Apache License 2.0
6 stars 6 forks source link

Query updates for performance #215

Open bmtcril opened 2 months ago

bmtcril commented 2 months ago

Run 1

Superset Reports: aspects-0.96.4-5ad897

1. Problem engagement by section/subsection
Superset time: 141.447137 (s).
2. Video engagement by section/subsection
Superset time: 97.140689 (s).
3. Distribution Of Attempts
Superset time: 90.075801 (s).
4. Distribution Of Hints Per Correct Answer
Superset time: 89.046454 (s).
5. Responses Per Problem
Superset time: 88.972371 (s).
6. Distribution Of Responses
Superset time: 50.679826 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 2
Filters: [{'column': 'emission_time'}]

7. Watched Video Segments
Superset time: 47.319447 (s).
8. Pageview engagement by section/subsection
Superset time: 46.930388 (s).
9. Partial and full views per video
Superset time: 45.919465 (s).
10. Distribution Of Problem Grades
Superset time: 39.189939 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 10
Filters: [{'column': 'emission_time'}]

11. Currently Enrolled Learners Per Day
Superset time: 20.703394 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1560
Filters: [{'column': 'enrollment_status'}, {'column': 'enrollment_status_date'}]

12. Watches Per Video
Superset time: 18.602801 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 10000
Filters: [{'column': 'emission_time'}]

13. Course Grade Distribution
Superset time: 10.497829 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 10
Filters: [{'column': 'emission_time'}]

14. Enrollments By Enrollment Mode
Superset time: 2.516631 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 3
Filters: [{'column': 'enrollment_status'}, {'column': 'emission_time'}]

15. Enrollment Events Per Day
Superset time: 1.440251 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1549
Filters: [{'column': 'emission_time'}]

16. Transcripts / Captions Per Video
Superset time: 1.042287 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 10000
Filters: [{'column': 'emission_time'}]

17. Posts per user
Superset time: 0.747924 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 58
Filters: []

18. Distinct forum users
Superset time: 0.44716 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1
Filters: [{'column': 'emission_time'}]

Run 2

Superset Reports: aspects-0.96.4-264afc

1. Problem engagement by section/subsection
Superset time: 140.097964 (s).
2. Video engagement by section/subsection
Superset time: 115.120385 (s).
3. Distribution Of Hints Per Correct Answer
Superset time: 90.338544 (s).
4. Distribution Of Attempts
Superset time: 89.500845 (s).
5. Responses Per Problem
Superset time: 89.127751 (s).
6. Distribution Of Responses
Superset time: 50.785438 (s).
Succeeded: True
Query duration: 50.269 (s).
Result rows: 2
Memory Usage (MB): 4289.866982460022
Row count (superset) 2
Filters: [{'column': 'emission_time'}]

7. Pageview engagement by section/subsection
Superset time: 47.334997 (s).
8. Watched Video Segments
Superset time: 45.728415 (s).
9. Partial and full views per video
Superset time: 45.415808 (s).
10. Distribution Of Problem Grades
Superset time: 38.675151 (s).
Succeeded: True
Query duration: 38.196 (s).
Result rows: 10
Memory Usage (MB): 4131.521450996399
Row count (superset) 10
Filters: [{'column': 'emission_time'}]

11. Currently Enrolled Learners Per Day
Superset time: 19.75312 (s).
Succeeded: True
Query duration: 19.087 (s).
Result rows: 1560
Memory Usage (MB): 3465.305534362793
Row count (superset) 1560
Filters: [{'column': 'enrollment_status'}, {'column': 'enrollment_status_date'}]

12. Watches Per Video
Superset time: 18.122886 (s).
Succeeded: True
Query duration: 17.602 (s).
Result rows: 10000
Memory Usage (MB): 2978.9347105026245
Row count (superset) 10000
Filters: [{'column': 'emission_time'}]

13. Course Grade Distribution
Superset time: 9.349726 (s).
Succeeded: True
Query duration: 9.021 (s).
Result rows: 10
Memory Usage (MB): 1034.8061332702637
Row count (superset) 10
Filters: [{'column': 'emission_time'}]

14. Enrollment Events Per Day
Superset time: 1.588779 (s).
Succeeded: True
Query duration: 1.07 (s).
Result rows: 3094
Memory Usage (MB): 226.0701723098755
Row count (superset) 1549
Filters: [{'column': 'emission_time'}]

15. Enrollments By Enrollment Mode
Superset time: 1.486591 (s).
Succeeded: True
Query duration: 0.747 (s).
Result rows: 3
Memory Usage (MB): 248.0316982269287
Row count (superset) 3
Filters: [{'column': 'enrollment_status'}, {'column': 'emission_time'}]

16. Transcripts / Captions Per Video
Superset time: 1.153137 (s).
Succeeded: True
Query duration: 0.521 (s).
Result rows: 10000
Memory Usage (MB): 141.44000148773193
Row count (superset) 10000
Filters: [{'column': 'emission_time'}]

17. Posts per user
Superset time: 0.805276 (s).
Succeeded: True
Query duration: 0.432 (s).
Result rows: 58
Memory Usage (MB): 395.145396232605
Row count (superset) 58
Filters: []

18. Distinct forum users
Superset time: 0.557155 (s).
Succeeded: True
Query duration: 0.211 (s).
Result rows: 1
Memory Usage (MB): 158.88139629364014
Row count (superset) 1
Filters: [{'column': 'emission_time'}]
bmtcril commented 2 months ago

Full log including queries and errors is attached here. perf_runs.log

bmtcril commented 2 months ago

Additional information:

Using the updated query performance script that takes a course id and filters to that course (the way these dashboards are intended to be used) we've got better numbers (using a "huge" load test course on the same dataset):

1. Problem Interactions
Superset time: 187.613377 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 160
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

2. Problem Results
Superset time: 187.319167 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 160
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

3. Learners with Passing Grade
Superset time: 1.707321 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1
Filters: [{'column': 'course_key'}]

4. Cumulative Interactions
Superset time: 1.349912 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 5
Filters: [{'column': 'visited_on'}, {'column': 'course_key'}]

5. Evolution of engagement
Superset time: 1.24138 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 5
Filters: [{'column': 'visited_on'}, {'column': 'course_key'}]

6. Course Information
Superset time: 1.218131 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1
Filters: [{'column': 'course_key'}]

7. Page views per section/subsection
Superset time: 0.913372 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 50
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

8. Subsection Summary
Superset time: 0.91096 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 27
Filters: [{'column': 'visited_on'}, {'column': 'course_key'}]

9. Section Summary
Superset time: 0.910074 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 8
Filters: [{'column': 'visited_on'}, {'column': 'course_key'}]

10. Problems attempted per section/subsection
Superset time: 0.509738 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 50
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

11. Partial and full views per video
Superset time: 0.481967 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 40
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

12. Video views per section/subsection
Superset time: 0.477667 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 50
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

13. Current Enrollees
Superset time: 0.327181 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1
Filters: [{'column': 'emission_time'}, {'column': 'enrollment_status'}, {'column': 'course_key'}]

14. Cumulative Enrollments by Track
Superset time: 0.322249 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 5
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

15. Enrollees per Enrollment Track
Superset time: 0.189191 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 3
Filters: [{'column': 'emission_time'}, {'column': 'enrollment_status'}, {'column': 'course_key'}]

The only report that got killed for memory was "Watched Video Segments":

2024-04-18 19:38:19,771:ERROR:performance_metrics:Error fetching slice data: Watched Video Segments. Error: Error: :HTTPDriver for https://kwzev57t01.us-east-2.aws.clickhouse.cloud:8443 returned response code 500)
 Code: 241. DB::Exception: Memory limit (total) exceeded: would use 7.29 GiB (attempt to allocate chunk of 4906248 bytes), maximum: 7.20 GiB

Interestingly this consistently loads in the UI.

"Distribution of Current Course Grade" consistently failed, or returned no data in 0 seconds in spite of it actually returning data in the dashboard.

Full log: perf_runs_2.log