vlead / analytics-db

This repository will hold the documents and specifications for installing ELK stack for analytics purposes

Notes from analytics data model review #23

Open mrityunjaypalash opened 6 years ago

I reviewed the current data model, mostly with an eye to answering questions about usability and learnability. It is very basic and cannot answer the questions that will be asked of it. I am writing down my initial notes here; we should discuss them and see where they fit.

Typical questions that will be asked of the data

  1. Who is using which course, when, and how much? Filter by region, type of access (workshop, lab, at home), subject (CS, EE, etc.), and a few others.
  2. Which labs are being used most? Filter by region, access type, subject, and a few others.
  3. Which parts of a lab are being used most (theory, experiment, quiz, etc.)?
  4. What feedback do users (or workshop admins, or workshop participants) give about courses?
  5. Funnel questions: how many users take the quiz before the experiment, how many read the theory followed by the experiment followed by the quiz, etc.
  6. What kind of users are using the course? User attributes: region, college, and other attributes as available.
  7. Correlation between user attributes and consumption patterns: do students who spend more time with the content score well in the quiz? How is feedback related to quiz responses?

(Need to add more and structure it according to Ravi's taxonomy)
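
To make the funnel and "most used part" questions concrete, here is a minimal sketch of how they could be answered from per-session events. The tuple layout `(session_id, timestamp, section)`, the sample records, and the function names are all hypothetical, not the current data model.

```python
from collections import Counter, defaultdict

# Hypothetical sample events: (session_id, timestamp, section visited).
events = [
    ("s1", 1, "theory"),
    ("s1", 2, "experiment"),
    ("s1", 3, "quiz"),
    ("s2", 1, "quiz"),
    ("s2", 2, "experiment"),
    ("s3", 1, "theory"),
    ("s3", 2, "quiz"),
]


def section_order_by_session(events):
    """Return, per session, the sections in order of first visit."""
    ordered = defaultdict(list)
    for session_id, _, section in sorted(events, key=lambda e: (e[0], e[1])):
        if section not in ordered[session_id]:
            ordered[session_id].append(section)
    return dict(ordered)


def count_funnel(events, steps):
    """Count sessions that visited every step, in the given relative order."""
    count = 0
    for sections in section_order_by_session(events).values():
        positions = [sections.index(s) for s in steps if s in sections]
        if len(positions) == len(steps) and positions == sorted(positions):
            count += 1
    return count


# Funnel: theory, then experiment, then quiz.
full_path = count_funnel(events, ["theory", "experiment", "quiz"])
# Funnel: quiz taken before the experiment.
quiz_first = count_funnel(events, ["quiz", "experiment"])
# Which part of a lab is used most (question 3).
section_usage = Counter(section for _, _, section in events)
```

The point of the sketch is that funnel questions need ordered events grouped by session, which is why session id and timestamp have to be captured with every event.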

Stakeholders who will ask these questions

  1. MHRD
  2. Lab owner/creator
  3. Instructor in lab
  4. College
  5. Student
  6. Workshop administrator
  7. Public (or other external stakeholder)

Key objects in the system that participate in data generation

  1. User - user id (logged in user)
  2. Resource - URI (course, a page in the course, an experiment, etc.)
  3. Session - session id (contiguous use of resource by a single user - like a browser session)
  4. Event - event id
  5. Course
  6. Experiment
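
As a starting point for discussion, the objects above could be sketched as plain records. This is only one possible reading: every field other than the ids and the URI named in the list is an assumption, and treating Course and Experiment as kinds of Resource is a simplification for illustration, not a decision.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class User:
    user_id: str                    # id of the logged-in user
    region: Optional[str] = None    # assumed attribute, if available
    college: Optional[str] = None   # assumed attribute, if available


@dataclass(frozen=True)
class Resource:
    uri: str
    kind: str  # assumed values: "course", "page", "experiment"


@dataclass(frozen=True)
class Session:
    session_id: str
    user_id: str  # a session is contiguous use by a single user


@dataclass(frozen=True)
class Event:
    event_id: str
    session_id: str
    resource_uri: str
    action: str          # hypothetical, e.g. "submitted a quiz"
    timestamp: datetime


# Hypothetical sample records showing how the objects link together.
user = User(user_id="u1", region="south", college="example-college")
experiment = Resource(uri="/courses/c1/experiments/e1", kind="experiment")
session = Session(session_id="s1", user_id=user.user_id)
event = Event(
    event_id="ev1",
    session_id=session.session_id,
    resource_uri=experiment.uri,
    action="submitted a quiz",
    timestamp=datetime(2018, 1, 1, tzinfo=timezone.utc),
)
```

Linking each event to a session, and each session to a user, is what would allow filtering by user attributes such as region or college.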

Data required to answer these questions

  1. Content consumption events ('accessed a lab', 'submitted a quiz', 'read the instructions', 'clicked on a link', etc.)
  2. User feedback submission events
  3. Outreach/Workshop feedback submission events
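
The three kinds of data could share one event document shape, distinguished by a category field. The sketch below builds such a document as a plain dictionary and serializes it to JSON; the field names, category values, and sample payload are hypothetical and do not reflect an existing schema in this repository.

```python
import json
from enum import Enum


class EventCategory(str, Enum):
    CONTENT_CONSUMPTION = "content_consumption"
    USER_FEEDBACK = "user_feedback"
    WORKSHOP_FEEDBACK = "workshop_feedback"


def make_event_document(category, action, user_id, resource_uri, timestamp, payload=None):
    """Build one event document; raises ValueError for an unknown category."""
    return {
        "category": EventCategory(category).value,
        "action": action,
        "user_id": user_id,
        "resource_uri": resource_uri,
        "timestamp": timestamp,   # ISO 8601 string, assumed
        "payload": payload or {},  # event-specific details
    }


# Hypothetical content consumption event.
quiz_doc = make_event_document(
    category="content_consumption",
    action="submitted a quiz",
    user_id="u1",
    resource_uri="/courses/c1/experiments/e1/quiz",
    timestamp="2018-01-01T10:00:00Z",
    payload={"score": 7},
)

# Hypothetical workshop feedback event.
feedback_doc = make_event_document(
    category=EventCategory.WORKSHOP_FEEDBACK,
    action="submitted workshop feedback",
    user_id="u2",
    resource_uri="/workshops/w1",
    timestamp="2018-01-02T15:30:00Z",
    payload={"rating": 4},
)

serialized = json.dumps(quiz_doc, sort_keys=True)
```

Keeping feedback and consumption events in the same shape would make it easier to relate them later, for example when comparing feedback with quiz responses.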