alan-turing-institute / rds-course

Materials for Turing's Research Data Science course
https://alan-turing-institute.github.io/rds-course/

Answering the RQ with the European QoL data #41

Closed callummole closed 2 years ago

callummole commented 3 years ago

We are tight on time, and much of the content creation for each module does not rely on the EQOL data.

However, one of the most important aspects of the course is the development of the research question (#15). Based on Aldabe et al., 2011, the RQ starts broad, is scoped in M1 (#25), with candidate variables selected in one of M1-M3, is explored with respect to the dataset in M3 (#24), and is then assessed via simple models in M4 (#30). This thread will make up some of the taught material and, importantly, the majority of the hands-on sessions.

Work is needed to get to know the dataset and how it can be used to answer the RQ. The code developed and insights gained will be very useful both for the taught material and for scoping appropriate hands-on tasks. This work can be done roughly in parallel with creating the taught content. The task consists of (but is not limited to):

It would be great to use pandas, numpy, matplotlib/seaborn, and scikit-learn. These are common packages that we will be using throughout other modules.
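As a rough illustration of how these packages might fit together for the explore-then-model steps described above, here is a minimal sketch. It runs on synthetic data only: the column names (`age`, `deprivation`, `good_health`), the helper functions, and the relationship between the variables are all hypothetical and are not taken from the EQOL dataset. Logistic regression is used purely as one example of a "simple model".

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def make_synthetic_survey(n_rows=500, seed=0):
    """Build a small synthetic table. Column names are hypothetical, not EQOL variables."""
    rng = np.random.default_rng(seed)
    age = rng.integers(18, 90, size=n_rows)
    deprivation = rng.normal(0.0, 1.0, size=n_rows)
    # Made-up relationship: the chance of reporting good health
    # falls with age and with deprivation.
    logit = 2.0 - 0.03 * age - 0.8 * deprivation
    p_good = 1.0 / (1.0 + np.exp(-logit))
    good_health = rng.binomial(1, p_good)
    return pd.DataFrame(
        {"age": age, "deprivation": deprivation, "good_health": good_health}
    )


def explore(df):
    """Return summary statistics and the share of missing values per column."""
    return df.describe(), df.isna().mean()


def fit_simple_model(df, features, target, seed=0):
    """Fit a logistic regression and return it with its held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[features], df[target], test_size=0.25, random_state=seed
    )
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


df = make_synthetic_survey()
summary, missing = explore(df)
model, accuracy = fit_simple_model(df, ["age", "deprivation"], "good_health")
print(summary)
print(f"held-out accuracy: {accuracy:.2f}")
```

With the real dataset, the synthetic table would be replaced by whatever loading step the EQOL files require, and the feature and target columns by the candidate variables selected for the RQ.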

In terms of workflow, please branch off develop.

It is also expected that the nature of the work will evolve as the taught material is developed and more knowledge is gained about the dataset.

callummole commented 3 years ago

@ChristinaLast is assigned to this project for the next few weeks to help with this.