Teaching Team feedback - GitHub Issues

Team D,

Thank you for the very detailed proposal.

And, of course, your project can be about visualizing a dataset of interest to you (similar to
your first project). The only rule about this dataset is that you need to be the one who creates
the dataset (through crawling, for example). Furthermore, as we have already done this in
project 1, if you choose this path and want to get a decent score, you should attempt to
visualize a very challenging dataset, such as creating visualizations for 3+ dimensions or
visualizing social networks, text, etc. Plotting non-interactive charts or visualizing a dataset with
bar plots/pie charts will not give you a high score for project 2.

This is a requirement of project 2. While you can visualize a dataset, you should go one step further, i.e. crawl or collect it yourself through any method.

Since this is a TidyTuesday dataset, I think other people have attempted to visualize it before, so it will be more difficult to see the novelty or new things that your project brings to us.

One suggestion is that you can crawl additional data to complement this dataset. For instance, information on educational attainment could be a possible supplementary data source.

Or you can pursue another direction: build a model to estimate pay for each sector, by gender and by educational attainment, and thus highlight the pay gap?

Please review the detailed instructions for more information about what you can do, and feel free to ask me if you are unsure about your proposal. You can also check groups A, B or J. They are doing either modelling, a lesson plan or crawling data. Their work might give you some inspiration on how you can improve this version of the proposal.

We will be following the 2 suggestions:

Suggestion 1 (crawl additional data): We are considering getting data on salary range by gender and by sector from another source. If the dataset behind that source covers a larger sample of the UK population than our current dataset, then we think it can be integrated with our data.
The additional data will be used to address Suggestion 2 (build a model), i.e. pursuing an additional direction. We will build a model to predict the pay gap each sector is expected to have 1 year ahead of the latest available data (we are only predicting 1 year ahead due to the lack of available data). If we can get the data described in Suggestion 1, then we will also answer the hypothetical question: "If I were a male/female working in sector X, I would have an estimated pay of Y in the next year".
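A minimal sketch of what a one-year-ahead forecast per sector could look like, assuming the data can be reduced to (sector, year, pay gap in %) rows. It fits a simple least-squares trend line per sector, which is only one possible modelling choice and not necessarily the one the project will use. The function name `forecast_next_year`, the sector names and all numbers below are hypothetical and purely illustrative.

```python
from collections import defaultdict


def forecast_next_year(records):
    """Fit a straight-line trend per sector and extrapolate one year ahead.

    records: iterable of (sector, year, pay_gap_percent) tuples.
    Returns {sector: (next_year, predicted_gap)}.
    """
    by_sector = defaultdict(list)
    for sector, year, gap in records:
        by_sector[sector].append((year, gap))

    forecasts = {}
    for sector, points in by_sector.items():
        years = [y for y, _ in points]
        gaps = [g for _, g in points]
        n = len(points)
        mean_year = sum(years) / n
        mean_gap = sum(gaps) / n

        # Ordinary least-squares slope; fall back to a flat line when
        # a sector has only one distinct year (no trend can be estimated).
        sxx = sum((y - mean_year) ** 2 for y in years)
        if sxx == 0:
            slope = 0.0
        else:
            sxy = sum((y - mean_year) * (g - mean_gap)
                      for y, g in zip(years, gaps))
            slope = sxy / sxx
        intercept = mean_gap - slope * mean_year

        next_year = max(years) + 1
        forecasts[sector] = (next_year, intercept + slope * next_year)
    return forecasts


# Made-up numbers, for illustration only (not real pay gap data).
sample = [
    ("Finance", 2019, 30.0), ("Finance", 2020, 29.0), ("Finance", 2021, 28.0),
    ("Education", 2019, 20.0), ("Education", 2020, 20.0), ("Education", 2021, 20.0),
]
forecasts = forecast_next_year(sample)
for sector, (year, gap) in sorted(forecasts.items()):
    print(f"{sector}: predicted pay gap in {year} = {gap:.1f}%")
```

With only a few years of data per sector, a straight-line trend is about as much as the data can support, which matches the decision to predict only 1 year ahead. The same structure would apply to estimated pay by gender and sector if the additional data from Suggestion 1 becomes available.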

khaukhau / COMP4010-Project-2
