alan-turing-institute / TuringDataStories

TuringDataStories: An open community creating “Data Stories”: A mix of open data, code, narrative 💬, visuals 📊📈 and knowledge 🧠 to help understand the world around us.

[WIP] EThOS PhD thesis metadata analysis #178

Open mhauru opened 2 years ago

mhauru commented 2 years ago

Summary

Copy the notebook from https://github.com/mhauru/EThOS-analysis/blob/master/analysis.ipynb and make the minimal edits needed for it to run. This is the starting point for the PhD thesis metadata (EThOS) story.

List of changes proposed in this PR (pull-request)

What should a reviewer concentrate their feedback on?

Acknowledging contributors

review-notebook-app[bot] commented 2 years ago

View / edit / reply to this conversation on ReviewNB

mhauru commented on 2022-04-14T10:47:21Z ----------------------------------------------------------------

Camila: Rerun on latest data (https://bl.iro.bl.uk/concern/datasets/bb0b3ec4-4667-436a-8e6a-d2e8e5383726?locale=en)


review-notebook-app[bot] commented 2 years ago

mhauru commented on 2022-04-14T10:47:22Z ----------------------------------------------------------------

Camila: Check how many authors have a full first name rather than just initials. If most do, we could infer gender from the first names and analyse that.
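
A rough sketch of how that check could look, using only the standard library. It assumes author names are stored as "Surname, Forenames", which has not been verified against the EThOS data; the function names and the sample names are made up for illustration.

```python
def has_full_first_name(author):
    """Return True if the first forename is more than initials.

    Assumption: names look like "Surname, Forenames". Names without a
    comma are treated as having no usable first name.
    """
    if "," not in author:
        return False
    forenames = author.split(",", 1)[1].split()
    if not forenames:
        return False
    # "J.R." -> ["J", "R"], "Jane" -> ["Jane"]; any part longer than one
    # character counts as a full name.
    parts = forenames[0].replace(".", " ").split()
    return any(len(part) > 1 for part in parts)


def share_with_full_first_name(authors):
    """Fraction of names with a full first forename (0.0 for empty input)."""
    authors = list(authors)
    if not authors:
        return 0.0
    return sum(has_full_first_name(a) for a in authors) / len(authors)


# Hypothetical sample, not taken from the dataset.
sample = ["Smith, Jane A.", "Jones, J.R.", "Patel, Anil", "Brown, K"]
share = share_with_full_first_name(sample)
print(f"{share:.0%} of sample names have a full first name")
```

If the share is high enough, the full first names could then be passed to whatever name-to-gender lookup we settle on; that step is not sketched here.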


review-notebook-app[bot] commented 2 years ago

mhauru commented on 2022-04-14T10:47:23Z ----------------------------------------------------------------

Camila: Check if qualification is always PhD.
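
One possible way to check this is to tally the distinct values in the qualification field. The field name `Qualification` and the sample values below are assumptions for illustration; the rows could come from `csv.DictReader` or an equivalent.

```python
from collections import Counter


def qualification_counts(records, field="Qualification"):
    """Count qualification values, case-insensitively.

    `field` is an assumed column name. Empty or absent values are
    grouped under "(missing)".
    """
    counts = Counter()
    for record in records:
        value = (record.get(field) or "").strip()
        counts[value.lower() if value else "(missing)"] += 1
    return counts


# Hypothetical rows, not taken from the dataset.
records = [
    {"Qualification": "Thesis (Ph.D.)"},
    {"Qualification": "Thesis (Ph.D.)"},
    {"Qualification": "Thesis (D.Phil.)"},
    {"Qualification": ""},
]
counts = qualification_counts(records)
for value, n in counts.most_common():
    print(f"{n:>6}  {value}")
```

More than one key in the result means the qualification is not always the same, and the tally shows how much each variant matters.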


review-notebook-app[bot] commented 2 years ago

mhauru commented on 2022-04-14T10:47:23Z ----------------------------------------------------------------

Camila/Markus: This plot needs improving. Some points, ideas:

  • colours are ugly
  • could bin by decade
  • could take a rolling average over a decade
  • could colour Russell group unis, or Turing network unis
  • the plot has some strange features that look like something is happening in the data, but they are more an artefact of the plotting
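
The binning and rolling-average ideas above could be sketched like this. It uses only the standard library and made-up years; if the notebook uses pandas, a groupby and a rolling window would do the same job. The function names are hypothetical.

```python
from collections import Counter


def count_by_decade(years):
    """Map each decade start (e.g. 1990) to the number of theses in it."""
    decades = Counter((year // 10) * 10 for year in years)
    return dict(sorted(decades.items()))


def rolling_mean(counts_by_year, window=10):
    """Trailing rolling mean of yearly counts.

    Years with no theses count as 0. The first few years use a shorter
    window because there is no earlier data to average over.
    """
    if not counts_by_year:
        return {}
    first, last = min(counts_by_year), max(counts_by_year)
    smoothed = {}
    for year in range(first, last + 1):
        start = max(first, year - window + 1)
        values = [counts_by_year.get(y, 0) for y in range(start, year + 1)]
        smoothed[year] = sum(values) / len(values)
    return smoothed


# Hypothetical award years, not taken from the dataset.
years = [1989, 1990, 1991, 1991, 1999, 2000, 2001, 2001]
by_decade = count_by_decade(years)
smoothed = rolling_mean(Counter(years), window=3)
print(by_decade)
```

Binning gives fewer, steadier points per institution, while the rolling mean keeps yearly resolution but damps single-year spikes. Either might remove the plotting artefacts, though that would need checking against the real plot.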

review-notebook-app[bot] commented 2 years ago

mhauru commented on 2022-04-14T10:47:24Z ----------------------------------------------------------------

Maybe put these in one figure? Maybe use a rolling average to smooth?


review-notebook-app[bot] commented 2 years ago

mhauru commented on 2022-04-14T10:47:25Z ----------------------------------------------------------------

Keep buzzwords in this story, move all network analysis stuff to its own story.


review-notebook-app[bot] commented 2 years ago

mhauru commented on 2022-04-14T10:47:26Z ----------------------------------------------------------------

Try turning this into a figure.


review-notebook-app[bot] commented 2 years ago

mhauru commented on 2022-04-14T10:47:26Z ----------------------------------------------------------------

Present this more nicely: less of a long list, more plots.


review-notebook-app[bot] commented 2 years ago

mhauru commented on 2022-04-14T10:47:27Z ----------------------------------------------------------------

This plot is hard to interpret.


crangelsmith commented 2 years ago

@mhauru and @crangelsmith met on the 14th of April and decided to split this notebook into two stories.

First part:

  • Descriptive stats and figures.
  • Add an analysis of gender, if possible.
  • A first pass at title analysis using word counts.
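
A minimal sketch of what the word-count pass on titles could look like. The stop-word list is a tiny placeholder and the titles are invented; neither comes from the notebook.

```python
import re
from collections import Counter

# Assumption: a short illustrative stop-word list. A real analysis would
# want a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with"}


def title_word_counts(titles):
    """Count lower-cased words across all titles, skipping stop words."""
    counts = Counter()
    for title in titles:
        # Keep hyphenated and apostrophised words together.
        words = re.findall(r"[a-z]+(?:[-'][a-z]+)*", title.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts


# Hypothetical titles, not taken from the dataset.
titles = [
    "Machine learning for climate models",
    "A study of machine learning in medicine",
    "The climate of medieval Europe",
]
counts = title_word_counts(titles)
print(counts.most_common(3))
```

Counting per year or per decade, instead of over the whole dataset, would show when a given buzzword took off.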

Second part:

  • Title analysis using networks (second part of this notebook).
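
This thread does not spell out how the networks are built. One common construction, shown here purely as an assumption, is a word co-occurrence graph: two words are linked if they appear in the same title, weighted by how many titles they share. The function name and titles are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(titles, stop_words=frozenset()):
    """Weighted edges: (word_a, word_b) -> number of titles containing both.

    Each pair is stored once, in alphabetical order, so ("deep", "learning")
    and ("learning", "deep") are never counted separately.
    """
    edges = Counter()
    for title in titles:
        words = sorted(
            {w for w in title.lower().split() if w.isalpha() and w not in stop_words}
        )
        edges.update(combinations(words, 2))
    return edges


# Hypothetical titles, not taken from the dataset.
titles = ["Deep learning models", "Deep learning theory", "Graph theory"]
edges = cooccurrence_edges(titles)
for (a, b), weight in edges.most_common(3):
    print(f"{a} -- {b}: {weight}")
```

The resulting edge weights could be loaded into a graph library for clustering or centrality measures, if that is the direction the second story takes.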

The comments above help guide the curation of the stories. We will open a new PR with the first story and keep this one open until the second story is ready for review.

mhauru commented 2 years ago

Todo list:

  • [x] Update data to latest versions
  • [ ] Do gender analysis
  • [ ] Remove everything that will be in part 2
  • [ ] Make plots prettier
  • [ ] Go through review comments in ReviewNB
  • [ ] Clean up imports (once part 2 is gone)