programminghistorian / ph-submissions

The repository and website hosting the peer review process for new Programming Historian lessons
http://programminghistorian.github.io/ph-submissions

Visualizing data with R and ggplot2 #606

Open hawc2 opened 2 months ago

hawc2 commented 2 months ago

Programming Historian in English has received a proposal for a lesson, 'Visualizing data with R and ggplot2,' by @rogorido and @nabsiddiqui.

I have circulated this proposal for feedback within the English team. We have considered this proposal for:

We are pleased to have invited @rogorido and @nabsiddiqui to develop this Proposal into a Submission under the guidance of @semanticnoodles as editor.

The Submission package should include:

We ask @rogorido and @nabsiddiqui to share their Submission package with our Publishing team by email, copying in @semanticnoodles.

We've agreed a submission date of April. We ask @rogorido and @nabsiddiqui to contact us if they need to revise this deadline.

When the Submission package is received, our Publishing team will process the new lesson materials, and prepare a Preview of the initial draft. They will post a comment in this Issue to provide the locations of all key files, as well as a link to the Preview where contributors can read the lesson as the draft progresses.

If we have not received the Submission package by April, @semanticnoodles will attempt to contact @rogorido and @nabsiddiqui. If we do not receive any update, this Issue will be closed.

Our dedicated Ombudspersons are Ian Milligan (English), Silvia Gutiérrez De la Torre (español), Hélène Huet (français), and Luis Ferla (português). Please feel free to contact them at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudspersons will have no impact on the outcome of any peer review.

semanticnoodles commented 2 months ago

I confirm that @rogorido and @nabsiddiqui gave me access to their repository containing all the required files, and that I have handed these over to @anisa-hawes so that the publishing team can generate the preview. Thanks.

anisa-hawes commented 2 months ago

Hello Giulia @semanticnoodles, Igor @rogorido and Nabeel @nabsiddiqui,

Many thanks for sharing the lesson submission materials with me. I've now checked the Markdown file, and added some key elements of metadata. I've also checked the accompanying images and assets, ensuring each element meets our requirements.

You can find the key files here:

You can review a Preview of the lesson here:

--

A few initial notes:

anisa-hawes commented 2 months ago

Hello again Igor @rogorido and Nabeel @nabsiddiqui.

What's happening now?

Your lesson has been moved to the next phase of our workflow which is Phase 2: Initial Edit.

In this Phase, your editor Giulia @semanticnoodles will read your lesson, and provide some initial feedback. Giulia will post feedback and suggestions as a comment in this Issue, so that you can revise your draft in the following Phase 3: Revision 1.

%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
              'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
              'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
              'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
       } } }%%
timeline
Section Phase 1 <br> Submission
Who worked on this? : Publishing Manager (@anisa-hawes) 
All Phase 1 tasks completed? : Yes
Section Phase 2 <br> Initial Edit
Who's working on this? : Editor (@semanticnoodles)  
Expected completion date? : April 20
Section Phase 3 <br> Revision 1
Who's responsible? : Authors (@rogorido + @nabsiddiqui) 
Expected timeframe? : ~30 days after feedback is received

Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.

rogorido commented 2 months ago

@anisa-hawes Thanks for your comments. As for the tsv file: no, it is not required. It can be deleted.

I'll add the alternative captions. Thanks.

rogorido commented 1 month ago

I added captions and alt texts (10a6a9e1b0c9fa794637837338bd7a61b7f6c5d7), but Nabeel should take a look whether it looks 'Englishly' enough...

semanticnoodles commented 1 month ago

Hello @rogorido and @nabsiddiqui,

here follows my preliminary feedback; I am aware it is quite extensive, but I believe these indications could help you strengthen your tutorial. If you need any clarification, please do not hesitate to ask!

Overall feedback

In general, your tutorial provides valuable guidance on navigating and producing a wide range of visualisations, effectively walking through the various features of ggplot2. The piece meets the accessibility and inclusivity goals of the Programming Historian fairly well, and in most cases the language is easy to understand and straightforward. However, some elements need further work, mostly falling under two intertwined aspects discussed in the following paragraphs.

Usability: Enhancing the logical structure of the lesson

In my opinion, this is the most critical point to consider. The tutorial lacks a cohesive element to tie its components together and the organisation of the content could benefit from a more linear and less convoluted approach. The case study you propose (sister cities) seems to be just a tool to obtain a series of visualisations. This is fair enough, but it could benefit from further methodological contextualisation and unpacking: the people following your tutorial may not be historians, nor have a clear understanding of the methods you are using, although they may be familiar with R.

In terms of improving the overall content, I think there are two possible directions for you to consider: either revising the content to follow a visualisation task-based narrative or placing more emphasis on the structure of the case study. The first option would privilege the visualisation tasks (but still require some methodological support for the case study), while the second would require you to generate stronger and sharper research questions from the case study, to be answered (at least in part) by the visualisation tasks. I think @nabsiddiqui did a very good job of structuring the content in the lesson Data Wrangling and Management in R, so I would recommend keeping that in mind as a reference.

The title of the proposal could benefit from being more specific - or at least mentioning the context of application. The table of contents looks unbalanced: the headings and their actual wording could be better aligned with the content they cover, and the nesting could be more linear.

You give very clear information about the concept of the grammar of graphics - this is really the cornerstone of understanding how ggplot2 is designed. I really appreciate you explaining this and including many useful resources, although I think they could be arranged more organically, instead of including relatively short hints throughout the tutorial, as they tend to overshadow the walkthrough steps on several occasions.

Sustainability: Critically reviewing the data analysis narrative

The dataset looks more than adequate for the visualisation tasks you have set as objectives, but the data narrative and its wording could benefit from further tuning. What you offer in this lesson is mostly visualisation of data distributions and there is little statistical testing involved. As your topic is sister cities, it makes perfect sense to talk about relationships, although what you observe are mostly trends or tendencies that you could try to explain through further research; sometimes you clearly point that out and sometimes it looks rather implicit. I think this is just a matter of fine-tuning the language, nothing more.

Section-specific feedback

Para stands for paragraph number; please refer to the preview generated by @anisa-hawes

Introduction, Lesson Goals and Data

ggplot2: General Overview

Sister cities in Europe

Loading Data with readr

Creating a bar graph

Other Geoms: Histograms, Distribution Plots and Boxplots

Manipulating the Look of Graphs

Scales: Colors, Legends, and Axes

Faceting a Graph

Themes: Changing Static Elements

Extending ggplot2 with Other Packages

Additional Resources

Format & style

Two quick comments on the form and style.

Thank you for the great work done so far!

rogorido commented 1 month ago

@semanticnoodles thanks for your extensive comments. I will have a look at the enhancements you're proposing in the next few days.

anisa-hawes commented 1 month ago

What's happening now?

Hello Igor @rogorido and Nabeel @nabsiddiqui. Your lesson has been moved to the next phase of our workflow which is Phase 3: Revision 1.

This Phase is an opportunity for you to revise your draft in response to @semanticnoodles's initial feedback. You can make direct commits to your file here: /en/drafts/originals/visualizing-data-with-r-and-ggplot2.md. @charlottejmc or I are here to help if you encounter any practical problems!

When both of you + Giulia are happy with the revised draft, we will move forward to Phase 4: Open Peer Review.

%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
              'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
              'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
              'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
       } } }%%
timeline
Section Phase 2 <br> Initial Edit
Who worked on this? : Editor (@semanticnoodles) 
All Phase 2 tasks completed? : Yes
Section Phase 3 <br> Revision 1
Who's working on this? : Authors (@rogorido + @nabsiddiqui)  
Expected completion date? : May 17
Section Phase 4 <br> Open Peer Review
Who's responsible? : Reviewers (TBC) 
Expected timeframe? : ~60 days after request is accepted

Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.

semanticnoodles commented 1 week ago

Hello Igor @rogorido and Nabeel @nabsiddiqui, I hope you are doing well!

Just checking in with you about the draft revision (Phase 3 / Revision 1) as the deadline of the 17th of May has passed. If you need some extra time let me know approximately how much, so we can set up a new deadline -- and @anisa-hawes or @charlottejmc can update the Mermaid timeframe.

If you have doubts or need any clarification, please do not hesitate to get in touch.

nabsiddiqui commented 1 week ago

Hello @semanticnoodles,

I have tried to rework a lot of the tutorial. I feel that changing some of the headings will make the flow more obvious. Let me know if it makes sense the way I have done it or if there should be additional changes. Here is some of what I reviewed based on your timeline. The rest I will leave to @rogorido unless he has an objection:

Introduction, Lesson Goals and Data

ggplot2: General Overview

Sister cities in Europe

Loading Data with readr

Creating a bar graph

Other Geoms: Histograms, Distribution Plots and Boxplots

Manipulating the Look of Graphs

Scales: Colors, Legends, and Axes

Faceting a Graph

Themes: Changing Static Elements

Extending ggplot2 with Other Packages

Additional Resources

Format & style

Two quick comments on the form and style.

Other

anisa-hawes commented 3 days ago

Thank you, @nabsiddiqui!

@semanticnoodles will review these revisions and advise if we are ready to move onwards to the next Phase of the workflow (which will be Phase 4 Open Peer Review). Giulia is away this week, returning on June 3rd.

In the meantime, @charlottejmc and I can help with ensuring that functions and arguments are typographically consistent. These are aspects we always check as part of typesetting at Phase 6, but we'll do a quick scan now so that this isn't a distraction for Reviewers.

charlottejmc commented 3 days ago

Hello @nabsiddiqui and @semanticnoodles,

I've made some adjustments to add backticks to functions, arguments and other parts of code, trying to stay consistent with our house style.