Open hawc2 opened 8 months ago
I confirm @rogorido and @nabsiddiqui shared with me access to their repository containing all the required files, and that I handed them over to @anisa-hawes to allow the publishing team to generate the preview, thanks.
Hello Giulia @semanticnoodles, Igor @rogorido and Nabeel @nabsiddiqui,
Many thanks for sharing the lesson submission materials with me. I've now checked the Markdown file, and added some key elements of metadata. I've also checked the accompanying images and assets, ensuring each element meets our requirements.
You can find the key files here:
You can review a Preview of the lesson here:
--
A few initial notes:
`## Header 2` is the largest.
`alt_text` + captions for each of your images. We have committed to providing alt-text for all figure images, plots and graphs included in our lessons, so you'll need to add this as part of your revisions. These notes on Descriptive Alt text may be useful to you.
`.tsv` and a `.csv` version of the dataset, although only the `.csv` appears to be used in the lesson. Is the `.tsv` alternative required too?
Hello again Igor @rogorido and Nabeel @nabsiddiqui.
Your lesson has been moved to the next phase of our workflow which is Phase 2: Initial Edit.
In this Phase, your editor Giulia @semanticnoodles will read your lesson, and provide some initial feedback. Giulia will post feedback and suggestions as a comment in this Issue, so that you can revise your draft in the following Phase 3: Revision 1.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 1 <br> Submission
Who worked on this? : Publishing Manager (@anisa-hawes)
All Phase 1 tasks completed? : Yes
Section Phase 2 <br> Initial Edit
Who's working on this? : Editor (@semanticnoodles)
Expected completion date? : April 20
Section Phase 3 <br> Revision 1
Who's responsible? : Authors (@rogorido + @nabsiddiqui)
Expected timeframe? : ~30 days after feedback is received
Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.
@anisa-hawes Thanks for your comments. As for the tsv file: no, it is not required. It can be deleted.
I'll add the alternative captions. Thanks.
I added captions and alt texts (10a6a9e1b0c9fa794637837338bd7a61b7f6c5d7), but Nabeel should take a look whether it looks 'Englishly' enough...
Hello @rogorido and @nabsiddiqui,
here follows my preliminary feedback; I am aware it is quite extensive, but I believe these indications could help you strengthen your tutorial. If you need any clarification, please do not hesitate to ask!
In general, your tutorial provides valuable guidance on navigating and producing a wide range of visualisations, effectively walking through the various features of `ggplot2`. The piece meets the accessibility and inclusivity goals of the Programming Historian fairly well, and in most cases the language is easy to understand and straightforward. However, some elements need further work, mostly falling under two intertwined aspects discussed in the following paragraphs.
In my opinion, this is the most critical point to consider. The tutorial lacks a cohesive element to tie its components together and the organisation of the content could benefit from a more linear and less convoluted approach. The case study you propose (sister cities) seems to be just a tool to obtain a series of visualisations. This is fair enough, but it could benefit from further methodological contextualisation and unpacking: the people following your tutorial may not be historians, nor have a clear understanding of the methods you are using -- although they may be familiar with R.
In terms of improving the overall content, I think there are two possible directions for you to consider: either revising the content to follow a visualisation task-based narrative or placing more emphasis on the structure of the case study. The first option would privilege the visualisation tasks (but still require some methodological support for the case study), while the second would require you to generate stronger and sharper research questions from the case study, to be answered (at least in part) by the visualisation tasks. I think @nabsiddiqui did a very good job of structuring the content in the lesson Data Wrangling and Management in R, so I would recommend keeping that in mind as a reference.
The title of the proposal could benefit from being more specific - or at least mentioning the context of application. The table of contents looks unbalanced: the headings and their actual wording could be better aligned with the content they cover, and the nesting could be more linear.
You give very clear information about the concept of the grammar of graphics - this is really the cornerstone of understanding how `ggplot2` is designed. I really appreciate you explaining this and including many useful resources, although I think they could be arranged more organically, instead of including relatively short hints throughout the tutorial, as they tend to overshadow the walkthrough steps on several occasions.
The dataset looks more than adequate for the visualisation tasks you have set as objectives, but the data narrative and its wording could benefit from further tuning. What you offer in this lesson is mostly visualisation of data distributions and there is little statistical testing involved. As your topic is sister cities, it makes perfect sense to talk about relationships, although what you observe are mostly trends or tendencies that you could try to explain through further research; sometimes you clearly point that out and sometimes it looks rather implicit. I think this is just a matter of fine-tuning the language, nothing more.
Para stands for paragraph number; please refer to the preview generated by @anisa-hawes
`readr`
`head(eudata)` could support your explanation about the observations occurring in the dataset – this is also considered good practice in data science.
`typecountry` column included in your dataset. I tested the walkthrough using the data contained in the `eu` column, just remember to send us the correct version of the dataset.
[ ] Para 31, penultimate line: comma missing space afterwards.
[ ] Para 33, please review this for clarity (here you should mention why you used log10 once for all or put it into another spot. Consider explaining why none of the methods is ideal)
This leads to an uninformative histogram. We can take `log10(dist)` as our variable or filter to exclude values above 5000kms. None of these methods is ideal, but as far as we know, we are operating with manipulated data making it less problematic
[ ] Para 36, please review it for clarity (it reads implicitly why you employed ECDF).
[ ] Para 41, same issue: you refer to ANOVA without explaining why you foresee that as a viable statistic test, cutting the paragraph short.
[ ] Review the heading accordingly with the edits.
`ggplot2` does not use discrete colour scales at all.
Two quick comments on the form and style.
`ggplot2` that always comes lowercased, but you know it 😄)
`code format` or not, you choose. Consistency is the only requirement.
Thank you for the great work done so far!
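As a rough illustration of the Para 33 point above, the two options quoted there (plotting `log10(dist)` or filtering out distances above 5000 km) could be compared along these lines. This is only a sketch: the `toy` data frame and its values are invented stand-ins for the lesson's dataset, and the bin count is arbitrary.

```r
library(ggplot2)

# Invented, right-skewed distances in km (NOT the lesson's data)
set.seed(1)
toy <- data.frame(dist = c(rlnorm(500, meanlog = 6, sdlog = 1), 9000, 12000, 15000))

# Option 1: keep every row, but plot log10(dist) instead of dist
p_log <- ggplot(toy, aes(x = log10(dist))) +
  geom_histogram(bins = 30)

# Option 2: keep the original scale, but drop distances above 5000 km
toy_short <- subset(toy, dist <= 5000)
p_filtered <- ggplot(toy_short, aes(x = dist)) +
  geom_histogram(bins = 30)

# Printing p_log or p_filtered draws the corresponding histogram
```

The first option changes the unit readers see on the axis, while the second discards observations, which is one way to explain why neither is ideal.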
@semanticnoodles thanks for your extensive comments. I will have a look at the enhancements you're proposing in the next few days.
Hello Igor @rogorido and Nabeel @nabsiddiqui. Your lesson has been moved to the next phase of our workflow which is Phase 3: Revision 1.
This Phase is an opportunity for you to revise your draft in response to @semanticnoodles's initial feedback. You can make direct commits to your file here: /en/drafts/originals/visualizing-data-with-r-and-ggplot2.md. @charlottejmc or I are here to help if you encounter any practical problems!
When both of you + Giulia are happy with the revised draft, we will move forward to Phase 4: Open Peer Review.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 2 <br> Initial Edit
Who worked on this? : Editor (@semanticnoodles)
All Phase 2 tasks completed? : Yes
Section Phase 3 <br> Revision 1
Who's working on this? : Authors (@rogorido + @nabsiddiqui)
Expected completion date? : May 17
Section Phase 4 <br> Open Peer Review
Who's responsible? : Reviewers (TBC)
Expected timeframe? : ~60 days after request is accepted
Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.
Hello Igor @rogorido and Nabeel @nabsiddiqui, I hope you are doing well!
Just checking in with you about the draft revision (Phase 3 / Revision 1) as the deadline of the 17th of May has passed. If you need some extra time let me know approximately how much, so we can set up a new deadline -- and @anisa-hawes or @charlottejmc can update the Mermaid timeframe.
If you have doubts or need any clarification, please do not hesitate to keep in touch.
Hello @semanticnoodles,
I have tried to rework a lot of the tutorial. I feel that changing some of the headings will make the flow more obvious. Let me know if it makes sense the way I have done it or if there should be additional changes. Here are some of the items I reviewed based on your list. The rest I will leave to @rogorido unless he has an objection:
`readr`
`head(eudata)` could support your explanation about the observations occurring in the dataset – this is also considered good practice in data science.
`typecountry` column included in your dataset. I tested the walkthrough using the data contained in the `eu` column, just remember to send us the correct version of the dataset.
[X] Para 31, penultimate line: comma missing space afterwards.
[X] Para 33, please review this for clarity (here you should mention why you used log10 once for all or put it into another spot. Consider explaining why none of the methods is ideal)
This leads to an uninformative histogram. We can take `log10(dist)` as our variable or filter to exclude values above 5000kms. None of these methods is ideal, but as far as we know, we are operating with manipulated data making it less problematic
[X] Para 36, please review it for clarity (it reads implicitly why you employed ECDF).
[X] Para 41, same issue: you refer to ANOVA without explaining why you foresee that as a viable statistic test, cutting the paragraph short.
[X] Review the heading accordingly with the edits.
`ggplot2` does not use discrete colour scales at all.
Two quick comments on the form and style.
`ggplot2` that always comes lowercased, but you know it 😄)
`code format` or not, you choose. Consistency is the only requirement.
Thank you, @nabsiddiqui!
@semanticnoodles will review these revisions and advise if we are ready to move onwards to the next Phase of the workflow (which will be Phase 4 Open Peer Review). Giulia is away this week, returning on June 3rd.
In the meantime, @charlottejmc and I can help with ensuring that functions and arguments are typographically consistent. These are aspects we always check as part of typesetting at Phase 6, but we'll do a quick scan now so that this isn't a distraction for Reviewers.
Hello @nabsiddiqui and @semanticnoodles,
I've made some adjustments to add backticks to functions, arguments and other parts of code, trying to stay consistent with our house style.
Hello everybody, I am back! While I was away I got the chance to go through the tutorial and I can say you did upgrade the lesson quite a lot. Brilliant work @nabsiddiqui and @rogorido -- and many many thanks to @charlottejmc and @anisa-hawes for their support!
I will take another quick reading as I think I spotted another couple of small things to fix, but I believe now it is almost ready to move onwards to Phase 4. Sorry for the slight delay in my answer -- I will get back to you in a few hours.🖥
It took longer than expected (hours became days..). Nevertheless, if @rogorido and @nabsiddiqui can quickly fix the elements in the list below I believe we can move to the open peer review (Phase 4). The most urgent is the first element, the following are about simple formalities/typos.
`typecountry` column
Thank you for the patience!
@semanticnoodles (and @nabsiddiqui): I have already corrected all typos (I hope). And I have the correct dataset. But my question is: where exactly should I upload it?
Many thanks for your work!
Hello @rogorido, thank you for making these corrections.
You can replace the current `sistercities.csv` file in your lesson's associated assets folder, here.
If you prefer, however, you could send the file directly to me (publishing.assistant[@]programminghistorian.org) and I can upload it for you.
Thank you!
@charlottejmc Thanks for your answer. I have uploaded it to the assets folder. I hope everything is OK now... (commit: 387fdd9)
Hello Igor @rogorido and Nabeel @nabsiddiqui,
Your lesson has been moved to the next phase of our workflow which is Phase 4: Open Peer Review.
This phase is an opportunity for you to hear feedback from peers in the community.
Giulia @semanticnoodles will invite two reviewers to read your lesson/translation, test your code, and provide constructive feedback. In the spirit of openness, reviews will be posted as comments in this issue (unless you specifically request a closed review).
After both reviews, Giulia will summarise the suggestions to clarify your priorities in Phase 5: Revision 2.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 3 <br> Revision 1
Who worked on this? : Authors (@rogorido + @nabsiddiqui)
All Phase 3 tasks completed? : Yes
Section Phase 4 <br> Open Peer Review
Who's working on this? : Reviewers (@justinwigard + @regan008)
Expected completion date? : August 31
Section Phase 5 <br> Revision 2
Who's responsible? : Authors (@rogorido + @nabsiddiqui)
Expected timeframe? : ~30 days after editor's summary
Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.
@anisa-hawes Thanks. No problem with the comments being posted here.
During Phases 2 and 3, I provided initial feedback on this lesson, then worked with Igor @rogorido and Nabeel @nabsiddiqui, to complete a first round of revisions. In Phase 4 Open Peer Review, we invite feedback from others in our community.
Welcome to Justin Wigard @justinwigard and Amanda Regan @regan008 ! By participating in this peer review process, you are contributing to the creation of a useful and sustainable technical resource for the whole community. Thank you ✨ Please read the lesson, test the code, and post your review as a comment in this issue by August 31st.
Reviewer Guidelines:
A preview of the lesson:
Notes:
This is a statement of the Programming Historian's principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.
Programming Historian in English is dedicated to providing an open scholarly environment that offers community participants the freedom to thoroughly scrutinize ideas, to ask questions, make suggestions, or request clarification, but also provides a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion, or technical experience.
We do not tolerate harassment or ad hominem attacks of community participants in any form. Participants violating these rules may be expelled from the community at the discretion of the editorial board. If anyone witnesses or feels they have been the victim of the above described activity, please contact our ombudsperson Dr Ian Milligan. Thank you for helping us to create a safe space.
Wonderful! Thank you @semanticnoodles for organizing this.
Thank you also to @justinwigard and @regan008 for agreeing to serve as reviewers. I have enjoyed both of your research and scholarship and look forward to your comments. Please let me know if you need anything in the meantime.
@semanticnoodles I'm running late with this, but I promise to get to it this week!
Hi @regan008, thanks for keeping us posted!
Thanks, @regan008!
@semanticnoodles Thanks again for asking me to review this -- it was a pleasure to read and engage with. @rogorido and @nabsiddiqui, I can't wait to assign this lesson to my own students. It is an excellent overview of the ggplot package and the concepts related to the Grammar of Graphics. Congratulations!
I highly recommend publication. I do think the lesson loses the reader slightly between paragraphs 56-60. I would recommend taking another look at those to see if they can be massaged to be a bit more in line with the skill level for the rest of the lesson. And lastly, I think perhaps in the geom section, the authors should briefly discuss line charts. I'm not sure the data will make that easy, but I think many readers will be historians who will want to look at change over time in their charts.
Here are some more detailed line-level comments:
Please let me know if you have any questions or need anything else from me. I look forward to promoting this lesson once it is live!
Thank you, Amanda @regan008, for your insightful review and enthusiasm! I will hold off on posting my wrap-up until both the reviews are published, but this is excellent!
Overall, I think this tutorial is excellent. I’m likewise hoping to assign this tutorial in my own classes, and even learned a few new ways to think about my own approach to working with data!
I highly recommend publication. I have divided my review into three brief sections based on the Programming Historian review guidelines: Surface, Functional, and Code. I know my review has a lot of little items, but they’re primarily lightweight and surface, aimed at just tightening up an already streamlined tutorial. Take or leave as needed, I trust y’all’s judgement.
One further point I wanted to highlight: while working through the code, I noticed that my own visualizations differed slightly from those in the tutorial, whether that was due to a slightly updated dataset or due to something on my side. I flagged those in the Functional section and the Code section primarily, just to get a second set of eyes on them – functionally, everything works, it’s just a few odd visual discrepancies! The attached Appendix files demonstrate what visualizations are happening on my side.
What a great submission. Please let me know if I can help further, and so looking forward to sharing it when things are completed!
`eudata` showing what the tibble should look like? Might not be needed, but I figured I’d check!
Thank you so much Justin @justinwigard for such a detailed review, it is fantastic. I cannot wait to get started wrapping up the brilliant points you and @regan008 raised!
Thank you @justinwigard and @regan008. @rogorido and I will begin working on this soon.
@justinwigard and @regan008 Thank you very much for the detailed corrections!
Hi @rogorido and @nabsiddiqui, here is my review/feedback summary (it took a while); thanks a million @regan008 and @justinwigard for all the food for thought and complementary feedback you provided! Both of you highly recommend the lesson for publication 🎉🎉🎉: @regan008 appreciates particularly the explanations about the tibbles and the Grammar of Graphics; on the other hand, @justinwigard appreciates the engaging tone of the lesson and the way it explains the potential of ggplot2.
Here is a quick recap of the core elements you highlight, which I recommend @rogorido & @nabsiddiqui go through carefully.
@regan008 makes some detailed comments about typos, potential clarifications (e.g., on plotting packages, coordinate systems), and a suggestion to link out where ECDF is mentioned, clarifying the contents of para 56-60. She also notes that while maps are mentioned, the lesson does not cover them explicitly (might be a chance to link to Using Geospatial Data to Inform Historical Research in R).
There may be an opportunity to use additional line charts, as @regan008 suggests, but this would require further transformations or brand new additions, e.g. using long/lat or population size between sister cities. The structure of the lesson works and I would like you to prioritise the refinements she suggests rather than adding brand new extensions. She makes a good point, but please only add additional data filtering/visualisation if you have time to devote to the task.
@justinwigard highlights a number of areas where the lesson is already strong, as well as offering thoughtful suggestions for improvement under the four sections he articulated. The minor typographical and grammatical suggestions, along with the consistency of the sister cities spelling and the geoms, certainly require your attention.
On a functional level, he notes that some additional context could be helpful for readers unfamiliar with the tidyverse or Wikidata. He noted that providing counter-examples alongside some of the figures, like Figure 6, could help readers compare different cases, as well as adding more references on the choice of binwidth size (very often a rule of thumb, in my experience). He additionally suggests listing the tidyverse packages explicitly, and including a link to Wikidata, making more evident the line about the dataset download. He also suggests incorporating a screenshot to show how the tibble should appear after loading (I believe I previously suggested you consider something similar, like running `head(eudata)`; it might be really worth getting a screenshot). Many more technical insights from his side follow, and I suggest you have a look at them carefully.
Again, as I noted in Amanda's feedback, please focus on refinement/consolidation first, and then consider expanding your lesson further.
Here are a few extra comments from my side, mostly technically oriented. Following @justinwigard's notes I ran all the code to see if I could provide some extra technical feedback (using R version 4.3.0 [2023-04-21] on my RStudio version Cranberry Hibiscus, 2024.9.0.375).
`13081 x 15`, with the following colnames (I believe the index X could be removed from the dataset).
> colnames(eudata)
[1] "X"
[2] "origincityLabel"
[3] "origincountry"
[4] "originlat"
[5] "originlong"
[6] "originpopulation"
[7] "sistercityLabel"
[8] "destinationlat"
[9] "destinationlong"
[10] "destinationpopulation"
[11] "destination_countryLabel"
[12] "dist"
[13] "eu"
[14] "samecountry"
[15] "typecountry"
`.md`.
`eudata.filtered` (eudata missing filtered).
`Warning: Removed 956 rows containing missing values (geom_point())` and the same happens with the codeblock in para 73. My plots look just like the ones from @justinwigard
A huge thank you for all your patience and hard work!🌟
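For the two technical points above (the stray `X` index column and the missing-values warning), a minimal sketch might look like the following. It assumes the warning is caused by `NA` values in the plotted columns, which may not be the only cause in the lesson. The `toy` data frame is invented and only borrows two column names from the listing above.

```r
library(ggplot2)

# Invented miniature of the dataset: an index column `X` and some NA populations
toy <- data.frame(
  X = 1:5,
  originpopulation = c(1000, NA, 50000, 200000, NA),
  destinationpopulation = c(2000, 3000, NA, 150000, 40000)
)

# Drop the index column (an `X` column often appears when row names were
# written to the CSV and then read back in)
toy$X <- NULL

# Count the rows that a scatter plot of these two columns would drop
plot_cols <- c("originpopulation", "destinationpopulation")
n_incomplete <- sum(!complete.cases(toy[, plot_cols]))

# Option A: filter explicitly, so nothing is removed silently
toy_complete <- toy[complete.cases(toy[, plot_cols]), ]
p_a <- ggplot(toy_complete, aes(x = originpopulation, y = destinationpopulation)) +
  geom_point()

# Option B: keep the data as-is and tell geom_point() to drop NAs quietly
p_b <- ggplot(toy, aes(x = originpopulation, y = destinationpopulation)) +
  geom_point(na.rm = TRUE)
```

Either option makes the row removal a deliberate choice rather than a surprise for the reader; the exact wording of the warning varies between `ggplot2` versions.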
Hello Igor @rogorido and Nabeel @nabsiddiqui,
Your lesson has been moved to the next phase of our workflow which is Phase 5: Revision 2.
This phase is an opportunity for you to revise your draft in response to the peer reviewers' feedback.
Giulia @semanticnoodles has summarised their suggestions, but feel free to ask questions if you are unsure.
Please make revisions via direct commits to your file: /en/drafts/originals/visualizing-data-with-r-and-ggplot2.md. @charlottejmc and I are here to help if you encounter any difficulties.
When you and Giulia are all happy with the revised draft, the Managing Editor @hawc2 will read it through and provide additional feedback/suggestions as necessary before we move forward to Phase 6: Sustainability + Accessibility.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 4 <br> Open Peer Review
Who worked on this? : Reviewers (@justinwigard + @regan008)
All Phase 4 tasks completed? : Yes
Section Phase 5 <br> Revision 2
Who's working on this? : Authors (@rogorido + @nabsiddiqui)
Expected completion date? : October 24
Section Phase 6 <br> Sustainability + Accessibility
Who's responsible? : Publishing Team
Expected timeframe? : 7~21 days
Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.
@semanticnoodles thanks for your review/feedback. We will make all corrections in the next few days.
@regan008 and @justinwigard: Many thanks again for your comments and corrections. I have added many of them (cdfc89f413a9049cdd7a2120b410e9189238d1a8) and @nabsiddiqui and I should think about two or three changes you are proposing which may have more profound consequences for the tutorial.
In any case, just some comments:
@justinwigard:
`sample_frac()` which takes a random sample out of the data. We should add a warning for the reader...
@regan008:
`plotly` was created mainly for Python and nowadays has extensions for R, Julia, etc. As far as I know, it is not used very much in R in comparison to 'native' solutions like ggplot (see here the number of stars on GitHub, for instance).
`dygraphs` is also rather an interface to the dygraphs JavaScript library and nothing 'R-native';
In any case, we will still work on some of your comments (@semanticnoodles). Many thanks again.
Thank you for your work so far, Igor @rogorido and Nabeel @nabsiddiqui ✨
Please let Giulia @semanticnoodles know when you feel you've completed the revisions. She will read through the draft again to confirm that she's satisfied with the suggestions integrated.
@anisa-hawes yes will do it!
Hello @rogorido, @anisa-hawes, and @semanticnoodles,
Igor and I have added our edits, and I believe that we are all set to move to the next stage now.
Thank you, @nabsiddiqui and @rogorido.
Giulia @semanticnoodles will read through your revisions later this week, and advise if she feels any further adjustments are needed.
After that, Alex will read it through and share additional feedback/suggestions as necessary.
When both Giulia and Alex are happy, we will move forward to Phase 6: Sustainability + Accessibility which will begin with copyediting 🙂
@anisa-hawes OK, many thanks!
Hello @rogorido & @nabsiddiqui,
I apologise for the delay in posting this feedback. I have been going through the whole lesson again with @justinwigard's and @regan008's comments at hand. I think you have done a wonderful job of polishing the lesson, we are almost ready for Phase 6! 🎉
Please review the following points and we will be ready to move on - looking forward to seeing this brilliant lesson of yours available to the PH audience!
[ ] Missing rows warning: so that readers do not freak out when they encounter `Warning: Removed xyz rows containing missing values (geom_point())`, can you spend a line or so just saying they do not have to worry?
[x] Title: I understand that it might not be easy, but as I mentioned in my previous comment, I would like you to think if the title could be improved, to be more informative for the PH audience – you are doing much more than teaching how to plot graphs here! Something like Exploring and Visualizing Data in R with ggplot2 might make the difference already, but you can consider referring to the grammar of graphics or anything that (and massive thanks @anisa-hawes for the brainstorming session on this):
[x] ¶ 25: link missing a `[` to be rendered
[x] ¶ 28-30: following @justinwigard's observation, to make the dataset download less skippable can you put at the end of paragraph 28:
You can download the dataset at [this link](https://github.com/programminghistorian/ph-submissions/tree/gh-pages/assets/visualizing-data-with-r-and-ggplot2/sistercities.csv).
and then in paragraph 30 change the phrasing to:
Let’s go ahead and place the dataset in our project’s current working directory.
[x] ¶ 38: The paragraph seems a little messed up (trimmed?). Please check it.
[x] ¶ 39 (@regan008 ’s): I think the figure caption for Fig 1 is mixed up. This chart appears to show the count of locations not the total percentage.
[x] ¶ 44: instead of “tutorial” can you please use its full name (Data Wrangling and Management in R)?
[x] ¶ 49 (@justinwigard ’s): I think there’s a sentence that was unfinished, potentially? “…the column for different bars, and We also added”:
We passed a new parameter to the `ggplot()` command named `fill`, indicating the column for the bars. We also added…
Here I believe you meant something like “We mapped the `origincountry` column to the `fill` aesthetic in the `ggplot()` command, which defines the color range of the bars. We also added…”
[x] ¶ 64: in the code chunk it’s `eudata.filtered` (eudata missing filtered).
[x] ¶ 117: The Wallstreet Journal -> The Wall Street Journal
@semanticnoodles Thanks a lot for your comments. We will work on your corrections and I hope we will be ready in 2-3 days.
Hello @semanticnoodles. @rogorido and I have finished our edits. I have set a seed in the R code to allow for reproducibility. I have also updated the images to reflect the sample data the user will get due to the seed.
For the title, we were thinking perhaps "From Historical Data to Visual Analytics: The Grammar of Graphics in Practice"? I don't know what would be needed to change the title since the folders are based on the title. I am sure @anisa-hawes can help. Look forward to moving this ahead.
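To illustrate the seed mentioned above: fixing the random seed before `sample_frac()` makes the sampled rows repeatable from one run to the next. The sketch below uses an invented `toy` data frame and an arbitrary seed value and fraction; it is not the code used in the lesson.

```r
library(dplyr)

# Invented stand-in for the lesson's data
toy <- data.frame(id = 1:100)

# Same seed before each call, so both draws select the same rows
set.seed(123)  # arbitrary value, chosen for illustration
s1 <- sample_frac(toy, 0.1)

set.seed(123)
s2 <- sample_frac(toy, 0.1)

# Without resetting the seed, a third draw would generally differ
same_sample <- identical(s1, s2)
```

Readers who change or omit the seed will still get valid plots, just not ones that match the published figures exactly.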
Thank you, @nabsiddiqui. Yes, of course we can help with the practicalities of adjustments to any file and directory names.
However, I think what Giulia @semanticnoodles is aiming towards is finding a title that is more specific. Fundamentally, we want to help readers find lessons that meet their learning goals. A clear title facilitates discovery through search, and offers a quick, basic sense of what can be learned.
Reviewing our lesson directory, I think the most successful titles generally comprise:
The current title is: Visualizing Data with R and ggplot2
Giulia has suggested the subtle adjustment: Exploring and Visualizing Data in R with ggplot2
I was wondering whether your title could clarify what kind of data readers are handling with these methods? The concept of Sister Cities is mentioned but what are you describing in general: demographic data? geographical/spatial data? ('mixed' data? - is the fact that you are selecting methods to visualise a range of different data types the key? 🤔)
My sense is that an effective lesson title is usually simple and succinct. So, I think I'd suggest avoiding the colon and compound structure (more often encountered for an expanded research article title) and focus on providing straightforward keys to the lesson.
@anisa-hawes After talking with @nabsiddiqui, I think we will stick with the title proposed by Giulia.
To @anisa-hawes' point, it would be nice to clarify what type of data this lesson teaches how to visualize - would it be fair to label it "Demographic Data"?
I think it is more mixed data since some of it is about the cities themselves and some of it is about the demographics of the city.
I like "Exploring and Visualizing Mixed Data in R with ggplot2".
@rogorido is this ok with you?
Programming Historian in English has received a proposal for a lesson, 'Visualizing data with R and ggplot2,' by @rogorido and @nabsiddiqui.
I have circulated this proposal for feedback within the English team. We have considered this proposal for:
We are pleased to have invited @rogorido and @nabsiddiqui to develop this Proposal into a Submission under the guidance of @semanticnoodles as editor.
The Submission package should include:
We ask @rogorido and @nabsiddiqui to share their Submission package with our Publishing team by email, copying in @semanticnoodles.
We've agreed a submission date of April. We ask @rogorido and @nabsiddiqui to contact us if they need to revise this deadline.
When the Submission package is received, our Publishing team will process the new lesson materials, and prepare a Preview of the initial draft. They will post a comment in this Issue to provide the locations of all key files, as well as a link to the Preview where contributors can read the lesson as the draft progresses.
If we have not received the Submission package by April, @semanticnoodles will attempt to contact @rogorido and @nabsiddiqui. If we do not receive any update, this Issue will be closed.
Our dedicated Ombudspersons are Ian Milligan (English), Silvia Gutiérrez De la Torre (español), Hélène Huet (français), and Luis Ferla (português). Please feel free to contact them at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudspersons will have no impact on the outcome of any peer review.