[x] C2-1: The article "A Computational Analysis of the Dynamics of R Style Based on 94 Million Lines of Code from All CRAN Packages in the Past 20 Years" provides a useful contribution to the development of a consensus-based style for writing R code. The findings are an interesting read for both beginners and seasoned R developers, but the communication needs to be improved and in part the method must be explained in more detail to be fully understandable. A major change that I see as needed is the organisation of the provided material (via GitHub) in a more accessible form. I suggest using an R package for that. The approach in itself and the used technology are sound. The manuscript seems to build upon existing work and provide new insights.
[draft] key update for organization
(1) file-name prefixes for the different analyses, e.g. comm, fun, syn_
(2) global variables in config.R
(3) path management with the here package
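A minimal sketch of how points (2) and (3) could fit together, assuming a config.R that every analysis script sources. The helper project_path(), the variable names, and the file names are hypothetical and not taken from the repository; only here::here() is a real function of the here package.

```r
# config.R (hypothetical sketch): settings shared by all analysis scripts.
# Every name and path below is illustrative, not taken from the repository.

# Resolve paths from the project root with here::here() when the package is
# available; otherwise fall back to the current working directory.
project_path <- function(...) {
  if (requireNamespace("here", quietly = TRUE)) {
    here::here(...)
  } else {
    file.path(getwd(), ...)
  }
}

DATA_DIR <- project_path("data")

# One file-name prefix per analysis keeps related outputs together.
fun_output  <- file.path(DATA_DIR, "fun_features.RDS")
comm_output <- file.path(DATA_DIR, "comm_membership.RDS")
```

An analysis script would then start with source("config.R") and refer to fun_output or comm_output instead of hard-coded relative paths.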
[x] C2-2: I was able to execute most of the code provided in yen.R by cloning the linked code repository and running rjournal_submission/figures.R - the latter file matches the submitted script, but the relative file imports can be resolved then. The figures 1 to 5 were successfully recreated, but the clustering took too long for me to wait for.
[draft]
(1) we have fixed the path problem in figures.R
(2) we now provide an estimated running time for the community clustering
[x] C2-3: First sentence in the abstract is a bit colloquial, suggest to reword to something like "the flexibility of R and the diversity of the R community leads to a large number of programming styles applied in R packages and scripts".
Thank you so much for the suggestion. We have followed the reviewer's suggestion and amended the first sentence in the abstract accordingly.
[x] C2-4: The abstract lacks one sentence about the results: what are the key aspects of your consensus-based style?
We have added one sentence to give a glimpse of the consensus-based style.
[ ] C2-5: Final sentence of Introduction: "more effective efforts" - do you refer to standardization efforts? Suggest to repeat the term fully. More importantly: what are the benefits of standardization, and why does it need to be more effective? How do you measure the effectiveness? What are the risks you see in the diversity of the particular style guides (beyond the generic arguments on reusability etc.)?
[draft] highlight that we don't have a strong opinion on it.
benefits of standardization: easier collaboration, fewer misunderstandings, better readability
why it needs to be more effective: we observe that key influencers, such as RStudio, shape programming styles heavily. However, even within RStudio teams, styles vary widely, for example between the shiny team and the tidyverse team. These key influencers should pay more attention to their large impact on the whole R community. If they are willing to do so, we believe consensus-based style guides can converge faster and more effectively.
how to measure effectiveness: do we need to measure that? It's just a concept?
risks in the diversity: collaboration among different communities requires more effort and time to understand each other's code
[x] C2-6: Use active language where possible, e.g., "In our local mirror, it contains" >> "Our local mirror contains ..."
[x] C2-7: Introduction example: suggest to introduce even more variation than simply the different assignment operators, for example use "sumOfSquare(number = x)" and "sum_of_square(number=x)", and do not use "return()" in one of the examples. (a50e16e)
We have added more variations to the two examples.
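For illustration only, a pair along the lines the reviewer suggested might look like the sketch below. It is not the listing from the manuscript; the function bodies are assumptions, and only the two call forms come from the reviewer's comment.

```r
# Style A: camelCase name, "=" assignment, explicit return(),
# four-space indentation, spaces around "=" in the call.
sumOfSquare = function(number) {
    result = sum(number ^ 2)
    return(result)
}

# Style B: snake_case name, "<-" assignment, implicit return,
# two-space indentation, no spaces around "=" in the call.
sum_of_square <- function(number) {
  sum(number^2)
}

x <- c(1, 2, 3)
sumOfSquare(number = x)
sum_of_square(number=x)
```

Both calls return the same value; the two definitions differ only in style.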
[x] C2-8: "There are still some other" > use this sentence to make clear how your work goes beyond Baath 2012, e.g. "Beyond naming conventions investigated by Baath, there are style elements ..."
[x] C2-9: Same sentence: use only "e.g." or "etc.", but not both. This occurs repeatedly in the text, please double check.
[x] C2-10: Since your work is strongly connected to R programming style guides, I would expect a current but complete list of style guides. Baath 2012 mentions one further one; please explain why you did not include that one, or how you made your selection. (b748bf7be41f4db47db2c2cd4fdf18b800939f14 / 5047a46)
In footnote X, we have included all the style guides we found on the internet. The reason for us to focus only on the 3 is also included in footnote X: these 3 are arguably the most influential ones.
[x] C2-11: Reference to the baaugwo package is missing, presumably https://github.com/chainsawriot/baaugwo (strongly suggest to make this package properly citable, e.g., via Zenodo)
[x] C2-12: In Table 1: can you explain how you created the list of "Features" you compare? Is this a complete list of the recommendations in the style guides, or is this the elements you picked for your study? I suggest to use formatting to help identify where there are actual differences between the guides, they seem to agree in at least four features. Please don't let the table stand on its own, but explain the main pieces of information in the text.
[x] C2-13: "In January 2019, we cloned a local mirror of CRAN": please provide the exact date. (b748bf7be41f4db47db2c2cd4fdf18b800939f14) Would it be possible to recreate your dataset using MRAN? That would make your work much more reproducible, though it is not clear to me if that works with the historic analysis you do. Please discuss limitations such as these in your discussion section.
[x] C2-14: "packages delisted" > please double-check CRAN terms/policy, maybe you mean "orphaned packages", or are orphaned just a part of the "delisted" ones? (b748bf7be41f4db47db2c2cd4fdf18b800939f14)
We have followed the reviewer's advice to use the terms as stated in the CRAN Repository Policy. We have removed the term "delisted" and now use the official adjectives "orphaned/archived".
[x] C2-15: Please clarify why you need to use a "one submission per year" rule, when you took a snapshot of CRAN at a specific point in time.
[draft] Since we conduct a temporal analysis of how programming styles change, we analyze the functions and code from only one submission per package per year, which keeps the basis of comparison consistent across packages. We choose each package's latest submission in each year as the representative sample.
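The selection rule can be sketched in base R as follows. The submission log is made-up toy data and the column names are assumptions; the actual pipeline may implement the rule differently.

```r
# Hypothetical submission log; package names and dates are made up.
submissions <- data.frame(
  package = c("pkgA", "pkgA", "pkgA", "pkgB", "pkgB"),
  date = as.Date(c("2017-03-01", "2017-11-20", "2018-05-05",
                   "2018-01-10", "2018-09-30")),
  stringsAsFactors = FALSE
)
submissions$year <- as.integer(format(submissions$date, "%Y"))

# Sort so that the latest submission comes first within each package-year,
# then keep only the first row of every package-year pair.
sorted <- submissions[order(submissions$package, submissions$year,
                            -as.numeric(submissions$date)), ]
latest <- sorted[!duplicated(sorted[, c("package", "year")]), ]
```

In this toy log, pkgA keeps its November submission for 2017 and its only submission for 2018, while pkgB keeps its September submission for 2018.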
[x] C2-16: The examples for the considered style-elements are really helpful, but also quite extensive. Consider reducing the number of examples by having more than one element in one piece of code (just an idea), and in any case consider adding highlighting to the source code to assist the reader.
[x] C2-17: Not all style elements that are investigated are listed in Table 1, e.g., fx_integer. You should use one consistent set of criteria - this also makes clear if you contributed new criteria in your work that are not covered by the style guides! Make those original contributions very clear.
[x] C2-18: "By not considering" > suggest to rephrase, e.g. "If not considering line-length, the remaining 10 binary ... leave 7169 possible combinations of PSVs that a programmer could employ."
[x] C2-19: Is it possible with your approach to study consistency of criteria within packages? How many of the packages are using the same style in all their functions?
[YEN] see the response of C1-2
[x] C2-20: Please explain why do you generate communities on your own, and do not use other starting points, e.g., the CRAN Task Views.
[YEN] Task Views only cover a limited number of packages
[x] C2-21: In the README you state that the clustering cannot be replicated because of a missing random seed. Why can you not set a seed now, and run the clustering again? The overall results should not change dramatically.
[YEN] see issue #25
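As a generic illustration of the reviewer's point about seeds, the sketch below shows how fixing the seed makes a stochastic clustering repeatable. The clustering algorithm used in the paper is not stated in these notes, so base R's kmeans() serves purely as a stand-in, and the data are random toy points.

```r
# Toy data: 30 random points in two dimensions (illustrative only).
set.seed(1)
toy_points <- matrix(rnorm(60), ncol = 2)

# kmeans() picks random starting centres, so it stands in here for any
# clustering routine that depends on the random number generator.
run_clustering <- function(data, seed) {
  set.seed(seed)  # fix the random number generator state before clustering
  kmeans(data, centers = 3)$cluster
}

first  <- run_clustering(toy_points, seed = 42)
second <- run_clustering(toy_points, seed = 42)
identical(first, second)  # TRUE: same seed, same cluster assignment
```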
[x] C2-22: Why do you use the 18 largest, not 15 or 20? (b748bf7be41f4db47db2c2cd4fdf18b800939f14)
[YEN] just a convenient choice. Not much difference
[x] C2-23: How many communities did your workflow identify besides the chosen 18?
[YEN] In total, 931 communities were identified. (We now use the largest 20 communities for the analysis.)
[x] C2-24: What percentage of CRAN packages are covered by the 18 communities?
[YEN] The 20 largest identified communities cover 88% of the 14491 packages in total. (We now use the largest 20 communities for the analysis.)
[x] C2-25: Please mention how long the analysis runs on your hardware, and how much storage the CRAN mirror roughly takes, so curious users are informed. I stopped my rsync process when I read that full size of CRAN is approx 240 GB in the CRAN Mirror HOWTO. This would also help to understand why you have not repeated the workflow to include 2019 before submitting your work.
Questions on analysis decisions
Questions, and potential aspects for discussion/future work.
[x] C2-26: Your dataset is code from R packages. Have you considered adding R scripts into the dataset? Would you expect different styles for scripts or packages? It might not be easy to find those scripts (could search data repositories like Zenodo or Figshare) though.
[x] C2-27: I'm interested to hear your thoughts about extending your analysis to include automatic code formatting: I have no idea how to measure this, but wouldn't it be interesting to know how many packages use, e.g., the styler package? Maybe it's a thought for future work, or maybe it is something you can quickly add.
[x] C2-28: Did you consider adding orphaned/archived packages to the dataset? (b748bf7be41f4db47db2c2cd4fdf18b800939f14)
Results
[x] C2-29: Considering the computational and storage efforts of your workflow, please properly deposit your derived datasets in a data repository, e.g., Zenodo. It would be a shame if this interesting dataset just disappeared.
[x] C2-30: Figure 1: Please clarify (in the figure caption) what each ratio is; for a ratio, the reader needs to be aware of both values put in relation to each other, and by this point of reading, I am not familiar enough with your method to just know that. Is "fx_tab_ratio" = "number of functions using tabs / number of functions using spaces", for example?
[x] C2-31: "[..] relatively more impressive.", "dethroned", "To our surprise" - use more neutral language, e.g., "stronger". It is good to point to unexpected results, but please explain: why are you surprised? How many functions do all Bioconductor packages have compared to all non-Bioconductor packages?
[x] C2-32: Figure 3: suggest to rename the labels to "Comment line" and "Code line"; the GIF is a good idea, but it is confusing that the colors between Figure 3 and the GIF at https://github.com/chainsawriot/rstyle/blob/master/file60f331694ef5.gif are switched; also use the same legend on both plots, please
[ ] C2-33: With your workflow, could you easily distinguish between line length of function definition vs. line length of function body?
[x] C2-34: As a member of the R-Sig-Geo community, I must object to the name "GPS and Geography"; "Spatial" is much better, though I understand if you want to avoid reusing names of Task Views, so maybe "Geo"/"Geospatial" also works
[YEN] TBA, check issue #26
[ ] C2-35: There are further interesting patterns in Figure 4, e.g. in "Graphics" or "Java". It would be great if you could share your insights on those in the text, too.
[YEN] TBA, related to C1-6 "others" naming in the early days
What happened to "others" in the rJava community? No idea yet.
[x] C2-36: Figure 5: the different y-scales in the rows can be quickly overlooked, strongly suggest to point the reader to that fact in the figure's caption, or consider using the same range, as in Figure 1.
[YEN] we have updated the y-scale in fig. 5 accordingly.
Discussion
[x] C2-37: "the latest snapshop" > use the time of the actual snapshot
[x] C2-38: "intruduction of a new language.." sentence misses "feature"?
[x] C2-39: When discussing the "=" assignment operator vs. style guides, what is your take on the timeline, because there are 16 years between the one and the other. The style guides came after the feature.
[x] C2-41: "very resistant" - suggest to use more neutral language, not implying intent that you cannot know (in fact, a user survey would be an interesting complement to your work); as you later explain, this approach could also be very "careful" or "wise"
[x] C2-42: What fraction of all functions adhere to your "Zeitgeist style"? Can you capture 80% of all functions or more?
[x] C2-43: "make code not accessible" > "inaccessible"
References
[x] C2-44: Reference "The state of naming conventions in r." should probably have a capital letter "R", as well as "Java" in "open source java programmers" > please double-check references.
[x] C2-45: I cannot access Wang and Hahn, 2017 - please check if there is an open version of the article and link to it.
Package:
[x] C2-46: As said above, I think the work would be much more accessible and more sustainably useful if the workflow was structured and packaged as an actual R package. The last bits of the workflow that are not R-based, e.g., the Makefile, could be easily scripted in R, too.
Code in yen.R
[ ] C2-47: Add a download of the dataset using git2r package, to make your script immediately executable.
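One possible way to address this is sketched below, assuming the data needed by the script ship with the GitHub repository. The function name fetch_rstyle() and the default directory name are hypothetical; git2r::clone() is the package's function for cloning a repository.

```r
# Sketch: fetch the repository before running the figure script.
# The function name and the local directory name are hypothetical choices.
fetch_rstyle <- function(local_path = "rstyle") {
  # Skip the download when a local copy already exists.
  if (!dir.exists(local_path)) {
    if (!requireNamespace("git2r", quietly = TRUE)) {
      stop("Please install the 'git2r' package first.")
    }
    git2r::clone("https://github.com/chainsawriot/rstyle", local_path)
  }
  invisible(local_path)
}

# fetch_rstyle()  # uncomment to download; requires network access
```

Calling fetch_rstyle() at the top of the script would make it runnable from a fresh session, at the cost of a network download on first use.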
[x] C2-48: The code in the GitHub project should be organised in the form of an R package; the large number of files at the root level makes it hard to inspect and to see which files are relevant for the article at hand.
[x] C2-49: The file "tab1.csv" can not be found using working directory at rstyle/rjournal_submission, and neither in the GitHub repository.
[x] C2-50: The creation of the table/clustering took too long for me to complete.
[ ] C2-40: "very strong path dependency" - unclear meaning, suggest to rephrase