chainsawriot / rstyle

The evolution of R programming styles.

within package variation #21

Closed yenchiayi closed 4 years ago

yenchiayi commented 4 years ago

The preliminary result on the latest submission, after normalizing entropy into [0, 1]

image
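For readers who want to reproduce the number behind the figure, here is a minimal sketch of one way to normalize entropy into [0, 1]. It assumes the score is the Shannon entropy of a package's naming-style labels divided by the log of the number of possible styles; the thread does not state the exact normalization, so the function name `normalized_entropy` and the `n_categories` parameter are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(labels, n_categories=None):
    """Shannon entropy of `labels`, scaled into [0, 1].

    Assumption: normalization divides by log(k), where k is the number of
    possible categories (defaults to the number of observed categories).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")
    k = n_categories if n_categories is not None else len(counts)
    if k < len(counts):
        raise ValueError("n_categories is smaller than the observed categories")
    if k < 2:
        # A single possible category carries no variation.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(k)


# Hypothetical style labels for the functions of one package.
styles = ["lower_snake", "lower_snake", "lowerCamel", "dotted.func"]
score = normalized_entropy(styles, n_categories=6)
```

A score of 0 means every function in the package uses the same style; a score of 1 means the functions are spread evenly over all possible styles.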

yenchiayi commented 4 years ago

Visualize the naming variation of the 20 most popular packages

image
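As a rough illustration of what such a chart tallies, the sketch below labels function names with a naming style and computes each style's share within a package. The regular expressions and the helper names `classify` and `style_shares` are assumptions made for this example, not the rules used in the rstyle code.

```python
import re
from collections import Counter

# Assumed patterns, checked in order; anything unmatched is labeled "other".
STYLE_PATTERNS = [
    ("alllower", re.compile(r"^[a-z][a-z0-9]*$")),
    ("ALLUPPER", re.compile(r"^[A-Z][A-Z0-9]*$")),
    ("lower_snake", re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$")),
    ("dotted.func", re.compile(r"^[a-z][a-z0-9]*(\.[a-z0-9]+)+$")),
    ("lowerCamel", re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$")),
    ("UpperCamel", re.compile(r"^([A-Z][a-z0-9]*)+$")),
]


def classify(name):
    """Return the first style whose pattern matches `name`."""
    for style, pattern in STYLE_PATTERNS:
        if pattern.match(name):
            return style
    return "other"


def style_shares(names):
    """Map each observed style to its share of the names in one package."""
    if not names:
        return {}
    counts = Counter(classify(n) for n in names)
    return {style: count / len(names) for style, count in counts.items()}


# Hypothetical function names from a single package.
shares = style_shares(["read_csv", "write_csv", "readCsv", "read.table"])
```

Plotting these shares as one stacked bar per package gives a within-package variation chart of the kind shown above.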

chainsawriot commented 4 years ago

Visualize the naming variation of the 20 most popular packages

* source: https://www.r-pkg.org/downloaded

* It is clear that lower_snake is the dominant naming style, far ahead of lowerCamel, which used to be the most popular naming style.

image

* [path](https://github.com/chainsawriot/rstyle/blob/entro/visualization_fun/within_pkg_naming_variation.png): visualization_fun/within_pkg_naming_variation.png

@exilespacer Thanks so much for this. But the problem with the "popular packages" from r-pkg is that nearly all of them are from RStudio, and especially from one particular team at RStudio. (rsconnect is also an RStudio product, but it is from the Shiny team, hence lowerCamel is used.)

Could we have another chart as a supplement to this, that is, using the top 20 packages with the highest PageRank instead?
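To make the selection concrete, here is a minimal sketch of ranking packages by PageRank with plain power iteration. It assumes a directed graph with an edge from each package to every package it depends on, and a damping factor of 0.85; the thread does not specify how the graph was built, and the package names in the example are made up.

```python
def pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


def top_packages(edges, k=20):
    """Names of the k packages with the highest PageRank."""
    rank = pagerank(edges)
    return sorted(rank, key=rank.get, reverse=True)[:k]


# Hypothetical dependency edges: (package, package it depends on).
deps = [("pkgA", "core"), ("pkgB", "core"), ("pkgC", "pkgA")]
ranks = pagerank(deps)
top = top_packages(deps, k=1)
```

Under this edge direction, packages that many others depend on rise to the top, which is a different notion of importance from raw download counts.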

yenchiayi commented 4 years ago

@chainsawriot Here is a figure of within-package naming variation for the packages with the highest PageRank.

image

yenchiayi commented 4 years ago

OK, the figure below does not seem to support my hypothesis.

I think it would be interesting to look at the relation between the age of a package and its most popular naming convention. I have a feeling that it will be positively related to the order (dotted.func -> lowerCamel -> lower_snake).

image