ropensci-review-tools / pkgstats

Historical statistics of every R package ever
https://docs.ropensci.org/pkgstats/

[Question]: I cannot find statistics outlined in README #54

Closed pawelru closed 9 months ago

pawelru commented 9 months ago

The README has a section, "Statistics on individual objects (including functions)", describing statistics that I cannot find in the output. Can you please tell me where I should look?

https://github.com/ropensci-review-tools/pkgstats/blob/10a12453a68e918d83d91c8e34ffbf6b924ab71a/README.md?plain=1#L224-L249

This is the structure of the output I have:

r$> lapply(x, names)
$loc
 [1] "language"    "dir"         "nfiles"      "nlines"      "ncode"       "ndoc"        "nempty"      "nspaces"     "nchars"      "nexpr"       "ntabs"       "indentation"

$vignettes
[1] "vignettes" "demos"    

$data_stats
[1] "n"           "total_size"  "median_size"

$desc
 [1] "package"    "version"    "date"       "license"    "urls"       "bugs"       "aut"        "ctb"        "fnd"        "rev"        "ths"        "trl"        "depends"    "imports"    "suggests"   "enhances"   "linking_to"

$translations
NULL

$objects
 [1] "file_name"       "fn_name"         "kind"            "language"        "loc"             "npars"           "has_dots"        "exported"        "param_nchars_md" "param_nchars_mn" "num_doclines"   

$network
[1] "file"             "line1"            "from"             "to"               "language"         "cluster_dir"      "centrality_dir"   "cluster_undir"    "centrality_undir"

$external_calls
[1] "tags_line" "call"      "tag"       "file"      "kind"      "start"     "end"       "package"  

I also checked the main vignette and found nothing about these statistics, so maybe the README describes more than the package provides?

r$> packageVersion("pkgstats")
[1] ‘0.1.3.9’

mpadge commented 9 months ago

That whole section of the README is under the main heading "overview-of-statistics-and-the-pkgstats_summary-function", which starts with the line

s <- pkgstats_summary (p)

All of those statistics are produced from that function. Here's an example showing a bit more detail of what the main pkgstats function returns, and how it directly yields all of the statistics described in the README via the pkgstats_summary function:

path <- file.path (tempdir (), "geodist")
path <- gert::git_clone("https://github.com/hypertidy/geodist", path = path)
p <- pkgstats::pkgstats(path)

# Statistics on individual objects are in 'objects':
head (p$objects) # all 'objects'
#>         file_name            fn_name     kind language loc npars has_dots
#> 1 R/geodist-min.R        geodist_min function        R  19     4    FALSE
#> 2 R/geodist-vec.R        geodist_vec function        R  31     9    FALSE
#> 3 R/geodist-vec.R   check_vec_inputs function        R  11     3    FALSE
#> 4 R/geodist-vec.R geodist_paired_vec function        R  14     5    FALSE
#> 5 R/geodist-vec.R    geodist_seq_vec function        R  19     4    FALSE
#> 6 R/geodist-vec.R      geodist_x_vec function        R  14     3    FALSE
#>   exported param_nchars_md param_nchars_mn num_doclines
#> 1     TRUE             100             104           43
#> 2     TRUE              97              90           64
#> 3    FALSE              NA              NA           NA
#> 4    FALSE              NA              NA           NA
#> 5    FALSE              NA              NA           NA
#> 6    FALSE              NA              NA           NA

# Plus inter-relationships between them:
head (p$network)
#>        file line1              from               to language cluster_dir
#> 1 R/utils.R    16 geodist_benchmark chk_is_num_len_1        R           1
#> 2 R/utils.R    17 geodist_benchmark chk_is_num_len_1        R           1
#> 3 R/utils.R    18 geodist_benchmark chk_is_num_len_1        R           1
#> 4 R/utils.R    32 geodist_benchmark        get_delta        R           1
#> 5 R/utils.R    41 geodist_benchmark          geodist        R           1
#> 6 R/utils.R    70         get_delta          geodist        R           1
#>   centrality_dir cluster_undir centrality_undir
#> 1              0             1               21
#> 2              0             1               21
#> 3              0             1               21
#> 4              0             1               21
#> 5              0             1               21
#> 6              0             1                0
# ... plus other information in 'external_calls'

# And this gives all the statistics you're asking for:
pkgstats::pkgstats_summary (p)
#>   package   version                       date            license files_R
#> 1 geodist 0.0.8.016 2023-11-24 09:25:42.829406 MIT + file LICENSE       6
#>   files_src files_inst files_vignettes files_tests loc_R loc_src loc_inst
#> 1        15          0               1           6   383    3327       NA
#>   loc_vignettes loc_tests blank_lines_R blank_lines_src blank_lines_inst
#> 1           207       538           100             517               NA
#>   blank_lines_vignettes blank_lines_tests comment_lines_R comment_lines_src
#> 1                    22                61             212               432
#>   comment_lines_inst comment_lines_vignettes comment_lines_tests rel_space
#> 1                 NA                      23                  16 0.2008785
#>   rel_space_R rel_space_src rel_space_inst rel_space_vignettes rel_space_tests
#> 1   0.1697817     0.2070255             NA           0.1941617       0.2039542
#>   indentation nexpr num_vignettes num_demos num_data_files data_size_total
#> 1           4   1.5             1         0              0               0
#>   data_size_median translations                                 urls
#> 1                0           NA https://github.com/hypertidy/geodist
#>                                          bugs desc_n_aut desc_n_ctb desc_n_fnd
#> 1 https://github.com/hypertidy/geodist/issues          2          0          0
#>   desc_n_rev desc_n_ths desc_n_trl depends imports                   suggests
#> 1          0          0          0      NA      NA knitr, rmarkdown, testthat
#>   enhances linking_to n_fns_r n_fns_r_exported n_fns_r_not_exported n_fns_src
#> 1       NA         NA      47                4                   43       116
#>   n_fns_per_file_r n_fns_per_file_src npars_exported_mn npars_exported_md
#> 1              4.8           7.733333                 5                 4
#>   loc_per_fn_r_mn loc_per_fn_r_md loc_per_fn_r_exp_mn loc_per_fn_r_exp_md
#> 1        18.95745              16                  26                  27
#>   loc_per_fn_r_not_exp_mn loc_per_fn_r_not_exp_md loc_per_fn_src_mn
#> 1                18.30233                      16          31.87931
#>   loc_per_fn_src_md languages doclines_per_fn_exp_mn doclines_per_fn_exp_md
#> 1                26         C                  40.75                   40.5
#>   doclines_per_fn_not_exp_mn doclines_per_fn_not_exp_md doclines_per_fn_src_mn
#> 1                          0                          0               1.067114
#>   doclines_per_fn_src_md docchars_per_par_exp_mn docchars_per_par_exp_md
#> 1                      0                    93.6                      90
#>   n_edges n_edges_r n_edges_src n_clusters centrality_dir_mn centrality_dir_md
#> 1     279        34         245          5          71.54179          14.33333
#>   centrality_dir_mn_no0 centrality_dir_md_no0 centrality_undir_mn
#> 1              112.1357                45.375            198.8988
#>   centrality_undir_md centrality_undir_mn_no0 centrality_undir_md_no0
#> 1                  74                 254.554                124.6097
#>   num_terminal_edges_dir num_terminal_edges_undir node_degree_mn node_degree_md
#> 1                    101                       61       1.381188              1
#>   node_degree_max                     external_calls cpl_instability_pkg
#> 1              11 base:39:19,geodist:29:24,stats:3:2            0.218638

Created on 2023-11-24 with reprex v2.0.2
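
As a rough illustration of how the per-object values in 'objects' relate to summary columns such as loc_per_fn_r_exp_mn and loc_per_fn_r_exp_md, the sketch below aggregates the six rows printed by head (p$objects) above using base R only. The helper mn_md is hypothetical, and the assumption that the summary is a plain mean and median over exported or non-exported functions is mine, not taken from the pkgstats source. Because only six rows are used, the results will not match the full summary values shown above.

```r
# First six rows of p$objects, copied from the output above (subset of columns)
objects <- data.frame (
    fn_name = c ("geodist_min", "geodist_vec", "check_vec_inputs",
                 "geodist_paired_vec", "geodist_seq_vec", "geodist_x_vec"),
    loc = c (19, 31, 11, 14, 19, 14),
    npars = c (4, 9, 3, 5, 4, 3),
    exported = c (TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
)

# Hypothetical helper (not part of pkgstats): mean and median of a vector
mn_md <- function (x) {
    c (mn = mean (x), md = stats::median (x))
}

# Lines of code per function, split by export status
loc_exp <- mn_md (objects$loc [objects$exported])
loc_not_exp <- mn_md (objects$loc [!objects$exported])

# Number of parameters per exported function
npars_exp <- mn_md (objects$npars [objects$exported])

print (rbind (loc_exp, loc_not_exp, npars_exp))
```

Running the same aggregation on the full p$objects data frame, rather than this six-row excerpt, is the way to compare against the columns of pkgstats_summary (p).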

With that, I'll close this issue, but feel free to ask any more questions you might have regarding this output. Also feel free to submit a pull request updating the README text if you feel that anything remains unclear to you, and you have any suggestions on how it could be improved. Thanks!