r-hub / cranlogs

Download Logs from the RStudio CRAN Mirror
https://r-hub.github.io/cranlogs/

Option to count only current, unarchived packages #52

Open lindbrook opened 4 years ago

lindbrook commented 4 years ago

For convenience's sake, "archived" refers to either past versions of current, active packages or all versions of inactive packages found at https://cran.r-project.org/src/contrib/Archive/

While there's probably some interest in and use of "archived" versions (e.g., for compatibility), my sense is that "archived" packages may be overrepresented in the CRAN download logs. I would expect downloads of them to be rare and occasional rather than regular and frequent. While related to #45, in this case I'm referring to instances where "archived" packages are downloaded in their entirety. Below are two examples. To see more, try other dates (e.g., "2019-12-06", "2019-12-08") and other packages (you may have to look up each package's current version).

vars <- c("date", "time", "size", "version", "ip_id")
date <- "2019-12-04"

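pkg <- "cranlogs"
current.ver <- "2.1.1" # published on 2019-04-29
# Fetch the day's log entries for the package
sample_log <- packageRank::packageLog(pkg, date)
# Sort by version, then by IP identifier
sample_log <- sample_log[order(sample_log$version, sample_log$ip_id), ]
# Keep non-current versions with downloads larger than 1000 bytes
sample_log[sample_log$version != current.ver & sample_log$size > 1000, vars]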

              date     time  size version ip_id
5092411 2019-12-04 13:57:51 14099   2.0.0  1248
5092830 2019-12-04 13:57:51 14100   2.0.0  1248
3969680 2019-12-04 21:12:16  3559   2.0.0  2209
236723  2019-12-04 15:32:59 17450   2.1.0  1248
237393  2019-12-04 15:32:59 17449   2.1.0  1248
4168022 2019-12-04 14:29:30 17560   2.1.0  1248
4170377 2019-12-04 14:29:30 17561   2.1.0  1248
3581717 2019-12-04 05:19:04 19854   2.1.0  2209
3969681 2019-12-04 21:12:16  5645   2.1.0  2209
5063409 2019-12-04 05:04:01 19840   2.1.0  2209
3586015 2019-12-04 20:00:40 17096   2.1.0  3646

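pkg <- "HistData"
current.ver <- "0.8-4" # published on 2018-04-04
# Fetch the day's log entries for the package
sample_log <- packageRank::packageLog(pkg, date)
# Sort by version, then by IP identifier
sample_log <- sample_log[order(sample_log$version, sample_log$ip_id), ]
# Keep non-current versions with downloads larger than 1000 bytes
sample_log[sample_log$version != current.ver & sample_log$size > 1000, vars]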

              date     time   size version ip_id
3397012 2019-12-04 22:12:06 128701  0.6-11  2209
98322   2019-12-04 16:36:10 233746  0.6-12    44
3551489 2019-12-04 13:51:11 233690  0.6-12    44
4877876 2019-12-04 07:36:01 233685  0.6-12  1164
3397013 2019-12-04 22:12:06 137419  0.6-12  2209
3397014 2019-12-04 22:12:06 142693  0.6-13  2209
3397015 2019-12-04 22:12:06 144321  0.6-14  2209
3397016 2019-12-04 22:12:06 105128   0.6-4  2209
3397017 2019-12-04 22:12:06 108639   0.6-5  2209
3397018 2019-12-04 22:12:07 122814   0.6-7  2209
3397019 2019-12-04 22:12:07 123349   0.6-8  2209
4228171 2019-12-04 11:21:15 192318   0.6-9    44
3397020 2019-12-04 22:12:07 124259   0.6-9  2209
3397021 2019-12-04 22:12:07 144828   0.7-0  2209
2839810 2019-12-04 23:07:22 251000   0.7-3    44
397676  2019-12-04 09:04:26 250995   0.7-3  1164
3397022 2019-12-04 22:12:07 146243   0.7-3  2209
5099407 2019-12-04 13:59:16 251731   0.7-5  1248
5100200 2019-12-04 13:59:16 251732   0.7-5  1248
571097  2019-12-04 04:43:59 253220   0.7-5  2209
3222200 2019-12-04 04:54:47 253220   0.7-5  2209
3397023 2019-12-04 22:12:07 146560   0.7-5  2209
3397024 2019-12-04 22:12:07 148196   0.7-6  2209
3397025 2019-12-04 22:12:07 344875   0.7-8  2209
3765857 2019-12-04 05:07:05 450034   0.7-8  2209
3397026 2019-12-04 22:12:07 353161   0.8-0  2209
2194293 2019-12-04 05:23:06 476708   0.8-1  2209
3397027 2019-12-04 22:12:08 356170   0.8-1  2209
3413294 2019-12-04 01:47:43 356145   0.8-2    44
4092977 2019-12-04 15:37:57 356140   0.8-2  1248
4093061 2019-12-04 15:37:57 356139   0.8-2  1248
3397028 2019-12-04 22:12:08 236694   0.8-2  2209
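
For illustration, here is a minimal sketch of what the requested option might compute. It assumes a log data frame with `version` and `size` columns, like the `packageRank::packageLog()` output above. The function `count_current()` and its `min.size` argument are hypothetical names; they are not part of cranlogs or packageRank. The sample rows are copied from the cranlogs output above, which was already filtered to non-current versions, so the "current" count is zero here.

```r
# Hypothetical helper (not part of cranlogs or packageRank):
# split downloads above a size threshold into current vs. "archived" versions.
count_current <- function(log, current.ver, min.size = 1000) {
  # Keep downloads above the size threshold used in the examples above
  full <- log[log$size > min.size, , drop = FALSE]
  is.current <- full$version == current.ver
  c(current = sum(is.current), archived = sum(!is.current))
}

# Four rows copied from the cranlogs output above (non-current versions only)
excerpt <- data.frame(
  date = "2019-12-04",
  time = c("13:57:51", "21:12:16", "15:32:59", "20:00:40"),
  size = c(14099, 3559, 17450, 17096),
  version = c("2.0.0", "2.0.0", "2.1.0", "2.1.0"),
  ip_id = c(1248, 2209, 1248, 3646),
  stringsAsFactors = FALSE
)

counts <- count_current(excerpt, current.ver = "2.1.1")
print(counts)
```

Applied to a full day's log rather than this excerpt, the same split would show how much of a package's daily total comes from "archived" versions.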