ropensci-review-tools / pkgstats

Historical statistics of every R package ever
https://docs.ropensci.org/pkgstats/

Using `pkgstats_from_archive` on a local CRAN mirror? #25

Closed by rpodcast 2 years ago

rpodcast commented 2 years ago

Hello, I would like to run `pkgstats_from_archive` on a local CRAN mirror that I downloaded to my server, as a way to generate and explore metrics for existing CRAN packages, but I am having trouble understanding how the path should be constructed. I thought that supplying the file path to my local CRAN mirror would be enough, but I get an error saying that the path must contain a tarballs directory. The top-level directory structure of the mirror is shown below. Is there a better way to call the function? I am running the latest GitHub version from the main branch in R 4.1.0.

eric@system76-pc /m/n/r/cran_mirror> tree -L 1
.
├── banner.shtml
├── banner.shtml.in
├── bin
├── contrib
├── CRANlogo.pdf
├── CRANlogo.png
├── CRANlogo.svg
├── CRAN_mirrors.csv
├── doc
├── faqs.html
├── favicon.ico
├── Google-Logo_40wht.gif
├── Google-Logo_40wht.png
├── help
├── html
├── index.html
├── logo.html
├── manuals.html
├── mirmon
├── mirmon_report.html
├── mirmon_report_old_release.html
├── mirmon_report_release.html
├── mirror-howto.html
├── mirrors-foot.html
├── mirrors-head.html
├── mirrors.html
├── navbar.html
├── oai2_style.xsl
├── other-docs.html
├── other-software.html
├── R.css
├── report_cran.html
├── report_www.html
├── Rlogo.jpg
├── Rlogo.svg
├── robots.txt
├── R-release-1.0.0.html
├── _search.html
├── search.html
├── sources.html
├── sources.html.in
├── src
├── statistics.html
├── submit-frameless.html
├── submit.html
├── TIME
├── web
└── welcome.html
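
The error can be reproduced outside R with a small path check. This is only a minimal sketch: `has_tarballs_component` is a hypothetical helper, the paths are placeholders, and it assumes the function looks for a path component literally named `tarballs`. That assumption is inferred from the error message, not from the pkgstats source.

```shell
#!/bin/sh
# Hypothetical helper (not part of pkgstats): succeed if any component of
# the given path is literally named "tarballs". This mirrors the wording of
# the error message, not necessarily the exact check pkgstats performs.
has_tarballs_component() {
    # Wrap the path in slashes so first and last components match too.
    case "/$1/" in
        */tarballs/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder paths: the mirror root, and a "tarballs" directory inside it.
for p in /path/to/cran_mirror /path/to/cran_mirror/tarballs/; do
    if has_tarballs_component "$p"; then
        echo "has a tarballs component: $p"
    else
        echo "no tarballs component:    $p"
    fi
done
```

Under that assumption, the mirror root listed above would be rejected because none of its path components is named `tarballs`.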

mpadge commented 2 years ago

Thanks @rpodcast for asking. That appears to be the skeleton structure of a CRAN mirror, which you then have to populate by running the following command (or an equivalent):

 rsync -e "ssh" -rtlzv --delete cran-rsync@cran.r-project.org: /dir/on/local/disc 

That will download all tarballs into a directory named "tarballs". Alternatively, if you can wait until next week, we'll have the full thing published here, which you can download and use straight away. It will then be updated regularly, at least monthly, and released as a binary package file with this package.

rpodcast commented 2 years ago

I actually did use rsync to pull down a CRAN mirror (albeit without ssh):

 rsync -rtlzvh --delete cran.r-project.org::CRAN /mnt/storage/r_opensource_projects/cran_mirror/

Ah, I might see the issue: I have a trailing / at the end of my local path. I'll try putting the files in a tarballs dir and see if that works.
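
One way to point the same rsync call at a directory named `tarballs` is sketched below. `mirror_cmd` is a hypothetical helper that only prints the command, since a full CRAN mirror is a large download. It reuses the flags and the `cran.r-project.org::CRAN` source from the command above, and the `tarballs` destination is an assumption based on the error message. The thread does not say whether the `.tar.gz` files must sit directly in that directory or may be nested below it.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) an rsync command that mirrors CRAN
# into a directory named "tarballs" under the given parent directory.
# One trailing slash on the parent is stripped, so the destination is always
# <parent>/tarballs. Paths containing spaces would need quoting before the
# printed command is pasted into a shell.
mirror_cmd() {
    parent="${1%/}"
    printf 'rsync -rtlzvh --delete cran.r-project.org::CRAN %s/tarballs\n' "$parent"
}

# Local path taken from the command above, trailing slash included.
mirror_cmd /mnt/storage/r_opensource_projects/cran_mirror/
```

The printed destination is the path that would then be passed to `pkgstats_from_archive`.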

rpodcast commented 2 years ago

I was able to run the function with the above fix to my local path. We can close this issue now. Thank you for your prompt feedback!