benmarwick / ctv-archaeology

CRAN Task View: Archaeological Science

proposal for a new CRAN CTV on Archaeological Science has now been submitted #71

Open benmarwick opened 2 months ago

benmarwick commented 2 months ago

You can see it here: https://github.com/cran-task-views/ctv/issues/64

I read a bunch of recently accepted CTVs and edited the scope slightly to match what I saw in them.

nfrerebeau commented 1 month ago

Following on from the discussion of the proportion of GitHub projects relative to CRAN packages, here are two CRAN packages we missed:

benmarwick commented 1 month ago

Thanks @nfrerebeau I've added those now.

Hello everyone @scpederzani @SCSchmidt @lolosp @lsteinmann @LiYingWang @bbartholdy @joeroe @samleggs22, here is an update on the submission of our proposal to the CRAN CTV maintainers.

In brief, they would like to see no more than 20% of the packages on non-CRAN repositories. Currently we have about 47%, so we need to

Among our group of maintainers, I counted about 11 GitHub packages that we maintain (@joeroe, me, @SCSchmidt and @lsteinmann). So one way to get started on this could be for us to get those 11 onto CRAN, remove a bunch of others, and perhaps ask the @ISAAKiel group if they might put some of theirs on CRAN to help us get close to 20%.
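To make the trade-off between moving packages to CRAN and removing GitHub-only ones concrete, here is a minimal sketch. The function name `removals_needed` and the counts 48 and 53 are assumptions for illustration only (they were picked because they give roughly 47.5% GitHub packages); the real counts come from the CTV file itself.

```r
# Smallest number of GitHub-only packages to remove so that their share of the
# list is at or below `target`, after `moved` of them have been moved to CRAN.
# (Hypothetical helper, not part of the original calculation.)
removals_needed <- function(n_github, n_cran, moved = 0, target = 0.2) {
  stopifnot(moved <= n_github, n_cran + moved > 0)
  github_left <- n_github - moved
  cran_now    <- n_cran + moved
  n_drop <- 0:github_left
  # share of GitHub packages for every possible number of removals
  share <- (github_left - n_drop) / (github_left - n_drop + cran_now)
  min(n_drop[share <= target])
}

# Hypothetical counts, chosen only because they give roughly 47.5% GitHub
n_github <- 48
n_cran   <- 53

removals_needed(n_github, n_cran)              # nothing moved to CRAN
removals_needed(n_github, n_cran, moved = 11)  # the group's 11 packages moved
```

Searching over every possible number of removals, rather than rearranging the formula, avoids off-by-one surprises from rounding when the share lands exactly on the target.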

What do you think? Please let me know your thoughts!

Code for calculations

```
# get the text of the CTV
ctv_url <- "https://raw.githubusercontent.com/benmarwick/ctv-archaeology/master/Archaeology.md"

# import into R
ctv_url_tbl <- scan(ctv_url, what = character())

# convert to scalar
ctv_url_tbl_txt <- paste0(ctv_url_tbl, collapse = " ")

# count how many github pkgs
n_github_pkgs <- stringr::str_count(ctv_url_tbl_txt, "r github\\(")

# count how many cran pkgs
n_cran_pkgs <- stringr::str_count(ctv_url_tbl_txt, "r pkg\\(")

# what's the percentage of github pkgs currently?
n_github_pkgs / (n_github_pkgs + n_cran_pkgs)
# currently 47.5%

# target is <20% so how many github packages need to go to CRAN?
n_github_pkgs - (0.2 * (n_github_pkgs + n_cran_pkgs))
# ~ 28 github packages need to go to CRAN to get to 20% github pkgs

# how many github packages to remove to get to 20%
n_github_pkgs - (n_cran_pkgs * 0.25)
# ~ 34

# joeroe 6
# benmarwick 3
# SCSchmidt 1
# lsteinmann 1
```

bbartholdy commented 1 month ago

If we do filter out packages with no updates in the last five years, I think this should only apply to software packages and not data packages, since the latter don't really need updating to the same extent?
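A rough sketch of that rule, assuming we had a table of packages with a type and a last-update date. The data frame, its column names and the helper `keep_active` are all made up for illustration:

```r
# Hypothetical inventory; names, types and dates are invented for illustration
pkgs <- data.frame(
  name        = c("pkgA", "pkgB", "dataC", "pkgD"),
  type        = c("software", "software", "data", "software"),
  last_update = as.Date(c("2023-03-01", "2016-07-15", "2015-01-10", "2019-11-30")),
  stringsAsFactors = FALSE
)

# Drop software packages not updated within `max_age_years`;
# data packages are kept regardless of age.
keep_active <- function(pkgs, as_of = Sys.Date(), max_age_years = 5) {
  cutoff <- seq(as_of, by = paste0("-", max_age_years, " years"), length.out = 2)[2]
  stale_software <- pkgs$type == "software" & pkgs$last_update < cutoff
  pkgs[!stale_software, , drop = FALSE]
}

kept <- keep_active(pkgs, as_of = as.Date("2023-09-01"))
kept$name
```

Here only the stale software package is dropped; the old data package stays in the list.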

lolosp commented 1 month ago

What about a filtering criterion based on compatibility with current releases of R and of the packages' dependencies, i.e. filter out packages that are no longer functional? I would be happy to put some time towards testing these. The cool thing about the CTV is finding useful packages that are not on CRAN, so it would be good to try and keep those if possible...
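As a very first-pass version of such a test, one could check whether each package still loads in the current R session once it has been installed. This is only a sketch: the helper name `loads_ok` and the package names are assumptions, and a package that loads can still be broken, so something like `R CMD check` would be the more thorough follow-up.

```r
# For each package name, report whether its namespace can be loaded
# in the current R installation (TRUE) or not (FALSE).
# (Hypothetical helper; loading is only a weak proxy for "functional".)
loads_ok <- function(pkgs) {
  vapply(
    pkgs,
    function(p) suppressWarnings(requireNamespace(p, quietly = TRUE)),
    logical(1)
  )
}

# "stats" ships with R; the second name is a deliberately non-existent package
res <- loads_ok(c("stats", "notARealPkgXYZ123"))
res
```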

bbartholdy commented 1 month ago

We could also take a look at some of the data packages and see if we can get them on CRAN, since these should be relatively straightforward to submit and maintain?

lsteinmann commented 1 month ago

I could prepare my GitHub-clayrings-data-thing for CRAN, but to be honest, I feel it does not make so much sense, because it is so very very tiny and overly specific.

I would be happy to help anyone else a bit with getting something CRAN-ready.

nfrerebeau commented 1 month ago

Submitting a bunch of packages to CRAN would be the ideal solution. Personally, I've never had any issue with the submission process and I find it pretty smooth (despite the fact that CRAN always asks for mandatory fixes during fieldwork). However, I may be subject to survivorship bias. My packages are relatively simple to maintain (e.g. no system library dependency) and there are examples of CRAN maintainers being quite harsh (no name needed here, but there is a phrase for that). So I'd understand if package maintainers prefer to build their own CRAN-like repository (with R-universe or drat). Asking maintainers to submit their packages to CRAN also implies a long-term commitment, as we don't want these packages to be archived within the next 6 months.

That being said, CRAN is the centerpiece of the quality of the R ecosystem, thanks to its stringent standards. It also makes things easier for beginners, as you only need to use install.packages() to get started.
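For comparison, installing from a CRAN-like repository such as R-universe only needs an extra `repos` argument. A minimal sketch, where the helper `universe_repos`, the owner `someowner` and the package `somepkg` are placeholders rather than real names:

```r
# Build a repository list that tries an R-universe repo first, then CRAN.
# `owner` is a hypothetical R-universe user or organisation name.
universe_repos <- function(owner, cran = "https://cloud.r-project.org") {
  c(universe = paste0("https://", owner, ".r-universe.dev"), CRAN = cran)
}

repos <- universe_repos("someowner")
repos

# Not run: "somepkg" is a placeholder, not a real package
# install.packages("somepkg", repos = repos)
```

Keeping CRAN in the list means dependencies that live on CRAN still resolve as usual.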

We should not only consider submitting new packages to CRAN, but also pruning GitHub projects. As @lolosp suggests, we can filter out GitHub projects that are no longer functional. Maybe we can also filter out projects without a DOI (i.e. that are not properly archived) and emphasize peer review (e.g. rOpenSci packages).

I suspect that most GitHub projects are listed at https://open-archaeo.info. We can add this link in the CTV preamble to let interested people discover more R resources.

joeroe commented 1 month ago

It's a fair point about the proportion of CRAN packages. I know some people have reasonable objections to CRAN and prefer e.g. r-universe, or just distribute source code on GitHub, but I'd still say that CRAN is the de facto standard repository for R and therefore important from an accessibility and reproducibility point of view. Something we should be trying to encourage with this CTV, in other words.

I plan to submit all my packages to CRAN anyway, so I can make a push to do so with all of them listed here. But I have to warn that realistically I won't be able to do so before the end of October.

nfrerebeau commented 1 month ago

While trying to increase the number of packages on CRAN, we could reduce the number of GitHub packages to move the CTV submission forward (we can always add more packages later)?

I've tried to make a small selection here: https://github.com/nfrerebeau/ctv-archaeology/commit/36fc32d850b002bbc0b398bb8e03442b15e7e8a6

I have kept projects that have been peer-reviewed or have been on CRAN and are currently archived. I've also left some projects that I felt were the least redundant with the CRAN packages, but this is highly opinionated.

Whatever the criteria, the cut is significant. There are still about 25% of GitHub packages, but maybe the CTV editors will be understanding :sweat_smile:

What do you think?

benmarwick commented 1 month ago

Thanks everyone, looks like between us we can get a few more packages on to CRAN (thanks @joeroe and @lsteinmann!) and drop a few GitHub-only packages (thanks @nfrerebeau for making a start on this, great idea to mention https://open-archaeo.info in the CTV preamble).

Perhaps we should set a three-month deadline for some of us to work on getting some of our packages onto CRAN, and then update the proposal and see where we are in terms of the requirements of the CRAN CTV reviewers.

Here are some GitHub packages from our group that we might be able to get onto CRAN:

Joe, let me know if you already know that some of these will never go to CRAN:

These are mine that I'm confident I can get accepted to CRAN:

Lisa, go for it!

And here are some packages from others, whom I will contact to see if they are also interested in submitting to CRAN:


joeroe commented 1 month ago

I think we should retain c14bazAAR and bleiglas, at least. They're both mature and published packages. As far as I understand @nevrome prefers to keep them off CRAN on principle, not because they're not ready. The review process for becoming an rOpenSci package is at least as rigorous as CRAN's.

@benmarwick I won't submit swapdata to CRAN. The rest I think are doable.

nevrome commented 1 month ago

Cool that you're all so actively collaborating on this CTV! Would love to help with this issue, but I stopped doing CRAN submissions entirely. While I get the arguments voiced above by @nfrerebeau I made this decision based on the following experiences and considerations:

Sorry for the rant. I just wanted to voice this once to explain my decision. Feel free to include or remove my packages from the CTV. I will keep maintaining them on a per-demand basis.

benmarwick commented 1 month ago

Thanks @nevrome, no worries at all, that's helpful. Sorry to hear of your negative experiences with the CRAN maintainers. I'll cross yours off our list of pkgs to consider getting to CRAN.