cran-task-views / ctv

CRAN Task View Initiative

Process for Archiving *and Unarchiving* packages? #20

Closed ugroempi closed 2 years ago

ugroempi commented 2 years ago

Dear CRAN Task View Editors (CTVEs ;-)),

thanks for the good work on moving the task views to github, I think that this will make maintenance quite a bit easier, at least for me (though making time remains a problem).

When checking on the archived packages now, I found several packages that were previously archived but are now back on CRAN, because maintainers eventually updated them (in my case, eight packages that were archived since Nov 19 are on CRAN again). Mostly, packages were either fixed within one month or within something around six months.

I suspect that the modifications to task views because of package archiving so far involve manual work by Achim @zeileis because suitable changes to the task view text are presumably hard to automate. Achim, I admire your apparently infinite amount of energy, but wouldn't it be better and more friendly for your time to automate something like

In the light of the fix times observed in my little sample, I would probably give it a few weeks (during which the package is flagged as archived in the task view), and would either remove the archived flag if the package comes back during this wait time or would eventually remove the package from the task view.

Best, Ulrike

zeileis commented 2 years ago

Thanks for the initiative, Ulrike! A couple of comments about this:

Having said this: It would be possible to automatically change the links in a task view file when a package gets archived and to only send an e-mail when it was archived for more than, say, 4 weeks. But I'm not sure it is really worth the time and effort to do so. I'll discuss it with the CRAN team, though, to get their opinion about it. It wouldn't help with the majority of package archivals on CRAN, though, because most packages are not unarchived after 4 weeks (if at all).

ugroempi commented 2 years ago

Thanks for the quick reply! In my case, of the 20 packages that were archived since 2017, 10 got revived (one additional package is on and off). In recent years, archiving on CRAN has not only been due to unresponsiveness of maintainers in terms of answering e-mails, but also due to revision requests with ambitious deadlines that maintainers can't always keep (and some maintainers maintain packages for which they do not have full grasp of all aspects of the code, which makes it even harder); in such cases, the fix is made once the package maintainer finds time. I believe that this problem is on the rise, because CRAN has become more aggressive in terms of archiving packages for relatively minor deviations from policy.

The following suggestion is conditional on it not causing too much work; I have no idea how hard it is to automate something like this. I personally think that it would be good to just change or even remove the link and add "(archived)" for an archived package, in order to avoid error messages and make the status transparent to readers. I would opt for having a standardized automated e-mail sent at the same time. Individual task view maintainers could then decide how they want to handle such cases; I personally would go for waiting a few weeks, but others might make different choices.

Let's see what the CRAN team think.

zeileis commented 2 years ago

Re: CRAN archivals. I'm not here to defend the CRAN team. I'm just saying that their end of the problem is also not easy. They check almost 20,000 packages and have to deal with the corresponding maintainers. If in a small "positive selection" half of the packages are not revived, you can imagine how much work and how many non-responsive (or only partially responsive) maintainers there are overall.

Re: Amount of work. Writing the code for what you propose will take me more time than handling all package archivals for all task views in a year. It would require a different workflow on my end and, more importantly, on CRAN as well. In case we send out e-mail, I would also expect that responding to follow-up e-mails will take more time than just editing the task view files at the moment. But as I said, I'll discuss it.

ugroempi commented 2 years ago

Re: CRAN archivals. I don't mean to criticise CRAN policy, which may very well be the best way to go, given the available person power, and CRAN are also approachable, if one contacts them for a reasonable extension of time.

FlorianSchwendinger commented 2 years ago

What about a simple approach where the archived packages are added to a CSV file (or SQLite database) in the git repo, with name, archive date, and the git commit hash before the removal? It would then be possible, if the package reappears, to automatically create an issue containing the hash of the last version of the task view in which the package was listed.
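A minimal sketch of such a log in R, assuming a CSV with the columns `package`, `view`, `archived_on`, and `last_commit`. The helper names `log_archival()` and `find_reappeared()`, the package names, and the commit hashes are hypothetical and not part of the ctv package:

```r
# Hypothetical CSV log of archived packages (illustration only).
log_file <- tempfile(fileext = ".csv")

# Append one archived package, together with the commit that last listed it.
log_archival <- function(file, package, view, archived_on, last_commit) {
  entry <- data.frame(
    package = package,
    view = view,
    archived_on = as.character(archived_on),
    last_commit = last_commit,
    stringsAsFactors = FALSE
  )
  if (file.exists(file)) {
    # Read everything as character so commit hashes are never parsed as numbers.
    entry <- rbind(read.csv(file, colClasses = "character"), entry)
  }
  write.csv(entry, file, row.names = FALSE)
  invisible(entry)
}

# Return the log entries whose package is available again.
find_reappeared <- function(file, available) {
  log <- read.csv(file, colClasses = "character")
  log[log$package %in% available, , drop = FALSE]
}

log_archival(log_file, "pkgA", "Optimization", as.Date("2022-01-10"), "abc1234")
log_archival(log_file, "pkgB", "NumericalMathematics", as.Date("2022-02-01"), "def5678")

# In practice 'available' could come from rownames(available.packages()).
reappeared <- find_reappeared(log_file, available = c("pkgA", "pkgC"))
print(reappeared)
```

Each row returned by `find_reappeared()` carries the commit hash needed to look up how the package was described before its removal, which is the information an automatically created issue would contain.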

hwborchers commented 2 years ago

Would it be possible to open an issue for every package that got archived, in each task view where it is mentioned? Then the maintainers have a hook and can discuss what to do, e.g. write to the author or simply forget it. My experience also is that (for the Optimization and NumericalMathematics Task Views) more than half of the archived packages do reappear.

zeileis commented 2 years ago

OK, I had some discussions with the CRAN team and a programming session with Kurt, and I think we have a good solution now. You can try things out by installing the latest development version of the package from R-Forge:

install.packages("ctv", repos="http://R-Forge.R-project.org")

When processing task views for CRAN we now obtain the list of archived CRAN packages and handle them by keeping the links in the info text (with comment: (archived)) but we exclude the packages from automatic installation. To see this in action, try

ctv2html("MyView.md", cran = TRUE)

where you replace "MyView.md" with the file name of a task view, ideally one that includes some archived packages. (Unavailable packages are handled similarly.) Also

check_ctv_packages("MyView.md")

now distinguishes between packages that are not available at all (e.g., when misspelling a package name) and a package that is currently archived on CRAN.
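The distinction can be illustrated with a small sketch. This is not the ctv implementation; `classify_packages()` and `annotate()` are hypothetical helpers, and the vectors of current and archived package names are assumed to be given rather than fetched from CRAN:

```r
# Classify package names as "available", "archived", or "unknown".
# 'current' and 'archived' are assumed to be character vectors of names.
classify_packages <- function(pkgs, current, archived) {
  status <- rep("unknown", length(pkgs))
  status[pkgs %in% archived] <- "archived"
  # A package that is back on CRAN counts as available, even if it
  # also has an entry among the archived names.
  status[pkgs %in% current] <- "available"
  names(status) <- pkgs
  status
}

# Labels for the info text: archived packages stay listed but get a marker;
# unknown names (e.g., misspellings) are dropped.
annotate <- function(status) {
  known <- status[status != "unknown"]
  labels <- names(known)
  is_archived <- unname(known == "archived")
  labels[is_archived] <- paste(labels[is_archived], "(archived)")
  labels
}

status <- classify_packages(
  c("pkgA", "pkgB", "pkgTypo"),
  current = "pkgA",
  archived = "pkgB"
)

# Only available packages are candidates for automatic installation.
install_list <- names(status)[status == "available"]

print(status)
print(annotate(status))
```

Keeping the classification separate from the labelling mirrors the behaviour described above: archived packages remain visible in the text but are left out of the installation list.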

When we switch to this workflow, there is no need anymore to immediately update the task view files. Instead we will wait for a certain period (current idea: 4 weeks) before we notify the corresponding task view maintainer (plan: via an issue). Hopefully, the package has been unarchived by then. If not, the task view maintainers can take action themselves. If the problem still persists after an extended period (current idea: 6 weeks), i.e., the package is still archived and the maintainers have not changed their task view file, I will remove the package from the task view file and close the issue.
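The timeline above can be sketched as a small decision function. This is an illustration under stated assumptions: `next_action()` is a hypothetical helper, the thresholds of 28 and 42 days simply encode the "current idea" of 4 and 6 weeks, and the check whether the maintainers have already edited their task view file is left out:

```r
# Decide what to do about an archived package, given its archival date.
# Thresholds are in days: 28 = 4 weeks, 42 = 6 weeks (both tentative).
next_action <- function(archived_on, today = Sys.Date(),
                        notify_after = 28, remove_after = 42,
                        back_on_cran = FALSE) {
  if (back_on_cran) {
    return("nothing to do")
  }
  days <- as.numeric(as.Date(today) - as.Date(archived_on))
  if (days < notify_after) {
    "wait"
  } else if (days < remove_after) {
    "notify maintainer via issue"
  } else {
    "remove from task view and close issue"
  }
}

print(next_action("2022-01-01", today = "2022-01-15"))
print(next_action("2022-01-01", today = "2022-02-05"))
print(next_action("2022-01-01", today = "2022-03-01"))
```

Because the thresholds are arguments, either period can be adjusted without touching the logic.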

I hope this will be a big improvement because there will be no need for action for a certain proportion of archivals. And for the others there will be closed issues on GitHub that can be revisited later on.

ugroempi commented 2 years ago

Thanks Achim, I just tried it, and it works great!

Two suggestions:

zeileis commented 2 years ago

Thanks for checking, Ulrike, much appreciated! Regarding your suggestions:

ugroempi commented 2 years ago

I agree that intentionally keeping archived packages in a view for a long time should be the exception. In the case of planor, I believe that keeping the package in the task view does not decrease pressure (having more places that highlight the need to fix something means more rather than less pressure, as far as I see it). There is a very busy maintainer who has no understanding of the more technical aspects of programming in the package, and the programmer is long gone. I think that keeping the package visible for the community in spite of its being archived makes sense. I might contact Hervé to ask whether he would be willing to hand over maintenance to someone else; if yes, I might try to help him find someone.

zeileis commented 2 years ago

Re: planor. If he doesn't want to hand over maintenance, it might still be a good idea to put the package on a public repository (e.g., GitHub) and let others help with fixing the problems in the package. Then you could also link the package with github(...) in the task view until it reappears on CRAN.

ugroempi commented 2 years ago

Well, yes, but given his time budget and his (assumed) lack of experience with github, I doubt that this is going to happen.

zeileis commented 2 years ago

OK, I've released the new ctv version on CRAN now: https://CRAN.R-project.org/package=ctv

Based on this we will roll out the new workflow, hopefully in the next few days. And then we can reassess later this year which improvements we can make in the workflow, specifically (but not only) with respect to handling archived packages.

Closing this issue now.