rociojoo / CranTaskView-Track

Cran task view project for tracking data

v22.01 final update: Tracking.md #22

Closed basille closed 2 years ago

basille commented 2 years ago

We need to make sure there is a perfect correspondence between the complete listing of tracking packages that pass CRAN checks and Tracking.md: check that every package is listed and described, with the correct tag (pkg, including with priority = "core", github, etc.).
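
A correspondence check of this kind could be sketched as follows. This is an illustration in Python, not the maintainers' actual script, which is not shown in this thread. It assumes the task view file references packages through inline tags such as `pkg("name")`, `pkg("name", priority = "core")`, or `github("user/repo")`. The `bioc` and `rforge` tag names, the function names, and the sample data are assumptions made for the example.

```python
import re

# Matches inline tags such as pkg("pkgA"), pkg("pkgA", priority = "core"),
# or github("user/repo"). The set of tag names is an assumption based on
# the tags mentioned in this thread.
TAG_RE = re.compile(r'\b(pkg|github|bioc|rforge)\(\s*"([^"]+)"([^)]*)\)')


def listed_packages(markdown_text):
    """Return {package name: tag} for every tagged package in the text."""
    found = {}
    for tag, target, _rest in TAG_RE.findall(markdown_text):
        # For github("user/repo"), keep only the repository name.
        name = target.split("/")[-1]
        found[name] = tag
    return found


def compare(expected, markdown_text):
    """Compare the expected package names with those tagged in the text."""
    listed = set(listed_packages(markdown_text))
    expected = set(expected)
    return {
        "missing": sorted(expected - listed),     # pass checks, not described
        "unexpected": sorted(listed - expected),  # described, not in listing
    }


# Hypothetical sample data, not the real Tracking.md content.
sample_md = '''
- `r pkg("pkgA", priority = "core")` does X.
- `r pkg("pkgB")` does Y.
- `r github("someuser/pkgC")` does Z.
'''
report = compare(["pkgA", "pkgB", "pkgD"], sample_md)
print(report)
```

The dictionary returned by `listed_packages` also records which tag each package carries, so the same pass could be extended to flag a package listed with the wrong tag.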

There are 54 packages that check. From those listed in the NEWS file (I didn't check the complete listing):

rociojoo commented 2 years ago

Mathieu @basille, I will deal with this in April, unless you can do it earlier. Cheers.

zeileis commented 2 years ago

I have removed the feedr package now as this needs to be addressed before April. Also, I was wondering why you are handling the issue above here rather than in https://github.com/cran-task-views/Tracking/?

basille commented 2 years ago

@rociojoo: I'll do my best to finalize it this week. I'll keep you posted.

@zeileis: A matter of habits I guess. I don't think I've entirely resolved how to work with the new official repository yet. Can we simply and completely transfer how we used to work with our private repo there? If the answer is yes, then there's no need to keep using this one (but I'd keep it for archiving reasons).

basille commented 2 years ago

Also, can we "migrate" our issues from the previous to the new repository? There are a bunch of them for specific packages already there, and I'd rather not go back and forth between the two repos.

zeileis commented 2 years ago

Migrating the two open issues is no problem. As for the closed issues (if that is what you were implying), I cannot tell how useful migrating them would be. But I guess it would be possible if you thought it worthwhile.

Beyond the issues I also cannot say whether you can "completely" transfer the way you used to work with your old repository. I didn't go through the details. But from our discussions so far I would suspect that you can transfer most things but that adjustments in the workflow will be necessary. For example, the official repositories encourage frequent small changes (which will be published on CRAN quickly) - as opposed to your approach with large and very thorough updates after longer time periods.

But I think it would be in everyone's interest, especially also for contributions from users/readers, to have everything tracked in one place. In fact, this was the main motivation for migrating from R-Forge to GitHub.

basille commented 2 years ago

Thanks @zeileis for your answer. I totally agree with you on the rationale to have everything on the official CTV repository — it does make a lot of sense to me.

And yes, I meant all issues, including those that are closed. As a matter of fact, we are considering having one issue per package, so as to have a single place of discussion for everything related to it: that would cover submission, decision, archival, and any update about its status in the Tracking CTV. That means an issue can be closed, and later reopened because something changed with the package.

Two remaining questions:

zeileis commented 2 years ago

If you, as the co-maintainers, want to keep one issue per package, that's up to you. However, users and contributors may not stick to this and I don't think it's worth trying to force them to. We're trying to be as open as we can regarding potential contributions, allowing PRs, issues, and e-mails. Given that the volume of such contributions is not overwhelming, it would be good to make them as easy as possible.

For migrating the issues: Rocío is a co-owner of both repositories, so I think she should have all the necessary rights to transfer the old issues.

Finally regarding rolling releases: The structure of the task views is modular enough so that I don't see any important reason why you should keep updates from being released on CRAN for months. When you have reviewed a package and decided it should enter the task view, this information should be available to users. Why hide it?

basille commented 2 years ago

> If you, as the co-maintainers, want to keep one issue per package, that's up to you. However, users and contributors may not stick to this and I don't think it's worth trying to force them to. We're trying to be as open as we can regarding potential contributions, allowing PRs, issues, and e-mails. Given that the volume of such contributions is not overwhelming, it would be good to make them as easy as possible.

Yes, we certainly don't want to make it harder to contribute, quite the contrary. This approach of one issue per package is mostly for the developer (actually the maintainer) of the package. We already have an issue template for submitting a new package, which is also a starting point for users who may not be the author of a package but know of it and would like it to be considered for the CTV. Everyone is more than welcome to contribute, and we're trying to make sure of this.

> For migrating the issues: Rocío is a co-owner of both repositories, so I think she should have all the necessary rights to transfer the old issues.

Ah good! I'll let Rocío check this out and move the issues when she has time (this is not the most pressing issue).

> Finally regarding rolling releases: The structure of the task views is modular enough so that I don't see any important reason why you should keep updates from being released on CRAN for months. When you have reviewed a package and decided it should enter the task view, this information should be available to users. Why hide it?

Mostly because we have an entire verification process for all packages (CRAN and non-CRAN alike): are they still available, and do they pass CRAN checks (for non-CRAN packages)? At the moment, we're running this manually (on our own computers), but we're considering moving it to GitHub Actions, which would allow for a rolling release without much effort. I think we all agree overall about the benefits of releasing as often as possible, and that's a goal of ours once we're able to automate the whole process.

Check for instance our NEWS file, which shows what we're able to do with it. We also have a bunch of non-CRAN packages (mostly GitHub, but also Bioconductor and R-Forge) that do not pass CRAN checks, as well as some archived CRAN packages. None of them is included in the most recent CTV version, but they are nevertheless verified every time, in case they're now good to go. So it's not about newly added packages, but really about the whole process behind the scenes.
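
The inclusion rule described above can be illustrated with a small sketch. It is in Python for illustration only and is one reading of this comment, not the actual verification code, which is not shown here. The `PackageStatus` record, its field names, the source labels, and the sample packages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PackageStatus:
    name: str
    source: str             # "cran", "github", "bioc", or "rforge" (assumed labels)
    available: bool         # can the package still be found at its source?
    passes_checks: bool     # outcome of running the CRAN checks on it
    archived: bool = False  # only meaningful for CRAN packages


def is_included(p):
    """Inclusion rule as read from the comment above (an interpretation)."""
    if not p.available:
        return False
    if p.source == "cran":
        # CRAN packages pass the checks by construction unless archived.
        return not p.archived
    # Non-CRAN packages are included only if they pass the checks.
    return p.passes_checks


def partition(packages):
    """Split names into those included now and those to re-verify later."""
    included = [p.name for p in packages if is_included(p)]
    pending = [p.name for p in packages if not is_included(p)]
    return included, pending


# Hypothetical statuses, not real verification results.
packages = [
    PackageStatus("pkgA", "cran", True, True),
    PackageStatus("pkgB", "cran", True, True, archived=True),
    PackageStatus("pkgC", "github", True, False),
    PackageStatus("pkgD", "github", True, True),
    PackageStatus("pkgE", "rforge", False, False),
]
included, pending = partition(packages)
print(included)
print(pending)
```

Everything in `pending` stays out of the current version but would be verified again at the next round, in case its status has changed.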

zeileis commented 2 years ago

That's all fine but I still do not understand why this should keep packages from being added in between these major checks.

  1. When prompted by PR/issue/mail, you check a newly proposed package. If suitable, you can add it to the task view right away. The same applies if you, the co-maintainers, discover a new package that seems relevant.
  2. When prompted by CRAN, archived packages are excluded from the task view (typically by me).
  3. From time to time (e.g., twice a year), you run your thorough checks of everything.

Item 2 needs to happen for CRAN. And I see no reason why 1 could only be done together with 3.

basille commented 2 years ago

These are very good points, and I'd be very interested to have a rolling release with our current process. However, item 1 is easy for CRAN packages (they obviously pass CRAN checks), but not necessarily so for GitHub and other non-CRAN packages… Let's see what @rociojoo thinks about this, but I have a feeling that running the code for a single submitted package would be almost as demanding as running it for everything. And I'm not sure I'd want to prioritize CRAN packages simply because they're easy to process. If we can automate the whole thing, through GitHub Actions for instance, that would completely solve the issue.

Thanks @zeileis for your feedback!

zeileis commented 2 years ago

The main argument for prioritizing CRAN packages is that it is a CRAN task view! So I see no problem with that.

Also, it would set really weird incentives if we held CRAN packages back just because GitHub packages are harder to process.

basille commented 2 years ago

Ah, touché! 🙂 Very good argument indeed, and, thinking twice, I like the idea of incentivizing CRAN packages (we have way too many packages that stay on GitHub and would be perfectly good CRAN packages…).

So we could have:

Last question @zeileis, how will the CTV be released from now on? Will an update of the Markdown file trigger the release? Or will it require manual intervention?

@rociojoo, before we agree on this, what do you think?

zeileis commented 2 years ago

The plan is to have a daily cronjob that grabs all task view .md files and updates the pages on CRAN.

basille commented 2 years ago

Ooooh good! That would be excellent! Definitely something to keep in mind.

rociojoo commented 2 years ago

@basille I like the idea. Giving priority to the packages on CRAN definitely makes sense.

basille commented 2 years ago

Great! Let's do it that way. @rociojoo, when you have a chance, could you look at migrating issues to the official CTV repo, so we can stop using this repo here? (That means migrating all open issues plus the issues for specific packages, including those that are closed; it might be just as easy to migrate all issues at once.)

basille commented 2 years ago

OK, full check: of the 54 packages that pass CRAN checks and are on the list of packages, these two are not described in Tracking.md:

I didn't have time yet to follow up on these, but that's all that's left for this round.

basille commented 2 years ago

OK, I just did a quick edit to add these two (778d4579edcdc4bb2d34c3000ed4ae411c754a7a). Ready to go, I'm closing this issue.