Open noamross opened 4 years ago
thanks for opening this @noamross
@maelle what, if any of these fields, would/could be integrated into codemetar? Or do we do all of these outside of the context of codemetar? And do the below notes make sense to you?
Some notes on the fields (checked box means I think we have all necessary info already):
- staff: if we don't have one already, maintain a text file with staff names, then flag as true if the maintainer is in that list, false otherwise
- peer-reviewed: we already have this information in the onboarding field in registry.json
- archived: can probably automate detecting this if we are pulling info from all ropensci orgs (we should document this to make sure everyone knows this does not mean CRAN archived; although if it's archived here, most likely it's CRAN archived)
- archived-date: ideally we automate this - could be the first time it's detected in the ropensci-archive github org
- archived-to: should be easy to use the new url, e.g., https://github.com/ropensci-archive/foobar
- incubator: simply flag as true if in ropenscilabs, false otherwise, can automate easily

To me it seems these are fields that should be handled outside of codemetar.
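As a rough illustration of the notes on staff, incubator and archived-to, here is a minimal Python sketch. The `Repo` class, the `curation_flags` function and the example maintainer name are hypothetical and not part of codemetar or makeregistry; the staff list is assumed to be available as a plain set of names.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Repo:
    """Hypothetical minimal description of a repository."""
    owner: str       # GitHub org or user, e.g. "ropensci"
    name: str        # repository name, e.g. "foobar"
    maintainer: str  # maintainer name as listed for the package


def curation_flags(repo: Repo, staff_names: Set[str]) -> Dict[str, Optional[object]]:
    """Derive the staff, incubator and archived-to fields for one repo."""
    in_archive_org = repo.owner == "ropensci-archive"
    return {
        # true if the maintainer appears in the staff list, false otherwise
        "staff": repo.maintainer in staff_names,
        # true if the repo lives in ropenscilabs, false otherwise
        "incubator": repo.owner == "ropenscilabs",
        # URL in the archive org, only set for repos that were moved there
        "archived-to": (
            f"https://github.com/ropensci-archive/{repo.name}"
            if in_archive_org
            else None
        ),
    }
```

This only approximates the staff flag: as noted later in the thread, it ignores when a staff member actually developed the package.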
Some further comments:
staff: if we don't have already, maintain a text file with staff names - then flag as true if maintainer is in that list, false otherwise
It might be a bit more complicated depending on when staff members developed the package. :wink: But yeah the staff list, that you can retrieve from https://github.com/ropensci/roweb2/blob/master/data/team/team.json, is probably a good approximation.
Regarding the archival date, if I remember correctly, I looked into this when Noam asked a question about it in the curation policy, and couldn't find any way to retrieve the information via the GitHub API, so yeah, it'd be good to collect the date ourselves.
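Collecting the date ourselves could follow the "first time it's detected in the archive org" idea from the notes above. The sketch below is only an assumption about how that might look: `record_archival_dates` is a hypothetical helper, and the list of repo names is assumed to come from whatever job already lists the repos in ropensci-archive.

```python
from datetime import date
from typing import Dict, Iterable


def record_archival_dates(
    known_dates: Dict[str, str],
    archive_repo_names: Iterable[str],
    today: date,
) -> Dict[str, str]:
    """Return a copy of known_dates with today's date added for newly seen repos.

    Repos that already have a recorded date keep it, so the stored value is
    the first date on which the repo was detected in the archive org.
    """
    updated = dict(known_dates)
    for name in archive_repo_names:
        # setdefault leaves an existing (earlier) date untouched
        updated.setdefault(name, today.isoformat())
    return updated
```

The recorded date is the detection date, not the exact transfer date, so it is only as precise as the schedule on which the check runs.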
thanks @maelle
archived: can probably automate detecting this if we are pulling info from all ropensci orgs (we should document this to make sure everyone knows this does not mean CRAN archived; although if it's archived here, most likely it's CRAN archived)
We might actually want 3 archived fields
right, github_archived and cran_archived already done - although github_archived needs some fixing as it doesn't account for repos transferred to ropensci-archive yet. I would think github_archived would be true when a repo is in ropensci or ropenscilabs and is archived, OR when it's in ropensci-archive (whether it is github archived or not doesn't matter) - otherwise github_archived would be false
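The rule proposed for github_archived can be written down as a small Python sketch. The function name and signature are hypothetical, and it assumes the caller already knows the repo's owner and whether GitHub reports the repo as archived.

```python
ACTIVE_ORGS = {"ropensci", "ropenscilabs"}
ARCHIVE_ORG = "ropensci-archive"


def github_archived(owner: str, is_archived_on_github: bool) -> bool:
    """Apply the proposed rule for the github_archived field."""
    if owner == ARCHIVE_ORG:
        # Anything in ropensci-archive counts as archived,
        # whether or not the repo itself is GitHub-archived.
        return True
    if owner in ACTIVE_ORGS:
        # In ropensci or ropenscilabs, follow GitHub's archived status.
        return is_archived_on_github
    # Any other owner: not considered archived under this rule.
    return False
```

For example, `github_archived("ropensci-archive", False)` is true, while `github_archived("ropensci", False)` is false.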
What's important for me is having a single variable that determines whether the package is shown in an archived tab on the website's packages pages.
One case I can't remember: if an ropenscilabs repo isn't transferred to its maintainer account, where does it end up?
if an ropenscilabs repo isn't transferred to its maintainer account, where does it end up?
i don't know 🤷
@noamross what do you think?
re-reading the beginning of this thread, regarding "Identify packages that are neither staff maintained nor peer reviewed, to return to author GitHub namespaces": how would we even track a repo that's been transferred back to its author's account? We'd need to store the URL and status here?
Maybe if authors opt to have the repo transferred to their own account they can't expect us to keep listing it on the website? I am worried about introducing more complexity to the build systems for a few exceptions that don't even exist yet.
i guess we just add a url to this text file https://github.com/ropensci-org/makeregistry/blob/master/inst/automation/not_transferred.txt right?
But we'd need to record status in case authors don't add the badge
true - is there a default status we could use for all repos in those cases? just use Active?
well that's what we do now, but if we start letting authors have their repos move to their account instead of ropensci-archive then we need to track the status.
Can we make sure they have a repostatus badge before transferring the repo?
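One way such a check could be automated is to look for a repostatus.org badge in the README text. The sketch below is an assumption, not an existing makeregistry feature: `repostatus_from_readme` is a hypothetical helper, and the pattern assumes badge image URLs of the form `https://www.repostatus.org/badges/<version>/<status>.svg`.

```python
import re
from typing import Optional

# Assumed badge URL layout: .../badges/<version>/<status>.svg
BADGE_PATTERN = re.compile(r"repostatus\.org/badges/[^/\s]+/([a-z]+)\.svg")


def repostatus_from_readme(readme_text: str) -> Optional[str]:
    """Return the repostatus status found in a README, or None if no badge is present."""
    match = BADGE_PATTERN.search(readme_text)
    return match.group(1) if match else None
```

A result of `None` would mean the badge is missing, which is the case where the status would otherwise have to be recorded separately or defaulted.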
can you add this to the dev guide PR?
I assume you mean the PR that you just merged?
yeah, I added that so my request was outdated, sorry
isn't this already done here -> https://github.com/ropensci/dev_guide/blob/dev/maintenance_curation.Rmd#L201-L202
what's still missing here is "archived-date". @noamross do you remember why it was needed?
:wave: @noamross
I can not recall a specific reason that we would use this but I think such a record is a good one for future (un)anticipated analyses (e.g. "how long do packages go before archiving?").
Now I realize I have not stored archival dates so I'll try and explore GitHub API :grimacing:
Here are fields we need in the registry to implement the Package Curation policy

Boolean or tag fields:
- staff for staff-maintained packages
- peer-reviewed for peer-reviewed packages
- archived for packages (will be moved to new ropensci-archive namespace)
- incubator for packages in ropenscilabs (to be renamed ropensci-incubator, I guess we could also just stick with labs. Either will just need documentation everywhere).

Also:
- archived-date - date for archived packages
- archived-to - URL of repo for where archived packages are moved to

We will use this to: