Closed: cboettig closed this issue 5 years ago
What is the set of accepted values for that key?
codemeta isn't that normative; most fields are only constrained by type (e.g. URL vs text string), and nodes of type id or url can also take other nodes as arguments. (In JSON-LD it is possible to create restricted vocabularies for properties, but not super common.)
onboarding isn't a recognized property in codemeta terms, and probably doesn't appear in any other common namespace either (e.g. it's not in schema.org or Dublin Core, etc.), so the natural thing to do is just declare that it's an "ropensci" term. JSON-LD is built to be easily extensible in this way; you would just add such terms explicitly to the "context", see example.
Does that make sense?
Makes sense, yes. I was just asking what you think the values are? e.g., in review, accepted? Others?
ah right, I was thinking we'd scrape the terms directly from the text on the badge, though it's not actually clear to me where that text is coming from.
me neither - @karthik what are the possible values for the badge?
right now, we don't have a way to automate retrieval of this info into the registry.json file, but could do manually for now- things don't change that quickly in onboarding :) but of course automated is better
@sckott Well I guess we can always scrape the text from the svg source of the badge and use that, at least for the time being. (Looks like the term is Under Review, not in review.) I can add a little function to codemetar for extracting badge status metadata.
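As a rough sketch of that scraping idea (in Python for illustration; the badge URL pattern comes from later in this thread, and the assumption that the status label is the last `<text>` element of a shields-style SVG is mine, not confirmed anywhere here):

```python
import re
import urllib.request

# Pattern mentioned later in this thread; hypothetical helper names.
BADGE_URL = "https://ropensci.org/badges/{issue}_status.svg"

def parse_badge_status(svg: str):
    """Extract the status label from a badge SVG.

    Assumes a shields-style layout where the status
    is the text of the last <text> element.
    """
    texts = re.findall(r"<text[^>]*>([^<]+)</text>", svg)
    return texts[-1].strip() if texts else None

def fetch_badge_status(issue: int):
    """Download the onboarding badge for an issue and parse its status."""
    with urllib.request.urlopen(BADGE_URL.format(issue=issue)) as resp:
        return parse_badge_status(resp.read().decode("utf-8"))
```

A regex is obviously fragile here; if the SVG layout changes, a JSON endpoint (discussed further down) would be the sturdier source.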
me neither - @karthik what are the possible values for the badge?
Right now:
- unknown - anything not yet under onboarding, or still under presubmission inquiry.
- under review - anything from 1/editor check till 5/
- peer reviewed - anything marked 6/approved
So for a pkg that is likely to never go through onboarding, use unknown? Or something else?
We need to have a discussion about that. For packages that have been developed by ropensci staff before onboarding, or have been accepted as is, we should have a badge for that, separate from unknown. Maybe they can be considered reviewed?
Good point. I don't think we should badge things as reviewed that haven't been through our onboarding peer review (e.g. bound to create confusion). Some packages developed before onboarding have subsequently gone through it (EML, fishbase -- speaking of which, how do I get onboarding badges for them? And can onboarding badges link to the onboarding review?), and others might. But possibly not all.
I'd suggest we badge them as pre-review or something similar.
@sckott What would be the reasons for a package to never go through onboarding? I know we can't just swamp the review process with 100 staff-written packages, but seems there was some discussion of lulls in onboarding, and new reviewers asking why they hadn't been asked to review anything? Maybe we could slowly work those through. @noamross thoughts?
Maybe there are certain packages that aren't suitable for onboarding review? (e.g. I dunno, but I think of some of Jeroen's packages, which are clearly professionally developed but which I'd have a hard time constructively reviewing myself when most of the work is about crazy system Makevars issues... maybe we have a category for staff reviewed in that case? Or just onboard them anyway?)
The review badges are automated and are tied to the review issue labels. They get "under review" at stage 2/seeking-reviewers and "peer-reviewed" at 6/approved. We went through and labeled the old reviews retroactively a while back, so badge status should be right for most (I note EML remains at 4/review-in-awaiting-changes), except the few like fishbase that preceded the onboarding repo.
It would be good if the peer review metadata includes the URL of the reviews. In our case, the onboarding thread, but maybe we want to establish a convention that others could use, too. Like peer_reviewed: yes; review_org: ropensci; review_url: the_url.
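As a sketch, that convention might serialize to something like the following (the field names come from the sentence above; the overall shape and the placeholder URL are hypothetical, not an agreed format):

```json
{
  "peer_reviewed": true,
  "review_org": "ropensci",
  "review_url": "<link to the review thread>"
}
```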
I really don't think that we should have a "staff reviewed" tag for things that haven't gone through our process. "Developed by rOpenSci" and "rOpenSci Peer Reviewed" are two separate and complementary marks of quality but mean very different things. It's OK that we host both as long as it's clear which each package is, or both. There are lots of packages out there that we would recommend even though they are not peer reviewed - they've gone through other validating processes like CRAN, or have trusted developers. It's fine that some of our internal work rests on those other validations (especially as some of it would actually fall out of our peer review scope). Maybe we want to put RO as an author in DESCRIPTION to clarify which are RO-developed packages.
For those packages that we develop that are in scope, I do think they would benefit from peer review and we could absorb them bit by bit. We do have people who are C/Makevars experts in our reviewer pool who I try to reserve for packages that benefit from it, just as we tap people with knowledge of S4, etc., for particular packages.
Thanks Noam. So does this mean there's some database I can query to get onboarding status instead of me scraping the text from the raw SVGs?
pre-review could be a nice internal tag but that sounds even more confusing if displayed on a repo. While it would be nice to send older stuff through as time permits, many packages have been well used and have already been vetted (like rgbif and such).
I love Noam's idea of creating quick issues for such packages and just accepting them via editorial review and adding 6/approved + legacy tags to them.
how do I get onboarding badges for them? and can onboarding badges link to the onboarding review?
That is exactly what they do now. If you have an issue number, the badge is at ropensci.org/badges/issuenumber_status.svg
What would be the reasons for a package to never go through onboarding?
I don't think anyone is against it - it's just a matter of very large volume. So for in house pkgs, that'd be like 90 pkgs that would need to go through review that are on CRAN + those on github and not on cran yet
I know we can't just swamp the review process with 100 staff-written packages, but seems there was some discussion of lulls and in onboarding and new reviewers asking why they hadn't been asked to review anything?
yeah, there have been lulls - we could submit ours when there's down time
Note that I only think we should create those legacy issues for previous packages if they went through peer review similar to our current process. fishbase, for instance, was peer-reviewed, but the reviews reside in the package repo.
I see. Perhaps we can also do something else to get all the older ones in via a fast track process?
Also it might be worth thinking more about review for things submitted to us versus what we do as our jobs. Stuff written by Scott, Jeroen, and our contractors like Kirill are top quality work and those could go through short, but frequent internal code review.
could go through short, but frequent internal code review.
good idea - are there examples of this being done in small teams (not like a small team within Google, where they can afford to have people that only do code review)?
I agree that it's top-quality work, and short/frequent reviews are good, but I really think that our peer review process is our own thing and a "RO Peer Reviewed" badge or designation should mean specifically that. Perhaps we should think of a badge or some other branding mechanism to convey "Developed by rOpenSci".
Thanks @noamross. I think you make a great point that we should restrict peer-reviewed badging to things that actually went through a peer-review process (e.g. particularly a review we can link to). I'm not sure we need to distinguish between those where the review lives in an onboarding issue vs a package issue ("legacy" vs "Peer Reviewed"); wasn't it basically the same peer review process for, say, the rfishbase review as for the current ones?
But I agree it doesn't make sense to tag rOpenSci staff-developed packages as peer reviewed. Everyone's made a good point that we probably don't want to / can't review all of them. For these ones though it would be nice to have a different kind of badge to distinguish them, maybe indicate that they get an internal quality check or something. Maybe "Staff Review" to contrast to "Peer Review"?
I don't think having been on CRAN & our GitHub for a long time is a reason in-and-of-itself not to review these -- e.g. rfigshare has been on CRAN for years but probably would have a hard time even onboarding without a bunch of maintenance now. I think there's still reasonable heterogeneity in these packages which would be worth identifying.
All good points! Agreed 💯
Reading this now as I opened a duplicate issue 🤦♀️ but hey, good occasion to resuscitate this thread. 👼
I think the registry should have the following tags:
- onboarded vs. not
- staff- vs community-contributed (no other case unless unconf packages are included cf #12)
- development status, if we want to enforce having a status badge for all packages (but do we?) cf #10
- not sure if a staff-reviewed badge makes sense, unless such reviews start happening?
agree on 1 and 2. I don't think we mention a status badge to community contributors yet, do we? For staff reviewed, do you mean like one staff member reviews another staff member's package?
Status would be something like in dev/stable/abandoned. We don't recommend it currently, indeed. ☺
I only mentioned staff reviews because it had been discussed in the thread but yes that's what it would be.
I think the registry should have the following tags
onboarded vs. not
Great idea!
staff- vs community-contributed (no other case unless unconf packages are included cf #12)
This is a very useful distinction to have, especially going forward. It will provide a nice overview of the ratio of work we produce as opposed to what we curate from the community. At some point that could even determine the nature of the organization (from software producer to being more of a hub of activity).
development status if we want to enforce having a status badge for all packages (but do we?) cf #10
Given the fragility of APIs, this is a good idea. No matter how strong our code, a tweak in an API or lack in support from a data provider could quickly tank everything.
not sure if a staff-reviewed badge makes sense, unless such reviews start happening?
I don’t understand what staff reviewed means. rOpenSci folks don’t review each other’s code.
I really only included "staff-reviewed" because it came up (as an idea) in the thread above :joy:
So fields to be added are:
- onboarded vs. not
- staff- vs community-contributed
- development status (I'll try to find the best place to have the discussion about where/how to enforce/strongly suggest status badges, depending on https://github.com/ropensci/onboarding-meta/issues/26)
See also https://github.com/ropensci/codemetar/issues/45 and https://github.com/ropensci/codemetar/issues/23 regarding where the info about onboarding/reviews will live.
https://github.com/ropensci/onboarding-meta/issues/9 "I'm fine with non-peer-reviewed but mature (say, CRAN-submitted) internal RO packages moving from labs to main as long as the badge (and maybe in the future, codemeta.json) distinguishes the two."
codemetar::create_codemeta will now include information about the review if there's an onboarding review badge in the README of the package and if that issue is closed. Cf https://github.com/ropensci/codemetar/issues/23. For now the info included is basic, but it'll be extended to add editor(s) and reviewer(s).
sounds good
Closing this.
We need more data from the badges API, but that is discussed in https://github.com/ropensci/codemetar/issues/23
Right now codemetar creates a review field with the onboarding URL if there's a badge and the issue is closed.
The new version of the packages page on the website will have a comments icon with a link to the onboarding URL. https://github.com/ropensci/roweb2/pull/308
Oh, and regarding development status: in the new packages page it'll be either deduced from a repostatus.org badge or, if there's no badge, from the organization (ropenscilabs -> concept, other -> active).
At Maëlle's request, there is this now: https://badges.ropensci.org/json/onboarded.json But it is also only for onboarded packages. Not current status. Current status is also label dependent, which is problematic with our workflow (if an editor does not switch the label even if a particular phase of the review has passed, there wouldn't be an automatic change).
so how do you determine pending vs not @karthik ?
same way the badges are done. See this comment: https://github.com/ropensci/roregistry/issues/9#issuecomment-316531715
it would be great to be able to check onboarding status info in the registry. There's not a really obvious term for this in codemeta terms; there is developmentStatus, but that's described as taking a https://repostatus.org term. We could just add it as an additional, ropensci-specific property, e.g.
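A minimal sketch of what that context extension could look like (the codemeta context URL is the real one; the ropensci namespace URL, the property name, and the issue link are placeholders, not an agreed convention):

```json
{
  "@context": [
    "https://doi.org/10.5063/schema/codemeta-2.0",
    {
      "ropensci": "http://purl.org/ropensci/terms/",
      "onboarding": "ropensci:onboarding"
    }
  ],
  "onboarding": "<link to the onboarding review issue>"
}
```

Declaring the term in the context this way keeps the document valid JSON-LD while making clear that onboarding is an rOpenSci-specific property rather than a codemeta or schema.org one.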