heliophysicsPy / standards


PHEP 4: PyHC Package Tiering #31

Open jibarnum opened 2 months ago

jibarnum commented 2 months ago

This PR proposes a new process PHEP for PyHC. PHEP 4 establishes a new tiering structure for PyHC projects, which will automatically affect PyHC packages once it goes into effect. Included herein is information on the requirements for each of the four new tiers of PyHC projects (Gold, Silver, Bronze, and Copper), as well as the benefits accrued at each tier.

jibarnum commented 2 months ago

@jameswilburlewis @aburrell @rweigel @sandyfreelance @darrendezeeuw I can't add you all as reviewers (I think I need to invite you to the PyHC org on GitHub first), but I'm tagging you here for your awareness and comments.

sapols commented 2 months ago

Initial thoughts/issues:

aburrell commented 2 months ago

@sapols "copper" was my suggestion, to keep with the medal terminology. It's the next medal after "bronze".

rebeccaringuette commented 1 month ago

I would like to echo Shawn's comments on this PHEP being a great step forward for PyHC. Some comments:

jibarnum commented 1 month ago

@sapols thanks for your thoughts.

Is it a typo that the first table ends with "Copper" instead of "Honorable Mention"?

No, as @aburrell pointed out, that was changed to keep with the "medal" terminology we used for the other categories.

It'd be helpful to add a hyperlink to the PyHC env to clarify which env we mean. Probably even a specific Docker image for extra clarity? Although (and maybe this is a bigger question) do we need a new "PyHC env" to facilitate this? The purpose of the current one is to hold all PyHC packages, whereas this PHEP specifies only Gold-tier packages get inclusion in the env. (Which also begs the question how will packages know if they're compatible with the env if they're not included in it?)

I think we want to establish some specific environments for this. @rebeccaringuette had the interesting suggestion in her comment (below yours) re creating two environments. I think some kind of split of Gold + Silver and then Gold + Silver + Bronze for PyHC-top-tier and PyHC-all environments, respectively (happy for some help in workshopping that terminology).
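A minimal sketch of that split, assuming made-up package names and tier assignments (nothing here reflects real PyHC packages, and the environment names are still up for workshopping):

```python
# Hypothetical package-to-tier assignments, for illustration only.
PACKAGE_TIERS = {
    "pkg_a": "gold",
    "pkg_b": "silver",
    "pkg_c": "bronze",
    "pkg_d": "copper",
}

# Which tiers each proposed environment would admit (assumed from the split above).
ENVIRONMENTS = {
    "PyHC-top-tier": {"gold", "silver"},
    "PyHC-all": {"gold", "silver", "bronze"},
}


def packages_for(env_name, package_tiers=PACKAGE_TIERS):
    """Return the sorted package names whose tier is admitted by env_name."""
    allowed = ENVIRONMENTS[env_name]
    return sorted(name for name, tier in package_tiers.items() if tier in allowed)


print(packages_for("PyHC-top-tier"))
print(packages_for("PyHC-all"))
```

Under these assumptions, Copper packages land in neither environment, which is one of the points that would still need deciding.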

Question: how will this affect "core" package status? Will "core" packages still exist, or does Gold-tier become the new "core"?

I think this would make core go away, yes, leaving "Gold" as the highest level. It'd get confusing in my mind to delineate the differences between Gold and core. Further, we've always struggled to say what exactly it meant to be a core package, or how to become a core package (apart from a nod of approval from current leadership and core package maintainers).

jibarnum commented 1 month ago

@rebeccaringuette thanks for your thoughts above!

Agree that benefit items like Python env inclusion and chat bot inclusion should only be available to...

Indeed, I'm trying to make that a bit more clear in the soon-to-come commit.

We need two versions of the PyHC environment to avoid creating an environment so large that no one wants to wait for it to install/load...

I like this thought. I'll include it. However, I do wonder how we intend to include the bronze categories, which allow some major conflicts to exist with installation into the software environment... thoughts?

agreed that standards compliance assistance should be available to all upon request

I mostly agree. I think if you're already at Gold, you probably will only get assistance if you're in danger of dropping down a level.

also hesitant about including interoperability status...

Yeah, I nixed that one. The metadata suggestion is good, though can you elaborate on how we would evaluate that?

also agree on including the PyHC env installation

Yep.

package DOI should be for the software repository,

Sure, that makes sense.

PyHC standard grades should not be determined by self-evaluation...

Indeed, and thus the point of doing a pyOpenSci review process. But the self-evaluation is just step one to getting there. Shawn does also do a general review to make sure the grades are commensurate with the state of a repository.

need to specify the current PyHC env...

Sure.

like the idea of the term 'core packages'...

Same, I'm nixing that once (if) this PHEP goes into place.

need to make some funding available for packages

For sure. First we need to get a good definition on what we want for PyHC-specific requirements for a pyOpenSci process to show we have the process in place and ready to go for packages.

Need to state a time frame for packages to submit the tier they best align with

For sure, I need to include some wording on this. I don't want to wait too long, so perhaps 6 months is best. I'll find out soon if that's a terrible idea by how many tomatoes are thrown my way with the next commit. :)

jibarnum commented 1 month ago

Alright, all. Tried to catch and incorporate as many comments as I could. Please review and let me know what concerns/suggestions I didn't capture or have come up with the changes. Thanks!

jibarnum commented 3 weeks ago

A note that this will supersede the existing project submission process is probably helpful. Also potentially a list of differences:

- Self-evaluation is now just the starting point; TSC evaluation is required
- Additional requirements beyond the main PyHC standards

Plus, of course, a commitment to update the submission process.

Sure, that makes sense, @jtniehof.

jtniehof commented 2 weeks ago

Should there be an explicit closes #30 on this?

rstoneback commented 2 weeks ago

NASA funding is already requiring that software proposals satisfy PyHC standards. Did NASA check with us before adding that to funding announcements? Does applying PyHC standards in funding announcements comport with APA standards and U.S. agency rule making? What standards level is going to apply to NASA funding? Gold, silver, bronze, or copper?

Incidentally, my interest level in providing free labor to NASA, in the form of standards or otherwise, is quite low.

rebeccaringuette commented 2 weeks ago

That is a discussion to have with HDRL and NASA HQ once this gets settled. My initial thoughts are to require bronze as a minimum for software packages starting out. This sets the bar low, but still requires basic FAIR (e.g. DOI, license, pip for reusability, PyHC env for interoperability, and similar). Proposals from a bronze package (or copper) could alternatively ask for funds to improve the level to silver or gold in a detailed manner, e.g. the pyOpenSci review process.

rebeccaringuette commented 2 weeks ago

Also in the table, the HSSI row needs some work. Recommended copper = all mandatory fields, bronze = all mandatory and some recommended fields, silver = all mandatory and recommended fields, gold = all mandatory and recommended fields plus some optional fields. *See HSSI metadata schema for details. In addition to metadata fields, gold and silver level packages should have priority consideration in contributing to the controlled vocabularies used by HSSI. These packages would also be eligible, subject to review, to manage those controlled vocabularies in a rotating fashion under HSSI metadata leadership. Contributing to the controlled vocabularies should be a requirement for gold level packages (e.g. are we missing anything).
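To make the proposed HSSI row concrete, here is a rough sketch of how the mapping from filled-in metadata fields to a tier could be checked. The field names below are placeholders, not the actual HSSI schema (which is not yet published), and treating "some" as "at least one" is an assumption:

```python
from enum import IntEnum


class Tier(IntEnum):
    NONE = 0
    COPPER = 1
    BRONZE = 2
    SILVER = 3
    GOLD = 4


# Placeholder field names; the real HSSI metadata schema may differ entirely.
MANDATORY = {"name", "description", "license"}
RECOMMENDED = {"doi", "keywords"}
OPTIONAL = {"funder", "related_publications"}


def hssi_tier(filled):
    """Map a set of filled-in metadata fields to the tier proposed above."""
    if not MANDATORY <= filled:
        return Tier.NONE      # missing a mandatory field: no tier reached
    recommended_filled = filled & RECOMMENDED
    if not recommended_filled:
        return Tier.COPPER    # all mandatory fields only
    if recommended_filled != RECOMMENDED:
        return Tier.BRONZE    # all mandatory, some recommended
    if filled & OPTIONAL:
        return Tier.GOLD      # all mandatory and recommended, some optional
    return Tier.SILVER        # all mandatory and recommended


print(hssi_tier(MANDATORY | {"doi"}).name)
```

This only covers the metadata row; the controlled-vocabulary contribution for gold would need a separate check.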

rstoneback commented 2 weeks ago

That is a discussion to have with HDRL and NASA HQ once this gets settled.

I disagree. If NASA wants to set the standards then they should set the standard. It should also be applied not just to heliophysics, but to the Earth, Planetary, and Astrophysics divisions. If NASA wants to use the PyHC standards then PyHC sets the standard, not NASA. I will repeat, however, that I think it is inappropriate for NASA to use the results of unfunded labor.

rebeccaringuette commented 2 weeks ago

That is a discussion to have with HDRL and NASA HQ once this gets settled.

I disagree. If NASA wants to set the standards then they should set the standard. It should also be applied not just to heliophysics, but to the Earth, Planetary, and Astrophysics divisions. If NASA wants to use the PyHC standards then PyHC sets the standard, not NASA. I will repeat, however, that I think it is inappropriate for NASA to use the results of unfunded labor.

The PyHC standards apply only to software relevant to Heliophysics and written in or run from Python, nothing more, and cannot be applied across NASA's divisions or even to other software in Heliophysics. Concerning the funding comment, the PyHC standards are mentioned as conditions on NASA funding opportunities, particularly the HTM call, so the requirement is not unfunded. I don't recall at the moment if it is mentioned on other calls. Since PyHC is now moving to tiered standards, the conversation between PyHC leadership, HDRL leadership and NASA HQ will likely be about which tier to set as a minimum standard for an updated version of those funding calls, assuming that HQ decides to change the wording of that AO and others at all. The decision of which tier a given proposal chooses to adhere to (and how they intend to adhere to it) may instead be left to the proposal submitter, and would then be subject to the scrutiny of the proposal reviewers.

jibarnum commented 2 weeks ago

Should there be an explicit closes #30 on this?

Yes, I'd say so!

rebeccaringuette commented 2 weeks ago

Also in the table, the HSSI row needs some work. Recommended copper = all mandatory fields, bronze = all mandatory and some recommended fields, silver = all mandatory and recommended fields, gold = all mandatory and recommended fields plus some optional fields. *See HSSI metadata schema for details. In addition to metadata fields, gold and silver level packages should have priority consideration in contributing to the controlled vocabularies used by HSSI. These packages would also be eligible, subject to review, to manage those controlled vocabularies in a rotating fashion under HSSI metadata leadership. Contributing to the controlled vocabularies should be a requirement for gold level packages (e.g. are we missing anything).

@jibarnum

jibarnum commented 2 weeks ago

Also in the table, the HSSI row needs some work. Recommended copper = all mandatory fields, bronze = all mandatory and some recommended fields, silver = all mandatory and recommended fields, gold = all mandatory and recommended fields plus some optional fields. *See HSSI metadata schema for details. In addition to metadata fields, gold and silver level packages should have priority consideration in contributing to the controlled vocabularies used by HSSI. These packages would also be eligible, subject to review, to manage those controlled vocabularies in a rotating fashion under HSSI metadata leadership. Contributing to the controlled vocabularies should be a requirement for gold level packages (e.g. are we missing anything).

Sure. I just went with what you'd said earlier for each level. I can update. I feel the HSSI metadata schema will require a url. Do we have one at the moment?

jibarnum commented 2 weeks ago

@rstoneback since HTM calls often closely align with the PyHC, and to the end of not siloing efforts, NASA made the choice to include our standards in their calls (to my knowledge, this is just for HTM). I was asked about wording for this, and provided what is shown therein. NASA could, in theory, go off and write their own things, but I suppose why reinvent the wheel if not necessary?

As @rebeccaringuette said, it will require some discussion with NASA on whether they want to update AO calls to match the new process we have, and if so, to what level. I'm not convinced it's appropriate to define here which level NASA funding calls will subscribe to. That's outside the scope of this PHEP, and it would be wrong for us to levy that requirement on NASA since we're... not NASA.

I empathize with the funding concerns. The HTM call, albeit small at the moment, does have room for package maintenance funding requests. I strongly believe updating to better align with new PyHC tiering/PHEPs for standards would be a legitimate funding request. If enough packages are submitting those kinds of requests, that may even encourage NASA to start putting more money behind that (crosses fingers).

rebeccaringuette commented 1 week ago

Also in the table, the HSSI row needs some work. Recommended copper = all mandatory fields, bronze = all mandatory and some recommended fields, silver = all mandatory and recommended fields, gold = all mandatory and recommended fields plus some optional fields. *See HSSI metadata schema for details. In addition to metadata fields, gold and silver level packages should have priority consideration in contributing to the controlled vocabularies used by HSSI. These packages would also be eligible, subject to review, to manage those controlled vocabularies in a rotating fashion under HSSI metadata leadership. Contributing to the controlled vocabularies should be a requirement for gold level packages (e.g. are we missing anything).

Sure. I just went with what you'd said earlier for each level. I can update. I feel the HSSI metadata schema will require a url. Do we have one at the moment?

No, and likely not for a few months. We will need some tech support before that is available.

rebeccaringuette commented 1 week ago

What is this group's opinion on shifting the conda installation requirement to the silver level? It would simplify installation in the PyHC environment, especially on Heliocloud, but would such a requirement at the silver level be too formidable a hurdle, so that it should only be at the gold level, or is it a simple enough task to include at the silver level? Note that pip installation is required at the bronze level.

nabobalis commented 1 week ago

What is this group's opinion on shifting the conda installation requirement to the silver level? It would simplify installation in the PyHC environment, especially on Heliocloud, but would such a requirement at the silver level be too formidable a hurdle, so that it should only be at the gold level, or is it a simple enough task to include at the silver level? Note that pip installation is required at the bronze level.

For me, this should be at the bronze level.

sapols commented 1 week ago

I'll note that I intend to submit a proposal to hire a student developer whose sole job (at first) is to help PyHC packages join conda. No promises on how soon that could happen though, of course. I could buy conda installation being a silver-level thing if enough devs agree, but bronze is too low (as much as I'd love to do that, bronze just isn't realistic).

nabobalis commented 1 week ago

Unless you have compiled code, creating a conda-forge recipe is no more difficult than setting up the Python packaging required to get on PyPI.

So for me, it should be at the same level as pip

jibarnum commented 1 week ago

Unless you have compiled code, creating a conda-forge recipe is no more difficult than setting up the Python packaging required to get on PyPI.

So for me, it should be at the same level as pip

There are a few PyHC core packages not yet on conda (e.g. SpacePy IIRC @jtniehof ). It'd be good to hear from them on what the blockers are before deciding to relax the requirement down to silver or bronze.

rebeccaringuette commented 1 week ago

Thanks for the comments, Nabil. Absolutely, Julie. Would like to hear comments on this from others too. @jibarnum I have added this PHEP as a suggested unconference topic for the fall meeting, but it will need some structure to the discussion or it will be all over the place.


nabobalis commented 1 week ago

Maybe if a package is pure Python, the conda requirement should be at bronze, but for more complex packages we bump that to silver?

But that might be too in the weeds for a rule or requirement.