pangeo-data / governance

Governance Documents for Pangeo
Creative Commons Attribution 4.0 International

Governance document is out of line with Mission Statement #23

Open mcgibbon opened 6 years ago

mcgibbon commented 6 years ago

The Governance document begins with:

The Pangeo Project (The Project) is an open source software project. The goal of The Project is to develop open source software and related technology for the analysis of large scientific datasets. The Project endeavors to extend the broader scientific software ecosystem.

The Mission statement is:

Our mission is to cultivate an ecosystem in which the next generation of open-source analysis tools for ocean, atmosphere and climate science can be developed, distributed, and sustained. These tools must be scalable in order to meet the current and future challenges of big data, and these solutions should leverage the existing expertise outside of the geoscience community.

This is further specified with three goals:

  1. Foster collaboration around the open source scientific python ecosystem for ocean / atmosphere / land / climate science.
  2. Support the development with domain-specific geoscience packages.
  3. Improve scalability of these tools to handle petabyte-scale datasets on HPC and cloud platforms.

The Governance document's description of The Project omits what the first goal states and the Mission Statement alludes to: fostering collaboration among scientists around the software ecosystem. This includes distributing and spreading the word about tools, getting users of tools to give feedback to developers, and networking scientists who can collaborate on tools.

The Governance document description of The Project also adds in the specification that the project is solely "for the analysis of large scientific datasets".

For example, holding a conference to network scientists working on various open-source scientific projects for collaboration and brainstorming of new projects would clearly fall under the Mission Statement, but not necessarily under the Governance document description. Under the Governance document description, you'd instead expect a conference of scientists specifically working on Pangeo projects to meet to work on those Pangeo projects (which are for analyzing large scientific datasets).

Here I have to take a detour to explain why these differences matter to me.

The initial meeting of The Project connected me with @JoyMonteiro, and we spent much of that meeting and the following months developing Sympl and CliMT, ostensibly as Pangeo-affiliated projects. In the two years that followed, my impression was that the group as an online entity had become defunct (there was no memo that everything was moved to Github). As a result, Sympl and CliMT have grown apart from Pangeo.

Those projects were made with Pangeo in mind, in the following ways:

Notice that the above goals of our projects have nothing to do with analyzing large datasets.

Coming back from that detour.

Ideally, now that I know Pangeo is still here, I'd like to bring Sympl (and, with @JoyMonteiro's blessing, CliMT) back into the Pangeo fold. However, that brings us back to the conflict between the Mission Statement (which reflects the original intention of The Project) and the Governance document (which, recently drafted, reflects at least someone's current understanding of what The Project is supposed to be).

Should the Governance document be revised to reflect the original intention of The Project, or should the Mission Statement be updated to reflect a newer state of The Project? This is related to the question: do Sympl and CliMT have a place in Pangeo?

rabernat commented 6 years ago

do Sympl and CliMT have a place in Pangeo?

Absolutely, yes! We should broaden the statement in the governance document to encompass these efforts. I think "interoperability" trumps any other specific focus.

The NSF Earthcube award obviously steered things in a certain direction. Having specific deliverables for which we are accountable to NSF to produce has made us very focused. Now is a great time to zoom out and look at the broader landscape.

In the two years that followed, my impression was that the group as an online entity had become defunct (there was no memo that everything was moved to Github)

Jeremy, I don't feel that this is fair. You can find the memo right here: https://groups.google.com/forum/#!topic/pangeo/pFKILby3cuI

In this email, I said:

We plan to conduct all of our work via the pangeo-data GitHub organization. In particular, we have a new “pangeo-discussion” repo we are using just as an issue tracker and wiki: https://github.com/pangeo-data/pangeo-discussion Please join in these discussions freely!

mcgibbon commented 6 years ago

Good to hear! With your blessing I'll work on a PR to broaden the opening statement in the coming week, unless someone else wants to take that responsibility.

I'll also think about how to approach re-integrating Sympl and CliMT into The Project.

You're right, @rabernat, I was somewhat misinformed. I missed that memo. In my defense, it was really easy to miss and to misunderstand! That email (subject "Announcing the Pangeo NSF Earthcube Award!") was clearly about something entirely different from deprecating the mailing list. The move was only mentioned in the second-to-last paragraph of a reasonably long announcement, and it's not clear that "we have a new discussion repo" means "important announcements will no longer be posted to the mailing list". That very discussion repo says:

For now, community discussion is happening on the GitHub issues page or on the pangeo google group.

(emphasis my own)

We're missing the point, though: there is nobody to blame for the mailing list becoming defunct, and I am not laying any blame (I'm certainly not blaming you!). I am simply saying that this is why I became out of touch with what was going on in the project.

JoyMonteiro commented 6 years ago

Yes, it would definitely be great to include climt along with the other tools that use xarray. I was unsure whether and how sympl/climt would fit into the pangeo ecosystem, but thinking about it in terms of interoperability as @rabernat mentioned makes a lot of sense!

mcgibbon commented 6 years ago

While doing this modification, I'm also noticing that the explanation of Contributors in the governance document is a little programming-focused. I'm going to include some edits in the PR to bring it more in line with the idea that community building, education, discussion, outreach, etc. are important ways to be a Contributor (pending your comments on those edits, of course).