apache / www-site

The ASF Website
Apache License 2.0

Inaccurate project count #307

Closed sebbASF closed 10 months ago

sebbASF commented 10 months ago

https://github.com/apache/www-site/blob/78317a9ab4ca267ef948c25b73c373920600963c/content/dev/project-requirements.md?plain=1#L8

At present there are 209 PMCs and 24 podlings (i.e. total 233). This is far short of the 350 mentioned in the text.

bproffitt commented 10 months ago

According to the annual report, which is currently being reviewed by infra for accuracy:

When the number is confirmed, we will make any needed adjustments accordingly, both in the annual report and on this site.

sebbASF commented 10 months ago

The issue here is what does a software project community mean? I read it as meaning the PMC or podling.

The Commons developers maintain many distinct components, but it would be misleading to say that there are distinct communities for each.

clr-apache commented 10 months ago

Some PMCs have quite distinct projects. For example, DB has JDO, Torque, and Derby.

On Aug 25, 2023, at 16:40, sebbASF wrote:

The issue here is what does a software project community mean? I read it as meaning the PMC or podling.

The Commons developers maintain many distinct components, but it would be misleading to say that there are distinct communities for each.


Craig L Russell

hboutemy commented 10 months ago

see https://projects.apache.org/projects.html?committee for what is counted (= what projects have been declared by PMCs)

sebbASF commented 10 months ago

> Some PMCs have quite distinct projects. For example, DB has JDO, Torque, and Derby.

Yes, but are these distinct communities?

sebbASF commented 10 months ago

> see https://projects.apache.org/projects.html?committee for what is counted (= what projects have been declared by PMCs)

Listing on that site is optional: not all projects have DOAPs, and a DOAP, once created, may not be maintained.

For example, there seems to be no DOAP for JDO.

Also it looks like the algorithm for counting projects has some flaws.

AFAICT it counts the number of DOAPs that don't have the same name as the parent PMC and adds that to the number of PMCs + PPMCs. This can result in some double counting, because there may be no project corresponding to the bare name of the PMC (e.g. httpcomponents). It may also include Attic projects (I need to check that).
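A minimal sketch of the counting rule as described above, to show where the double counting comes from. The `Doap` record, both function names, and the sample data are hypothetical; this is not the actual projects.apache.org code, and `deduplicated_count` is only one possible correction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doap:
    name: str               # project name declared in the DOAP file
    pmc: str                # parent PMC (or podling) that owns the DOAP
    in_attic: bool = False  # True if the project has been retired


def naive_count(committees, doaps):
    """Committees plus every DOAP whose name differs from its parent PMC."""
    extra = sum(1 for d in doaps if d.name != d.pmc)
    return len(committees) + extra


def deduplicated_count(committees, doaps):
    """Each live DOAP once, plus each committee that has no live DOAP."""
    live = [d for d in doaps if not d.in_attic]
    covered = {d.pmc for d in live}
    return len(live) + sum(1 for c in committees if c not in covered)


# Illustrative data only: a PMC with three projects and no project named
# after the PMC itself (the httpcomponents situation), plus a made-up PMC
# with one retired project.
committees = ["httpcomponents", "example"]
doaps = [
    Doap("hc-one", "httpcomponents"),
    Doap("hc-two", "httpcomponents"),
    Doap("hc-three", "httpcomponents"),
    Doap("example", "example"),
    Doap("example-tools", "example"),
    Doap("example-old", "example", in_attic=True),
]

naive_total = naive_count(committees, doaps)
deduplicated_total = deduplicated_count(committees, doaps)
```

With this data the naive rule counts httpcomponents four times over (once as a PMC and once per DOAP) and also counts the retired project, so it reports two more than the deduplicated rule.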

sebbASF commented 10 months ago

It does appear to include Attic projects.

hboutemy commented 10 months ago

> That site is optional; not all projects have DOAPs and once created, may not be maintained. e.g. there seems to be no DOAP for JDO

Yes, what each PMC considers a "project" is not clear, and whether they have done the work of declaring it so that it is officially counted is another step. We are having again today a discussion that we had 5 years ago, when we built the new projects.apache.org site. I tried to document the logic as much as possible in the "About" page, but it has always been hard to work out how to get the PMCs to update their data.

> Also it looks like the algorithm for counting projects has some flaws. It does appear to include Attic projects.

It may be true: don't hesitate to update the algorithms

clr-apache commented 10 months ago

On Aug 26, 2023, at 00:57, sebbASF wrote:

Some PMCs have quite distinct projects. For example, DB has JDO, Torque, and Derby.

Yes, but are these distinct communities?

FWIW, yes, these are distinct communities with their own user/dev lists, releases, and web sites. DB was established way back when umbrella PMCs were common. Now, not so much.

Craig L Russell

sebbASF commented 10 months ago

That may be true for DB, but httpcomponents for example has 3 projects which are definitely managed by the same community and Commons has lots of components all managed by the same community.

Is it possible to distinguish which PMCs have multiple sub-communities and which don't? [There may be some PMCs that have sub-communities that work on more than 1 project.]

Unless it is possible to distinguish these, it is not possible to produce the figures for such a concept. If we cannot justify the numbers, we should not publish them.

sebbASF commented 10 months ago

> It may be true: don't hesitate to update the algorithms

I have made some changes to the calculations. Attic projects are no longer counted, and there should be no more double-counting of projects.

This has reduced the count of PMC projects to 294 (from 366), and the total initiatives to 300+

The actual count of PMC projects may be somewhat higher due to missing DOAPs, but I doubt there are as many as 70 missing (and some DOAPs may relate to retired projects).
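For reference, the arithmetic behind these figures, as a quick sketch. The variable names are illustrative, and the 350 is the figure quoted at the top of this issue, treated here as a lower bound.

```python
previous_count = 366  # PMC projects before the change
revised_count = 294   # PMC projects after removing Attic entries and double counts
website_claim = 350   # figure quoted on the website, per the issue description

# Entries dropped by the revised calculation.
removed = previous_count - revised_count

# Missing DOAPs that would be needed for the revised count to reach the claim.
needed_for_claim = website_claim - revised_count
```

So the change removed 72 entries, and 56 projects without DOAPs would be needed to reach 350.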

bproffitt commented 10 months ago

I disagree with the DOAP method of counting. M&P has coordinated with Infra in the past to get the 366 number, and we will continue to work with them to come up with this stat. In any case, people I have spoken with who have assisted projects with missing DOAP files have no problem believing there are over 70 missing DOAP files.

M&P will keep its current counting methodology. Closing.

sebbASF commented 10 months ago

I agree that counting with DOAPs is inaccurate, but that is what I understood was being used for the count of over 350.

Have Infra provided information on how the count is being calculated?

But that is not my only concern: there may well be 350+ project initiatives, but are these really all distinct software communities? What is meant by a software community if it is not a PMC?

rbowen commented 10 months ago

If I might get a little pedantic, Charles Vogl says[1] that a community is a group of people with a shared goal. What makes a community "distinct" is that goal, not necessarily the individuals comprising the group. So, is Apache Commons really 43 "distinct communities"? Well, yeah, sort of. It depends on how you look at it.

FWIW, looking at https://projects.apache.org/projects.html?committee I count 209 committees and 377 projects, with 55 in the attic. That would leave 28 or so projects that, presumably, have missing DOAP files, which I find completely plausible.
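The "28 or so" above can be reproduced as follows, assuming it is the gap between the live listed projects and the 350 figure from the website text (the comment does not state that baseline explicitly). Variable names are illustrative.

```python
listed_projects = 377  # projects counted on projects.apache.org, per the comment
attic_projects = 55    # of those, in the Attic
claimed_total = 350    # assumed baseline: the figure quoted on the website

# Projects listed on the site that are not retired.
live_listed = listed_projects - attic_projects

# Projects that would have to exist without a DOAP to reach the claimed total.
implied_missing = claimed_total - live_listed
```

That gives 322 live listed projects and an implied 28 without DOAP files.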

[1] https://www.amazon.com/Art-Community-Seven-Principles-Belonging/dp/1626568413

sebbASF commented 10 months ago

It depends on how a goal is defined. Does Commons have goals beyond maintaining its components?

The issue I have is that the term is not clearly defined, and I have yet to see any details of how the number is calculated.