knative / community

Knative governance and community material.
https://knative.dev/community

Migrate community stats away from teststats.cncf.io #309

Closed bsnchan closed 2 years ago

bsnchan commented 3 years ago

The Knative election process currently points to teststats.cncf.io to determine voter eligibility for both TOC and Steering elections.

@jberkus mentioned to us that teststats is a staging area for devstats, and that there is no guarantee the Knative stats won't vanish tomorrow. The statistics are still there only because the admin team has not redeployed recently; once they do, the statistics will go away.

Would it make sense for us to host our own instance?

cc/ @knative/steering-committee @knative/technical-oversight-committee @knative/productivity-wg-leads

markusthoemmes commented 3 years ago

Interesting. I think keeping the statistics alive would be very desirable, though I don't know if we need to host the entire thing necessarily. More static reports might be enough too if hosting our own instance would be an issue.

jberkus commented 3 years ago

The EC met today and realized there's a second issue with teststats: since it's not maintained, any knative/ repos created in the last year aren't represented.

We can host our own devstats. It's an OSS project, and is even packaged to run on a kubernetes instance. That said, setup is complicated.

We could also use another service, like bitergia or something.

mattmoor commented 3 years ago

I believe that we used to host our own, and stopped for some reason. Maybe @chaodaiG or @chizhg recall, but it may predate them.

chaodaiG commented 3 years ago

There used to be a self-hosted devstats, even hosted on Knative Serving. The effort was shelved for complicated reasons, including data inconsistencies compared with the CNCF-hosted stats and a member leaving the team. The source code was not merged, IIRC, but it should be recoverable from https://github.com/ericKlawitter/test-infra if desired.

bsnchan commented 3 years ago

Looking at where we have historically used these metrics, the metrics which are important IMO are broken down as follows:

For TOC and Steering Elections:

For the annual report (this was at least the data used in the 2019 report):

Nice metrics to have which currently aren't captured are:

pmorie commented 3 years ago

I'm working on getting us a cluster for devstats.

jberkus commented 3 years ago

Brenda,

* We would need Contribution count by company as well (to ensure that we maintain the no vendor majority rule)
* Companies contributing within a 1 year period

This information is already inaccurate in teststats. When Knative didn't join CNCF last year, CNCF staff stopped maintaining the company affiliation information, which has to be updated manually. When we do our own thing, someone will need to be assigned to maintain company affiliation data, because there is no automatable source for it.
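As a rough illustration of the bookkeeping involved (a minimal sketch, not how devstats does it): the snippet below rolls per-author contribution counts up to companies using a hand-maintained affiliation map, and reports how much of the total lands in an "Unknown" bucket. The `AFFILIATIONS` map, the sample logins and company names, and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical, hand-maintained map of contributor login -> company.
# Nothing here comes from devstats; it only illustrates the manual upkeep.
AFFILIATIONS = {
    "alice": "ExampleCorp",
    "bob": "ExampleCorp",
    "carol": "OtherCo",
}


def contributions_by_company(contributions, affiliations):
    """Sum contribution counts per company.

    `contributions` is an iterable of (login, count) pairs. Logins missing
    from `affiliations` are grouped under "Unknown".
    """
    counts = Counter()
    for login, count in contributions:
        counts[affiliations.get(login, "Unknown")] += count
    return counts


def unknown_share(counts):
    """Return the fraction of all contributions with no known company."""
    total = sum(counts.values())
    return counts.get("Unknown", 0) / total if total else 0.0


# Made-up sample data: "dave" has no affiliation entry.
sample = [("alice", 10), ("bob", 5), ("carol", 3), ("dave", 2)]
counts = contributions_by_company(sample, AFFILIATIONS)
share = unknown_share(counts)

print(counts.most_common())
print(f"unaffiliated share: {share:.0%}")
```

A large "Unknown" share is a hint that the affiliation map needs updating before the per-company numbers can be trusted.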

* Members in the community (can be determined by knative-dev@ group)

As a warning, Google has added new restrictions on exporting member lists from Groups, so this might not be a usable data source anymore either.

* Gender Diversity (this has been removed from devstats, so maybe there is a reason for that)

It was removed due to poor accuracy and no clear way to correct that.

Nice metrics to have which currently aren't captured are:

* Time for PR/Issue engagement

Teststats actually has this: https://knative.teststats.cncf.io/d/10/pr-time-to-engagement?orgId=1

I'm going to mention one possible alternative to running our own devstats, although I'll help with that if that's the way you want to go: Cauldron.

jberkus commented 3 years ago

Never mind. I've been playing with Cauldron, and it's not going to work for us, simply because there's no access to charts for non-authenticated users. In other words, everyone in the project would need an account to see anything.

vaikas commented 3 years ago

@pmorie I think you had some updates for this issue and some findings. Would you mind updating?

pmorie commented 3 years ago

We're working out the last couple of kinks using Cauldron. You can see the project here. @mbehrendt has made some visualizations and queries that replicate the functionality of devstats, but we are working through a Kibana RBAC issue before they will work correctly outside his Kibana tenant.

vaikas commented 3 years ago

@pmorie thanks for the update! How hard would it be to add the various knative-sandbox/* repos there also?

pmorie commented 3 years ago

I've added the knative-sandbox repos to the cauldron instance. I have a thread open with our cauldron instance owners about making the rbac permission changes so that we can share the visualizations.

pmorie commented 3 years ago

We are very close, just waiting to complete the step of exporting the dashboards @mbehrendt created so they can be made public.

vaikas commented 3 years ago

Should be good to go, let's compare the cauldron with the metrics we are expecting to track. @bsnchan @mbehrendt to check it out.

vaikas commented 3 years ago

Dashboard is done; access issues are still being worked on. We are still verifying the numbers, which seem to be off. @mbehrendt is verifying the contributor stats: there are a great many unknown orgs/sources, which seems odd. We're checking this to make sure the numbers are right before we start relying on them.

vaikas commented 3 years ago

Can we throw some money at this problem? Currently this is way too hard, and hosted Cauldron strips out the PII that we actually need to gather stats. One option is Bitergia (sp?). The Linux Foundation also has a hosted devstats-like system that perhaps we can use: https://insights.lfx.linuxfoundation.org/projects/korg/dashboard. @thisisnotapril will set up a test config to see if this solves it for us.

thisisnotapril commented 3 years ago

I signed up for the LF Insights; haven't heard back yet but will ping them!

bsnchan commented 3 years ago

Update: We're still waiting to hear back from the Linux Foundation. @thisisnotapril will send an email to the Linux Foundation to get an update on this

jberkus commented 3 years ago

I'm also going to meet with the cauldron folks; while Paul is on vacation, is there someone who can give me a rundown of how Cauldron was not working? Is it just the PII, or are there other elements?

bsnchan commented 3 years ago

I believe it was just PII but @mbehrendt should have more details that he can share with you.

vaikas commented 3 years ago

Still tracking down the appropriate folks at LF (@thisisnotapril is driving this). The Google implementation as well as the Cauldron instance had large numbers of folks who were not associated with an org; that probably needs to be figured out. Bitergia is the managed instance of Cauldron.

The most immediate need was for the TOC election, and we're good there, so we still need to fix this, but it is not quite as urgent. We should use devstats while we have it to validate whatever the new system will be. According to the rumor mill, running our own instance of devstats will require a non-trivial amount of care and feeding.

jberkus commented 3 years ago

Keep in mind that organizational affiliation in devstats is not accurate now.

And yeah, as the other contributor to devstats, I am still not proposing that we run our own instance. Because of Kubernetes' needs, Devstats is designed to basically digest the entire Github change stream, which gives it fairly substantial hardware requirements. We really need something smaller.

vaikas commented 3 years ago

LFXInsights is not taking on any more projects.

vaikas commented 3 years ago

Cauldron update:

Let's try to get a date from LFXInsights so we can see if this is even doable before SC elections next fall.

vaikas commented 3 years ago

Update: @jberkus offered to help look into a tool called Augur: https://github.com/chaoss/augur

geekygirldawn commented 3 years ago

I've been using Augur for quite a while for some of my metrics, so let me know if I can help out in any way!

jberkus commented 3 years ago

@geekygirldawn what are you using for a visualization UI? Augur doesn't seem to come with anything.

geekygirldawn commented 3 years ago

Augur does have a front end with minimal visualizations built using Vue.js and Vega/Vega-Lite. More info: https://oss-augur.readthedocs.io/en/master/getting-started/frontend.html https://github.com/chaoss/augur/tree/master/frontend

However, I haven't found the default visualizations particularly useful, so I've written my own Python scripts that use Postgres queries to get the data and the Matplotlib/Seaborn libraries to build the graphs. Here are a couple of examples from various people within the Augur project, built as Jupyter Notebooks: https://github.com/chaoss/augur-community-reports/tree/master/templates. Notebooks are useful for mocking things up and for figuring out and sharing how to do things, but I always convert them to standalone scripts that can run at the command line before I run them for real.

bsnchan commented 3 years ago

/assign @bsnchan

geekygirldawn commented 3 years ago

@jberkus I think you mentioned some concerns with the existing Augur front end. I wanted to let you know that the Augur team is working on a new front end using twitter/bootstrap that caches results on the server and will be considerably faster. They expect the beta to be released in the next few weeks.

jberkus commented 3 years ago

Not so much concerns as "I can't build dashboards with this."

bsnchan commented 3 years ago

@jberkus Any luck with your Augur dev environment by any chance?

vaikas commented 3 years ago

Some upstream issues to get fixed before we can use this.

jberkus commented 3 years ago

Yeah, if anyone is watching the Augur issue stream, they can basically track my work :-b

vaikas commented 2 years ago

@jberkus is the plan for this round of SC elections still to use teststats.cncf.io?

csantanapr commented 2 years ago

/unassign @bsnchan

geekygirldawn commented 2 years ago

As part of the move to the CNCF, Knative stats should become part of the regular CNCF devstats instance, so this becomes less of an issue, I think. @jberkus probably knows more about the process for how / when the data will be populated in devstats.

jberkus commented 2 years ago

It'll get populated within a week of acceptance. We'll need to, ourselves, audit company affiliations and send in a PR to bring them up to date.

Given that, let's close this issue.

/close

knative-prow-robot commented 2 years ago

@jberkus: Closing this issue.

csantanapr commented 2 years ago

FYI: Now that Knative has moved to the CNCF, the stats are located at https://knative.devstats.cncf.io/