chaoss / community

This is the main CHAOSS community repository. Feel free to open an issue to discuss a topic of community interest! This repository also holds governance, mentorship, and other community-related documentation.
MIT License

GSoC Idea: Implement Conversion Rate Metric in CHAOSS Software #305

Closed sgoggins closed 2 years ago

sgoggins commented 2 years ago

Conversion Rate

Question: What are the rates at which new contributors become more sustained contributors?

Microtasks

To become familiar with Augur, you can start by reading some documentation. You can find useful information in the links below. Grimoirelab also has a set of installation instructions and documentation here: https://chaoss.github.io/grimoirelab-tutorial/

GSoC Students (Primarily, though contributions welcome from Outreachy):

Once you're familiar with Augur, you can have a look at the following microtasks.

Microtask 0: Download and configure Augur or Grimoirelab and create a dev environment. For Augur, follow the general cautions noted here: https://oss-augur.readthedocs.io/en/dev/getting-started/installation.html and the full documentation here: https://oss-augur.readthedocs.io/en/dev/development-guide/toc.html For Grimoirelab, see: https://chaoss.github.io/grimoirelab-tutorial/

Microtask 1: Work on any Augur or Grimoirelab issue that's open.

Microtask 2: Identify new issues you encounter during installation.

Microtask 3: Explore the data presently captured and develop an experimental visualization using tools of your choice. If you use Jupyter Notebooks against an Augur database or API endpoint collection, use https://github.com/chaoss/augur-community-reports for development.

Microtask 4: Anything you want to show us. Even if you find bugs in our documentation and want to issue a PR for those!

Outreachy Candidates:

The Conversion Rate Metric is not yet defined, and would be defined here: https://github.com/chaoss/wg-metrics-models .

It would be implemented under this working group, so you can follow a pattern of implementation description, using language from the issue itself. https://github.com/chaoss/wg-metrics-models/tree/main/focus-areas/community-engagement

An example of a current metrics model definition is here: https://github.com/chaoss/wg-metrics-models/blob/main/focus-areas/development/metric-model-issue-handling.md

An example of a current metrics model implementation is here, using Jupyter Notebooks, and may also be an inspiration: https://github.com/chaoss/wg-metrics-models/tree/main/implementations/community-welcomingness

Description

The conversion rate metric is primarily aimed at identifying how new community members become more sustained contributors over time. However, the conversion rate metric can also help understand the changing roles of contributors, how a community is growing or declining, and paths to maintainership within an open source community.
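As a rough illustration of the description above, here is a minimal sketch of one possible conversion-rate calculation. It is not the CHAOSS definition (the issue notes the metric is not yet defined). It assumes a newcomer is someone whose first recorded contribution falls inside a cohort window, and that "sustained" means activity in at least `min_active_months` distinct calendar months. The function name, the event format, and the threshold are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def conversion_rate(events, cohort_start, cohort_end, min_active_months=3):
    """Share of newcomers in a cohort window who became sustained contributors.

    events: iterable of (contributor, date) pairs, one per contribution.
    A contributor is a newcomer if their first event falls in the window.
    A newcomer counts as converted if they were active in at least
    min_active_months distinct calendar months (an assumed threshold).
    """
    months_active = defaultdict(set)
    first_seen = {}
    for who, day in events:
        months_active[who].add((day.year, day.month))
        if who not in first_seen or day < first_seen[who]:
            first_seen[who] = day

    newcomers = {w for w, d in first_seen.items() if cohort_start <= d <= cohort_end}
    converted = {w for w in newcomers if len(months_active[w]) >= min_active_months}
    rate = len(converted) / len(newcomers) if newcomers else 0.0
    return rate, newcomers, converted


# Made-up example data: "cy" first appeared before the window, so is not a newcomer.
example_events = [
    ("ana", date(2022, 1, 5)),
    ("ana", date(2022, 2, 10)),
    ("ana", date(2022, 3, 1)),
    ("bo", date(2022, 1, 20)),
    ("cy", date(2021, 11, 2)),
    ("cy", date(2022, 1, 3)),
]
rate, newcomers, converted = conversion_rate(
    example_events, date(2022, 1, 1), date(2022, 1, 31)
)
print(rate, sorted(newcomers), sorted(converted))
```

In practice the events would come from an Augur or GrimoireLab data source rather than a hand-written list, and the threshold would follow whatever the working group settles on.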

Objectives (why)

Implementation

This project could be implemented using either the CHAOSS/Augur, or CHAOSS/Grimoirelab (including stack components noted in references) technology stacks.

The aims of the project are as follows:

Filters (optional)

Visualizations

gsoc-1

Source: https://chaoss.github.io/grimoirelab-sigils/assets/images/screenshots/sigils/overall-community-structure.png

gsoc-2

Source: https://opensource.com/sites/default/files/uploads/2021-09-15-developer-level-02.png

Tools Providing the Metric

Data Collection Strategies

The following is an example from the openEuler community:

gsoc-3

Definition:

References

purna135 commented 2 years ago

Hello, @sgoggins! I'd like to contribute to the project, but I can't seem to find any microtasks that will allow me to do so. Kindly point me in the right direction.

TieWay59 commented 2 years ago

Hi, @sgoggins. My name is Taiwei Wu (also fine to call me Tieway) and I am a graduate student from Shanghai, China. Lately, I've been running grimoirelab examples in my own server with my friends and schoolmates. I don't have enough knowledge of Augur or openEuler Infra, so I am still not sure which one could be best. I contacted some of the matrix contributors (Yehui and Xiaoya) for model details, and I have got enough programming skills in Python. I'd like to dig into this challenge and contribute to CHAOSS.

sgoggins commented 2 years ago

Hi @TieWay59 : If you want to do this for the Google Summer of Code, then all you need to do is complete the microtasks and proposals! Same for you @purna135 !!

eyehwan commented 2 years ago

@purna135 @TieWay59 It's great that you are showing interest in this task. If you have any questions, just post them here. I will try to give you as much support as I can.

purna135 commented 2 years ago

hello @eyehwan and @sgoggins, Thank you very much for your guidance. I'm looking for advice on:

  1. Which Tech Stack should we use?
  2. What are the project's microtasks?

eyehwan commented 2 years ago

@purna135 Hi Purna, I prefer Grimoirelab, which I am more familiar with. That also means I could give more support from a technical point of view. But if you prefer Augur, that is fine, and @sgoggins would give you the support. About the project's microtasks, @sgoggins could you give some hints?

BR//Yehui.

purna135 commented 2 years ago

Thank you, @eyehwan; now we'll wait for @sgoggins.

mabelbot commented 2 years ago

Hi all! I’m Mabel, a prospective MS CS student based in California. I have worked on full stack web application development, API design, and machine learning (mostly in Python), and am currently learning about software analytics. I would love to work with the CHAOSS community and contribute to implementing the conversion rate metric and more.

The documentation and idea description were really helpful in getting started! I already had GrimoireLab up and running, made some observations with the dashboard, and am looking at how a basic metric can be designed and implemented inside GrimoireLab.

SHIVANGISINGH1 commented 2 years ago

@sgoggins Hi I am Shivangi! I am passionate about this project and I started going through the documentation and I am having a few doubts.

Thanks

TieWay59 commented 2 years ago

Hello, @sgoggins and @vchrombie, I'm trying to read the actual code of the whole architecture of grimoirelab. This really slows me down because I can't find any document explaining which repo invokes the others. I have gone through the grimoirelab docs and figured out that it's grimoirelab-sirmordred which actually holds the whole program.

Now my questions can be summed up as follows:

Plz feel free to correct me if I made mistakes, and your kind further information will help me quite a lot! ♥

vchrombie commented 2 years ago

Hi @TieWay59, thanks for showing your interest in this GSoC idea and GrimoireLab.

I'm trying to read the actual code of the whole architecture of grimoirelab, this really slows me down because I can't find some kind of document explaining which repo invokes the others.

You can read about the different components in the GrimoireLab toolset in the docs/components/workflow/. You can also learn the step by step scenario of how GrimoireLab works from docs/components/scenarios/. Please let us know if you need more help.

I have gone through the grimoirelab docs and I figure out that it's grimoirelab-sirmordred which actually holds the whole program.

Yes, you are right in a way. SirMordred orchestrates the whole process (collection, enrichment, identities, panels, etc.)

* How to set up a local dev-env of grimoirelab? (AFAIK, the main repo of grimoirelab stores the configs, and the project itself is pulled from docker hub which contains images of other sub-repos. And yes, It's very nice to use the docker-compose for exploration, I've done it long ago.)

Yes, you are right. The docker-compose configurations are present in the main repo. If you just want to use GrimoireLab as a user, you can use the docker-compose method, which is pretty simple and quick.

* Is it correct (for local dev) to set up grimoirelab-sirmordred, which requires me to clone most other sub-repo into some folders (according to [Getting-Started.md](https://github.com/chaoss/grimoirelab-sirmordred/blob/master/Getting-Started.md))

You are right, if you are looking to set up GrimoireLab as a developer, you need to check the Getting-Started.md and follow the steps accordingly.

TieWay59 commented 2 years ago

Thanks for your quick response @vchrombie ♥.

eyehwan commented 2 years ago

Hi Mabel, Conversion rate has been added as a CHAOSS metric: https://github.com/chaoss/wg-evolution/blob/main/focus-areas/community-growth/conversion-rate.md. It includes some definitions of the different roles used in conversion rate.

BR//Yehui

On Thu, Apr 7, 2022 at 4:22 PM Mabel @.***> wrote:

Hi @sgoggins https://github.com/sgoggins or @germonprez https://github.com/germonprez - I'm seeking some clarification on the statement above in order to flesh out the conversion rate calculation: "Identify casual, regular, and core contributors" - specifically the definition of "casual" contributors for a typical open source software community? I don't find it in the metrics definitions although I did find "occasional contributors". Is "casual" the same as "occasional" (as seen in the Occasional Contributors metric)?

Also, how about the differences between "regular" vs. "sustained" vs. "repeat" (I saw this term in the community reports) definitions? How much do they overlap? My thinking is regular and repeat are similar in definition, but I need some more info on the difference between regular & sustained. I'm also thinking core contributors might be by definition always considered sustained.

Thank you!


mabelbot commented 2 years ago

Hi Yehui, let me clarify my question. When I read the conversion-rate.md at the beginning of March, I saw in the Objectives section it states "Identify casual, regular, and core contributors". Since the Onion metric can already "Identify casual, regular, and core contributors" (https://chaoss.github.io/grimoirelab-sigils/common/onion_analysis/), I would like some clarification on how come "Identify casual, regular, and core contributors" is also part of conversion rate calculation? Thanks!

eyehwan commented 2 years ago

@mabelbot Hi Mabel, Onion and conversion rate are related, but not exactly the same. Onion describes the current distribution of contributors by contribution amount; the percentage of each role is used to evaluate the robustness of the community. One example is bus factor: https://github.com/chaoss/wg-risk/blob/main/focus-areas/business-risk/bus-factor.md. There are also two papers (related to the Pareto principle):

  1. Revisiting the applicability of the pareto principle to core development teams in open source software projects
  2. A Health Index of Open Source Projects Focusing on Pareto Distribution of Developer's Contribution
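To make the onion idea concrete, here is a minimal sketch of a contribution-share classification. It assumes an 80/15/5 split: contributors are ranked by contribution count, those covering the first 80% of activity are labelled core, the next 15% regular, and the rest casual. The thresholds are parameters and should be checked against the GrimoireLab Sigils documentation linked earlier. The function name and input format are hypothetical.

```python
def onion_roles(counts, core_pct=80, regular_pct=95):
    """Label contributors as core, regular, or casual by cumulative share.

    counts: dict mapping contributor -> number of contributions.
    Contributors are ranked by count (ties broken by name). A contributor's
    role depends on the share of total activity already covered by those
    ranked ahead of them. Integer percentages avoid float rounding issues.
    """
    total = sum(counts.values())
    roles = {}
    cumulative = 0
    for who, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if cumulative * 100 < core_pct * total:
            roles[who] = "core"
        elif cumulative * 100 < regular_pct * total:
            roles[who] = "regular"
        else:
            roles[who] = "casual"
        cumulative += n
    return roles


# Made-up commit counts for four contributors.
example_counts = {"ana": 60, "bo": 25, "cy": 10, "di": 5}
print(onion_roles(example_counts))
```

With the example counts, the two most active contributors together pass the 80% mark, so both are labelled core; the boundary rule (whether the contributor who crosses a threshold belongs to the inner or outer ring) is a design choice made here, not something the thread specifies.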

Conversion rate is used to describe the community's ability to attract and retain contributors. It is very common for people to join a community because of some event and then disappear within a month, like running water in a stream. How can running water be stored in a reservoir? Conversion rate is a proxy metric for that capability. There are also some papers related to it, for example: Sustainability of Open Source software communities beyond a fork: How and why has the LibreOffice project evolved?
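One way to connect role distribution with retention is to compare role labels from two consecutive periods and count how contributors moved between them. The sketch below is only an illustration under that assumption. The role names, the "absent" placeholder, and both function names are hypothetical and are not taken from the CHAOSS metric definition.

```python
from collections import Counter


def role_transitions(before, after, absent="absent"):
    """Count (role_before, role_after) pairs across two periods.

    before, after: dicts mapping contributor -> role label for each period.
    Contributors missing from a period are given the `absent` label.
    """
    people = set(before) | set(after)
    return Counter((before.get(p, absent), after.get(p, absent)) for p in people)


def retained_share(before, after):
    """Fraction of first-period contributors still present in the second."""
    if not before:
        return 0.0
    return sum(1 for p in before if p in after) / len(before)


# Made-up role labels for two consecutive periods.
period_1 = {"ana": "casual", "bo": "casual", "cy": "regular"}
period_2 = {"ana": "regular", "cy": "core", "di": "casual"}

transitions = role_transitions(period_1, period_2)
print(transitions[("casual", "regular")])   # contributors who moved up a ring
print(transitions[("casual", "absent")])    # contributors who left
print(retained_share(period_1, period_2))
```

Dividing a transition count by the number of contributors who started in that role would give a per-role conversion rate.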

Hoping it will help.

BR//Yehui.

ElizabethN commented 2 years ago

@sgoggins am I ok to close this issue now that the GSOC deadline has passed?

vchrombie commented 2 years ago

Congrats to @mabelbot @TieWay59 for getting selected for this project! :tada:

We appreciate the efforts of the other applicants too, who worked really hard on the applications and microtasks. We would love to see you around, contributing to CHAOSS. Please feel free to reach out to us in case you have any concerns.