protontypes / LibreSelery

Continuous distribution of funding to your project contributors and dependencies. Integrated into GitHub Actions
GNU Affero General Public License v3.0

Contribution Distribution Engine (CDE) #164

Open kikass13 opened 4 years ago

kikass13 commented 4 years ago

After our latest meeting, I was playing around with the future weighting stuff and some random git helper scripts built around git blame.

I will continue the stuff I have visualized in my newest fancy draft - it's not really representing anything but it should show my intentions and definitions while moving forward.

SORRY FOR SPELLING MISTAKES IN THIS TEXT, IT's 3 am DAMMIT! :D

[Image: contribution_types_domains_draft]

following definitions were used:

how does it work:

  1. ProjectOwner (PO) defines selery.yml by defining contribution domains and their weights. He also has to configure which contribution type/metric should be applied and configure each one individually.
  2. He could, for example, create a domain called "Documentation" with the "Files" type and tell the engine to apply a weight to every contributor who works on *.md files within the repository, let's say 20 people.
  3. After a successful pull request, we know each specific contributor to that "Documentation" domain (we also know how much each one has contributed and when) and have to figure out how much each of those 20 contributors is eligible to "earn" in relation to the others (inside that specific domain). That's where the PO has to define the metrics applied.
  4. For example, our PO could configure a metric which would give the highest payout/probability to the contributors who wrote the most stuff (added the most lines). The metric will therefore change the weights from an equal distribution (1/20 for every contributor) to another distribution based on the lines written by each contributor.
  5. Hurrah! We can now use our new weights to pay out our contributors as normal.
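The reweighting in steps 4 and 5 can be sketched roughly as follows. This is only an illustration, not the engine's actual code: the function names and the sample contributors are made up, and the metric shown is the "most lines added" one from step 4.

```python
def uniform_weights(contributors):
    """Equal share for every contributor in the domain (1/n each)."""
    if not contributors:
        return {}
    share = 1.0 / len(contributors)
    return {name: share for name in contributors}


def lines_added_weights(lines_added):
    """Share proportional to the number of lines each contributor added."""
    total = sum(lines_added.values())
    if total == 0:
        # Nobody added anything: fall back to the equal distribution.
        return uniform_weights(list(lines_added))
    return {name: count / total for name, count in lines_added.items()}


# Hypothetical "Documentation" domain with three contributors.
docs_lines = {"alice": 60, "bob": 30, "carol": 10}
print(uniform_weights(list(docs_lines)))
print(lines_added_weights(docs_lines))
```

The resulting weights sum to 1 within the domain, so they could then be scaled by whatever weight the PO assigned to the domain itself.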
kikass13 commented 4 years ago

Here's my first example of file specific info gathering. I am using git blame to extract all the (hopefully) useful information of local files and their "touches" (aka who changed how many lines at which point in time). My script + git blame outputs the following:

Here's my example:

git ls-files | while IFS= read -r f; do printf '\n%s\n' "$f"; git blame -CCC --line-porcelain -- "$f" | tests/pythonGitHelper.py; done

and here's the output


```
.git-blame-ignore-revs
  Arne Döring [2]
    -- 2020-08-14/16:37:25 [2]
.github/FUNDING.yml
  Tobias Augspurger [3]
    -- 2020-08-20/15:19:05 [1]
    -- 2020-08-20/14:56:17 [2]
.github/workflows/black.yml
  Arne Döring [11]
    -- 2020-08-14/16:08:49 [1]
    -- 2020-08-14/14:34:01 [10]
.github/workflows/seleryaction.yml
  Tobias Augspurger [44]
    -- 2020-02-08/11:23:48 [7]
    -- 2020-08-22/18:35:19 [1]
    -- 2020-07-16/16:10:23 [4]
    -- 2020-08-22/18:40:54 [2]
    -- 2020-02-08/11:20:39 [6]
    -- 2020-08-20/12:01:43 [2]
    -- 2020-08-11/22:29:39 [7]
    -- 2020-08-13/18:20:38 [8]
    -- 2020-02-08/11:20:04 [3]
    -- 2020-08-19/09:26:19 [3]
    -- 2020-08-22/18:48:52 [1]
  T0b14s Augspurger [71]
    -- 2020-07-23/15:59:06 [2]
    -- 2020-07-22/14:16:37 [1]
    -- 2020-02-08/11:23:48 [2]
    -- 2020-07-17/17:48:09 [1]
    -- 2020-02-08/11:20:39 [3]
    -- 2020-03-28/08:33:31 [9]
    -- 2020-07-23/14:14:00 [9]
    -- 2020-07-23/16:12:10 [3]
    -- 2020-07-18/09:12:25 [29]
    -- 2020-07-23/17:30:09 [1]
    -- 2020-03-08/18:35:34 [10]
    -- 2020-07-23/16:19:38 [1]
  johannes karoff [1]
    -- 2020-07-31/16:55:59 [1]
.gitignore
  Arne Döring [3]
    -- 2020-08-17/15:52:19 [3]
  Nick Fiege [3]
    -- 2020-02-08/11:23:48 [3]
  Tobias Augspurger [120]
    -- 2020-03-15/09:39:56 [3]
    -- 2020-02-08/11:20:39 [3]
    -- 2020-02-08/11:23:48 [1]
    -- 2020-02-29/09:45:42 [1]
    -- 2020-08-19/18:01:33 [3]
    -- 2020-02-10/22:39:29 [1]
    -- 2020-02-08/11:20:04 [108]
  johannes karoff [1]
    -- 2020-07-22/16:24:14 [1]
Dockerfile
  T0b14s Augspurger [3]
    -- 2020-02-28/22:24:07 [1]
    -- 2020-02-28/21:43:44 [2]
  Tobias Augspurger [25]
    -- 2020-02-24/22:10:17 [4]
    -- 2020-02-28/23:47:20 [2]
    -- 2020-02-08/11:20:04 [14]
    -- 2020-02-08/11:20:39 [5]
  kikass13 [20]
    -- 2020-08-06/22:38:34 [20]
Gemfile
  Hendrik Radke [1]
    -- 2020-03-21/15:52:01 [1]
  Tobias Augspurger [2]
    -- 2020-02-08/11:20:04 [1]
    -- 2020-02-24/22:10:17 [1]
LICENSE
  Tobias Augspurger [661]
    -- 2020-02-08/11:20:04 [661]
README.md
  Arne Döring [6]
    -- 2020-08-16/19:19:06 [6]
  T0b14s Augspurger [12]
    -- 2020-03-22/08:40:18 [2]
    -- 2020-02-16/14:02:36 [1]
    -- 2020-07-03/00:02:46 [1]
    -- 2020-03-21/09:04:05 [1]
    -- 2020-02-08/11:20:39 [3]
    -- 2020-07-26/18:57:52 [2]
    -- 2020-02-24/20:33:17 [2]
  Hendrik Radke [1]
    -- 2020-03-21/15:52:01 [1]
  Felix Dietze [29]
    -- 2020-08-21/19:51:51 [29]
  Tobias Augspurger [122]
    -- 2020-08-14/17:05:22 [2]
    -- 2020-08-17/08:38:30 [1]
    -- 2020-08-14/20:27:52 [3]
    -- 2020-08-22/08:50:51 [11]
    -- 2020-02-08/11:20:04 [3]
    -- 2020-08-14/15:01:23 [3]
    -- 2020-08-06/10:57:33 [1]
    -- 2020-08-22/18:25:36 [1]
    -- 2020-08-15/10:33:53 [15]
    -- 2020-08-16/12:32:28 [7]
    -- 2020-02-24/22:29:28 [1]
    -- 2020-02-08/11:23:48 [2]
    -- 2020-08-14/23:22:56 [1]
    -- 2020-08-16/11:23:35 [9]
    -- 2020-08-16/11:00:25 [9]
    -- 2020-08-13/14:59:30 [1]
    -- 2020-08-22/18:10:53 [3]
    -- 2020-08-19/17:46:07 [22]
    -- 2020-08-15/12:25:25 [4]
    -- 2020-08-21/12:31:53 [8]
    -- 2020-08-22/18:29:35 [1]
    -- 2020-03-14/13:28:49 [14]
build.sh
  kikass13 [1]
    -- 2020-02-08/11:23:48 [1]
  johannes karoff [1]
    -- 2020-07-31/16:54:50 [1]
docs/OpenSelery-04.png
Traceback (most recent call last):
  File "tests/pythonGitHelper.py", line 15, in <module>
    line = input()
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 349: invalid start byte
docs/selery_workflow.png
Traceback (most recent call last):
  File "tests/pythonGitHelper.py", line 15, in <module>
    line = input()
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 306: invalid start byte
openselery/__init__.py
openselery/coinbase_connector.py
  Tobias Augspurger [23]
    -- 2020-02-24/22:10:17 [1]
    -- 2020-02-08/11:20:04 [8]
    -- 2020-08-04/10:32:22 [3]
    -- 2020-02-08/11:23:48 [5]
    -- 2020-03-14/22:46:41 [3]
    -- 2020-07-23/17:24:15 [3]
  kikass13 [3]
    -- 2020-02-08/11:23:48 [3]
  Johnny CrckMc [4]
    -- 2020-02-08/11:20:38 [4]
  Arne Döring [15]
    -- 2020-02-08/11:23:48 [1]
    -- 2020-08-14/16:29:09 [14]
openselery/collection_utils.py
  johannes karoff [6]
    -- 2020-07-23/18:02:33 [6]
  Arne Döring [10]
    -- 2020-08-14/16:29:09 [10]
openselery/commandline.py
  kikass13 [4]
    -- 2020-02-08/11:23:48 [1]
    -- 2020-08-06/22:49:10 [3]
  Tobias Augspurger [9]
    -- 2020-08-12/00:14:44 [2]
    -- 2020-08-10/15:27:45 [1]
    -- 2020-02-24/22:10:17 [1]
    -- 2020-08-11/19:17:41 [1]
    -- 2020-08-10/13:34:45 [2]
    -- 2020-02-10/22:39:29 [2]
  johannes karoff [37]
    -- 2020-07-23/18:02:33 [3]
    -- 2020-07-31/17:18:18 [8]
    -- 2020-07-31/16:48:13 [26]
  Arne Döring [77]
    -- 2020-02-08/11:23:48 [9]
    -- 2020-08-14/16:29:09 [64]
    -- 2020-08-17/15:52:19 [4]
openselery/commit_identifier.py
  johannes karoff [40]
    -- 2020-08-19/17:33:24 [40]
openselery/configuration.py
  Nick Fiege [5]
    -- 2020-02-08/11:23:48 [5]
  Arne Döring [7]
    -- 2020-08-14/16:29:09 [6]
    -- 2020-02-08/11:23:48 [1]
  Tobias Augspurger [45]
    -- 2020-08-20/14:01:56 [45]
  johannes karoff [24]
    -- 2020-08-19/17:33:24 [1]
    -- 2020-07-31/16:48:13 [19]
    -- 2020-07-31/17:18:18 [4]
  kikass13 [47]
    -- 2020-08-17/22:20:38 [31]
    -- 2020-02-08/11:23:48 [7]
    -- 2020-08-18/18:06:09 [9]
openselery/git_utils.py
  johannes karoff [11]
    -- 2020-08-19/17:33:24 [11]
  Tobias Augspurger [39]
    -- 2020-03-15/11:48:26 [1]
    -- 2020-08-20/14:01:56 [8]
    -- 2020-02-08/11:23:48 [19]
    -- 2020-02-29/10:14:33 [11]
  Arne Döring [4]
    -- 2020-08-14/16:29:09 [4]
openselery/github_connector.py
  Tobias Augspurger [24]
    -- 2020-02-08/11:23:48 [7]
    -- 2020-03-08/20:29:37 [9]
    -- 2020-03-08/21:15:20 [1]
    -- 2020-08-05/13:37:11 [3]
    -- 2020-03-08/21:03:24 [3]
    -- 2020-02-08/11:20:04 [1]
  johannes karoff [5]
    -- 2020-07-30/13:41:13 [5]
  Arne Döring [29]
    -- 2020-02-08/11:23:48 [16]
    -- 2020-02-09/03:32:20 [9]
    -- 2020-08-14/16:29:09 [4]
  kikass13 [27]
    -- 2020-02-08/11:23:48 [27]
  Johnny CrckMc [1]
    -- 2020-02-08/11:20:38 [1]
  Nick Fiege [12]
    -- 2020-02-08/11:23:48 [12]
openselery/librariesio_connector.py
  kikass13 [47]
    -- 2020-02-08/11:23:48 [47]
  Nick Fiege [5]
    -- 2020-02-08/11:23:48 [5]
  Tobias Augspurger [4]
    -- 2020-02-08/11:20:04 [4]
  Not Committed Yet [1]
    -- 2020-09-08/02:01:06 [1]
  Arne Döring [40]
    -- 2020-02-08/11:23:48 [23]
    -- 2020-08-14/16:29:09 [17]
openselery/openselery.py
  Nick Fiege [40]
    -- 2020-02-08/11:23:48 [40]
  kikass13 [88]
    -- 2020-08-18/18:06:09 [4]
    -- 2020-08-06/22:49:10 [36]
    -- 2020-02-08/11:23:48 [48]
  johannes karoff [26]
    -- 2020-07-31/15:31:50 [1]
    -- 2020-07-23/18:02:33 [2]
    -- 2020-07-30/13:41:13 [1]
    -- 2020-07-31/16:48:13 [5]
    -- 2020-08-19/17:33:24 [15]
    -- 2020-08-13/19:39:00 [2]
  T0b14s Augspurger [3]
    -- 2020-02-24/20:33:17 [3]
  Arne Döring [224]
    -- 2020-02-09/03:32:20 [1]
    -- 2020-08-17/16:02:01 [7]
    -- 2020-08-14/16:29:09 [190]
    -- 2020-02-08/11:23:48 [21]
    -- 2020-08-17/15:52:19 [5]
  Tobias Augspurger [248]
    -- 2020-03-08/18:29:20 [2]
    -- 2020-07-23/15:45:31 [1]
    -- 2020-02-29/10:14:33 [1]
    -- 2020-03-15/09:39:56 [2]
    -- 2020-08-22/18:25:36 [21]
    -- 2020-08-10/00:52:14 [18]
    -- 2020-08-04/11:25:56 [1]
    -- 2020-08-15/08:54:03 [1]
    -- 2020-08-10/15:27:45 [3]
    -- 2020-08-19/17:46:07 [1]
    -- 2020-08-20/16:55:02 [1]
    -- 2020-08-10/09:42:57 [16]
    -- 2020-08-04/10:32:22 [11]
    -- 2020-02-10/22:39:29 [4]
    -- 2020-08-13/12:10:24 [7]
    -- 2020-02-16/19:58:40 [7]
    -- 2020-08-22/11:35:18 [10]
    -- 2020-08-20/14:01:56 [10]
    -- 2020-08-22/18:10:53 [1]
    -- 2020-03-15/20:28:36 [5]
    -- 2020-08-09/12:38:54 [8]
    -- 2020-02-08/11:23:48 [16]
    -- 2020-03-02/19:51:12 [1]
    -- 2020-08-12/15:41:36 [11]
    -- 2020-02-16/16:50:11 [3]
    -- 2020-07-17/15:08:11 [7]
    -- 2020-08-11/19:17:41 [5]
    -- 2020-08-09/12:39:21 [5]
    -- 2020-03-14/23:12:12 [4]
    -- 2020-08-17/14:13:58 [9]
    -- 2020-08-20/15:35:58 [9]
    -- 2020-02-25/23:47:31 [7]
    -- 2020-08-10/13:34:45 [10]
    -- 2020-08-09/23:36:16 [8]
    -- 2020-07-22/13:37:33 [2]
    -- 2020-07-23/17:24:15 [1]
    -- 2020-02-24/22:10:17 [4]
    -- 2020-03-20/21:00:45 [4]
    -- 2020-03-20/22:16:19 [1]
    -- 2020-08-10/16:33:05 [1]
    -- 2020-07-15/20:53:38 [1]
    -- 2020-08-15/09:39:46 [2]
    -- 2020-08-20/15:42:47 [1]
    -- 2020-03-15/11:48:26 [4]
    -- 2020-08-19/21:44:53 [1]
  Not Committed Yet [10]
    -- 2020-09-08/02:01:06 [10]
openselery/os_utils.py
  Arne Döring [8]
    -- 2020-08-14/16:29:09 [8]
  kikass13 [13]
    -- 2020-08-06/22:40:44 [13]
openselery/ruby_extensions/scan.rb
  Tobias Augspurger [16]
    -- 2020-02-08/11:20:04 [16]
  Hendrik Radke [1]
    -- 2020-03-21/15:52:01 [1]
openselery/selery_utils.py
  Arne Döring [18]
    -- 2020-02-08/11:23:48 [7]
    -- 2020-08-14/16:29:09 [11]
  Felix Dietze [2]
    -- 2020-02-08/11:20:39 [2]
  Nick Fiege [46]
    -- 2020-02-08/11:23:48 [46]
  Tobias Augspurger [6]
    -- 2020-02-08/11:23:48 [1]
    -- 2020-02-08/11:20:04 [1]
    -- 2020-08-09/12:38:54 [4]
  kikass13 [20]
    -- 2020-02-08/11:23:48 [20]
openselery/visualization.py
  Arne Döring [112]
    -- 2020-08-14/16:29:09 [112]
  Tobias Augspurger [2]
    -- 2020-08-05/13:37:11 [1]
    -- 2020-08-05/23:09:25 [1]
  johannes karoff [74]
    -- 2020-07-23/18:02:33 [1]
    -- 2020-07-23/17:22:46 [19]
    -- 2020-08-13/19:39:00 [54]
  kikass13 [22]
    -- 2020-08-06/22:43:58 [22]
run.sh
  johannes karoff [1]
    -- 2020-07-31/16:54:50 [1]
  Tobias Augspurger [12]
    -- 2020-02-08/11:20:04 [4]
    -- 2020-02-10/22:39:29 [1]
    -- 2020-03-15/20:28:36 [3]
    -- 2020-02-08/11:20:39 [1]
    -- 2020-08-22/18:10:53 [1]
    -- 2020-02-08/11:23:48 [1]
    -- 2020-03-20/22:16:19 [1]
  kikass13 [22]
    -- 2020-08-06/22:38:34 [19]
    -- 2020-02-08/11:23:48 [3]
scripts/selery
  Tobias Augspurger [1]
    -- 2020-02-08/11:20:04 [1]
  johannes karoff [2]
    -- 2020-08-13/19:38:05 [1]
    -- 2020-07-31/16:48:13 [1]
  Arne Döring [2]
    -- 2020-08-17/15:52:19 [1]
    -- 2020-02-08/11:23:48 [1]
  Nick Fiege [1]
    -- 2020-02-08/11:23:48 [1]
selery.yml
  Tobias Augspurger [37]
    -- 2020-08-15/08:54:03 [3]
    -- 2020-02-10/21:08:47 [11]
    -- 2020-08-22/11:35:18 [9]
    -- 2020-02-10/22:39:29 [1]
    -- 2020-08-22/11:46:21 [1]
    -- 2020-08-11/19:17:41 [5]
    -- 2020-03-08/18:29:20 [1]
    -- 2020-08-10/15:27:45 [2]
    -- 2020-08-13/12:18:48 [1]
    -- 2020-08-10/13:34:45 [3]
  johannes karoff [11]
    -- 2020-07-31/15:31:50 [2]
    -- 2020-08-19/17:33:24 [9]
  Not Committed Yet [1]
    -- 2020-09-08/02:01:06 [1]
  Arne Döring [4]
    -- 2020-02-08/11:23:48 [3]
    -- 2020-08-16/19:19:06 [1]
setup.py
  kikass13 [10]
    -- 2020-08-06/15:12:21 [9]
    -- 2020-08-06/22:36:30 [1]
  Tobias Augspurger [7]
    -- 2020-08-10/22:49:50 [1]
    -- 2020-02-08/11:20:04 [5]
    -- 2020-08-17/09:14:32 [1]
  Arne Döring [25]
    -- 2020-08-14/16:29:09 [25]
tests/just_clone.py
  Arne Döring [14]
    -- 2020-08-14/16:29:09 [8]
    -- 2020-02-08/11:23:48 [6]
  Tobias Augspurger [18]
    -- 2020-02-08/11:20:04 [18]
tests/random_bibliothecary.py
  Arne Döring [46]
    -- 2020-08-14/16:29:09 [14]
    -- 2020-02-08/11:23:48 [32]
  Tobias Augspurger [43]
    -- 2020-02-08/11:20:04 [43]
tests/random_clone_docker.sh
  Tobias Augspurger [7]
    -- 2020-02-08/11:20:39 [3]
    -- 2020-02-08/11:20:04 [4]
```

For now the git blame pipeline fu**s up when dealing with binary files (which is not unheard of): the helper script fails to decode their content as UTF-8. And apparently, while testing this, I had uncommitted changes inside my directory, meh :)
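A rough sketch of how such a helper could count "touches" without tripping over binary content or local edits. This is not the actual tests/pythonGitHelper.py (which is not shown here); the function names are made up, and rendering timestamps in UTC is an assumption. It relies only on the `git blame --line-porcelain` format, where each blamed line is preceded by `author` and `author-time` headers and the source line itself starts with a tab.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Author name git blame reports for lines with local, uncommitted changes.
UNCOMMITTED = "Not Committed Yet"


def decode_blame_output(raw: bytes) -> str:
    """Decode blame output without crashing on bytes that are not valid UTF-8."""
    return raw.decode("utf-8", errors="replace")


def count_blame_lines(porcelain: str, skip_uncommitted: bool = True) -> dict:
    """Return {author: {timestamp: line_count}} for --line-porcelain text."""
    counts = defaultdict(lambda: defaultdict(int))
    author, stamp = None, None
    for line in porcelain.split("\n"):
        if line.startswith("\t"):
            # A leading tab marks the blamed source line itself.
            if author is not None and not (skip_uncommitted and author == UNCOMMITTED):
                counts[author][stamp] += 1
        elif line.startswith("author "):
            author = line[len("author "):].strip()
        elif line.startswith("author-time "):
            seconds = int(line.split()[1])
            # Assumption: timestamps are rendered in UTC.
            stamp = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(
                "%Y-%m-%d/%H:%M:%S"
            )
    return {name: dict(touches) for name, touches in counts.items()}


# Hypothetical blame output: two lines by Alice (one with a non-UTF-8 byte)
# and one uncommitted line that gets skipped.
sample = (
    b"1111111111111111111111111111111111111111 1 1 2\n"
    b"author Alice\n"
    b"author-time 1597423045\n"
    b"\tfirst line\n"
    b"1111111111111111111111111111111111111111 2 2\n"
    b"author Alice\n"
    b"author-time 1597423045\n"
    b"\t\x89PNG not valid UTF-8\n"
    b"0000000000000000000000000000000000000000 3 3 1\n"
    b"author Not Committed Yet\n"
    b"author-time 1599530466\n"
    b"\tlocal edit\n"
)
print(count_blame_lines(decode_blame_output(sample)))
```

To use it in the pipeline above, the script would read `sys.stdin.buffer.read()` and pass the bytes through `decode_blame_output` instead of calling `input()`, which is where the traceback originates.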

Ly0n commented 4 years ago

Nice @kikass13 I think it is a really good approach. I went through the git blame and it is indeed a good indicator.

will replace/takeover the current gather/weight/split functionalities based on a dynamic / freely configurable framework

When you replace / take over the existing architecture, try to keep the existing functionality or even enhance it. The uniform weights and activity weights are quite important even if they are not that complex. Today I will start building a demo script to get into the coordination weights. I think the names "file weights" and "coordination weights" are quite good. @krux02 @cornerman @fdietze What is your opinion?

kikass13 commented 4 years ago

@Ly0n well, these are not "weights" per se. They are just classifiers that someone needs in order to configure what they want to express ...

to make my thought process clear:

As you can see, I am a little confused about how metrics play their role here. I don't really know (right now I don't even have the slightest clue) how we will configure, declare & apply metrics to a contribution domain. If someone has an idea, please give me some insight.

kikass13 commented 4 years ago

Disclaimer

To document what I did the last two days, here's a diagram depicting the data flow: I will describe the image below ... just for curious people, flow starts at the top left side ;)

[Image: CDE]

What happens:

... here is where the config fun starts ...

... what happens with all this? ...

... now the CDE is ready and can finally start working? ...

Done

Any questions? No? I'm going to bed now!

yarikoptic commented 4 years ago

I am wondering about one additional aspect: historical perspective. The initial figure shows use of git log, but I have not spotted it in further discussions. I think it may be useful to add a "historical decay rule" (also configurable: a faster decay coefficient would put the emphasis on the most recent states/contributions).

Then that "combined" split is what would be used to decide how and among whom to split the current funds allotment.
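One possible reading of such a decay rule, as a minimal sketch: each contribution is discounted exponentially by its age, and the discounted scores are normalized into a split. The function name, the half-life parameter, and the sample data are assumptions for illustration, not anything defined in the project.

```python
from datetime import datetime, timedelta, timezone


def decayed_split(contributions, now, half_life_days=90.0):
    """Turn (author, lines, timestamp) tuples into shares that favor recent work.

    A contribution that is `half_life_days` old counts half as much as a fresh
    one; a shorter half-life puts more emphasis on recent contributions.
    """
    scores = {}
    for author, lines, when in contributions:
        age_days = max((now - when).total_seconds() / 86400.0, 0.0)
        decay = 0.5 ** (age_days / half_life_days)
        scores[author] = scores.get(author, 0.0) + lines * decay
    total = sum(scores.values())
    if total == 0:
        return {}
    return {author: score / total for author, score in scores.items()}


# Hypothetical data: the same amount of work, one fresh and one 90 days old.
now = datetime(2020, 9, 8, tzinfo=timezone.utc)
history = [
    ("alice", 10, now),
    ("bob", 10, now - timedelta(days=90)),
]
print(decayed_split(history, now))
```

With the default half-life the older contribution counts half as much, so the split comes out at roughly two thirds to one third.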

kikass13 commented 4 years ago

general

@yarikoptic We talked about a concept (which I did also mention in my examples somewhere) called "time degradation". I guess it's the same thing you mean. I like the idea of including time (absolute and differential) as a means to empower "fresher" contributions.

Me and other folks talked a little bit about it in #132.

Regarding the current development

The example I coded (for the CDE), which is currently open for review and further improvements (see my fork here: https://github.com/kikass13/libreselery/tree/cd_engine), includes a plugin-based scoring system (small example) built on git blame.

It gathers

So with that plugin, it is technically possible to score newer contributions better than older ones. That's just an example though as the concept of "time" is a difficult one to configure properly.

post

In case you have any suggestions or want to help me put a little example of what you said into code, I would be happy to get some help <3

Ly0n commented 4 years ago

Meeting Note: We should rename the "Actions" in the image to "Activities" because "Actions" is already used by GitHub Actions.

kikass13 commented 4 years ago

Update from commit 31601a4:

I changed some of the internal stuff, but the most important thing is that there is now a plugin which does the same as the previous gather() and weight() functions. It is not identical and a lot of stuff is missing, but the flow works well now and can be altered to fit whatever was there before.

What's in there now:

kikass13 commented 4 years ago

In case you want to look into it (@fdietze @cornerman), my fork is here: https://github.com/kikass13/libreselery/tree/cd_engine

kikass13 commented 4 years ago

After the successful little meeting with @cornerman and @fdietze I changed some of the internal behavior and cleaned up the code. The main talking points were:

Domains

Plugins

Contributor Data

Engine

It was decided that all bold formatted points are relevant prior to a first PR. The last commits should address all of these "bold" points :)