blueridger commented 3 years ago

Background

The Stacks Advocates program uses SourceCred, which enables low-friction STX distribution to a broad set of Advocates community members and contributors based on their engagement. This distributes stakeholdership to create a more user-owned internet. SourceCred’s current core algorithm, a PageRank-based algorithm called CredRank, is very difficult for non-technical Advocates community members to analyze or interpret, so configuring the algorithm’s parameters is difficult and, in practice, undemocratic. This creates ambiguity about where people’s “cred” comes from and why STX is distributed the way it is, and it concentrates understanding among a few people, which excludes most users from governance of the STX distributions. CredRank is also very memory intensive, which causes performance and reliability issues for users, places scalability constraints on the Advocates program, and limits the use of SourceCred in other Stacks communities beyond the Advocates.

Project Overview

We have sketched out an alternative algorithm that we call ClearCred. It produces very similar results under the hood, but it is much more memory efficient and much more interpretable. By making the software’s decisions accessible to non-technical Advocates community members, this change will help STX distributions be more transparent, democratic, and ultimately user-owned. The new algorithm will also bring long-term benefits by making it easier to create new tools and integrations for Stacks on top of SourceCred, and by making it easier for other Stacks communities to onboard to SourceCred if desired.

Scope

The deliverable components are: productionizing the algorithm by wrapping it in an API and a CLI command, productionizing the Discord plugin implementation, and adding timestamps for time-awareness. In plain terms, the deliverable is the new algorithm running in place of the old one in the Advocates program’s existing usage of SourceCred. Success will look like positive user sentiment towards the new scores and broader user engagement in discussion of SourceCred weights and scores within the Stacks Advocates program. The initial sketch and some initial analysis are available here: https://github.com/sourcecred/sourcecred/pull/3232

Budget and Milestones

Total Grant Request: 3700 STX (37 Dev/UX hours)

M1: Technical Implementation of the new algorithm as a Discord-only prototype

Deliverable Benefits:
As an Advocates community member, I see a more understandable relationship between my engagement and my score, giving me confidence in the program and my value in it.
As a maintainer of the Advocates-SourceCred project, I can run the software faster and with fewer errors, so that I can spend more time on other projects.
As a Stacks developer, I can more easily create new analysis and governance tools for presenting and integrating SourceCred-provided data into Stacks governance processes.

Components: API, CLI, Discord Implementation, timestamping

20 dev hours, 2000 STX

M2: Observable Notebook MVP for end-user-friendly contribution/score analysis and weight configuration

Deliverable Benefits: As an Advocates community member, I can understand how scores are generated, so that I can give input about how to generate scores for STX distributions in the future.

Components: Displays scores for each user, time-sorted and ID-searchable contributions for each user, and human-readable piecewise breakdowns of the math equations that create the score for each contribution, with the weight configuration that was used.

10 dev hours, 1000 STX

M3: User Acceptance Testing

Deliverable Benefits: Confirmation that the changes have had the desired effect, and feedback on further changes that could be built.

Focus Questions:
Do the scores seem accurate to your perception of contribution within your community? More or less than before?
Does this provide more clarity on how cred is flowing?
Do you feel empowered/enabled to participate in the configuration of weights?

Goal: User Test with at least 2 non-technical Advocates community members.

7 UX hours, 700 STX

Team

The SourceCred Product Team, with Thena (blueridger) as lead. We are the subject matter experts on SourceCred-related product development and are responsible for virtually all 1st-party development right now.

Risks

We have largely de-risked this project with our initial code draft and analysis: we have working, proven code and simply need to productionize it. User testing may be hampered by users having little knowledge of the replaced solution, given its inaccessibility, which would make comparative feedback hard to provide. In that case, we will focus on non-comparative feedback.

Community and Supporting Materials

This project is informed by the SourceCred community’s internal usage of the software, our partnership with Stacks Advocates, and our experience building on SourceCred and supporting other communities that use it. It aligns well with what we have learned about applying SourceCred to governance, rewards, and community building, and it addresses many of the problems brought to us by all kinds of stakeholders and users. Over time, we will communicate about this project to the community using announcements, version release notes, documentation, and social media. We will prioritize getting community feedback at each step. We will coordinate with our partners and crossover members in the Stacks Advocates program to ensure effective communication.

stx-grant-bot[bot] commented 3 years ago

Thanks for submitting a grant proposal. Our team will review your submission and get back to you.

hozzjss commented 3 years ago

Super support this, as it will help our work in expanding decentralized work in the Stacks ecosystem.

jcnelson commented 3 years ago

So I read your PR to SourceCred. I guess my biggest question is, how do you know that this algorithm is better at assessing the true value that a contributor creates than CredRank? More generally, how do you know that the algorithm is any good at doing this assessment at all? The rationale you posted in the PR talks about the implementation, but doesn't seem to say anything about the validity of its calculations. Like, if we know out-of-band that Alice produced 10x more value than Bob, does this algorithm correctly determine this from the data (and/or does it do so better than CredRank)?

Also, maybe this is just my ignorance on the subject, but why is emojiWeight a criterion? Isn't that easy to game, especially if there's money to be made by gaming it?

blueridger commented 2 years ago

@jcnelson fair questions.

how do you know that this algorithm is better at assessing the true value that a contributor creates than CredRank? More generally, how do you know that the algorithm is any good at doing this assessment at all?

Right now, we aren't trying to make the new algorithm "better" at assessing true value. We are trying to keep par with the old algorithm's assessment and make performance/auditability better. As seen in the analysis I did on our own instance (https://github.com/sourcecred/sourcecred/pull/3232#issuecomment-949079112 and https://github.com/sourcecred/sourcecred/pull/3232#issuecomment-949058394), the difference in assessment between the two algorithms is actually very small considering it is a fundamental rewrite of the core logic. This is because the actual usage of CredRank in production instances has been converging towards more DAG-like arithmetic, so the new algorithm does what CredRank is being used to do, only more intentionally/directly. The overall pattern I see in the analysis is that inactive contributors are losing stake and active contributors are gaining stake. With the context I have, I believe this is actually a removal of inaccurate bias towards inactive contributors.
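One simple way to quantify a comparison like this is to normalize each algorithm's scores into stake shares and look at the per-participant change. The sketch below is only an illustration of that idea; the function names are hypothetical and it is not the method used in the linked analysis.

```python
def stake_shares(scores):
    """Convert raw scores into fractions of the total (shares sum to 1)."""
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: value / total for name, value in scores.items()}


def share_shift(old_scores, new_scores):
    """Return each participant's change in stake share (new minus old).

    Participants missing from one side are treated as having a zero share there.
    """
    old = stake_shares(old_scores)
    new = stake_shares(new_scores)
    names = sorted(set(old) | set(new))
    return {name: new.get(name, 0.0) - old.get(name, 0.0) for name in names}
```

A positive shift means the participant holds a larger share under the new scores, a negative shift a smaller one, and the shifts across all participants sum to zero.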

I believe in the future, this new algorithm will enable much more accurate valuation. CredRank forces plugins into generalizations that strip rich data of its nuance. The new algorithm invites plugins to add however much nuance is needed to accurately value contributions.

why is emojiWeight a criterion? Isn't that easy to game, especially if there's money to be made by gaming it?

The use of emojis as social signals indicating community valuation is already in place. As discussed above, the new algorithm is, for now, trying to improve performance, not change behavior.

The risk of gaming was a major skepticism that people had when SourceCred was introduced a few years ago. Today, several long-running instances (1Hive, MakerDAO, MetaGame, SourceCred) have shown us that gaming isn't the concept-smashing burden people feared it would be. There are a few major reasons for the demonstrated resilience:

SourceCred works in tandem with a community's own systems of accountability and moderation that can deal with gaming if it occurs.
SourceCred has resilience factors built in or configurable including the Recent grain policy's bias toward longer histories of cred, discord role weighting, disabled self-reaction cred, and manual audit checkpoints before on-chain execution.
Using peer social signals as our metric means that it's pretty obvious to the community when there is cred in the system that doesn't represent their social signaling, even at larger scales.
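To make two of the factors above concrete (role weighting and disabled self-reaction cred), here is a minimal sketch of how one message's score could be built from reactions. Every name and weight value in it is a made-up assumption for illustration; it is not SourceCred's actual code or configuration format.

```python
from dataclasses import dataclass

# Hypothetical weights, not taken from any real instance.
EMOJI_WEIGHTS = {"thumbsup": 1.0, "tada": 2.0}
ROLE_WEIGHTS = {"member": 1.0, "core": 2.0}


@dataclass(frozen=True)
class Reaction:
    reactor: str
    reactor_role: str
    emoji: str


def score_message(author, reactions):
    """Return (total, breakdown) for a single message.

    Each breakdown row is (reactor, emoji, emoji_weight, role_weight, points),
    so every point in the total can be traced back to one reaction.
    """
    breakdown = []
    for reaction in reactions:
        if reaction.reactor == author:
            continue  # self-reactions earn nothing
        # Unknown emojis or roles fall back to a weight of zero.
        emoji_weight = EMOJI_WEIGHTS.get(reaction.emoji, 0.0)
        role_weight = ROLE_WEIGHTS.get(reaction.reactor_role, 0.0)
        points = emoji_weight * role_weight
        breakdown.append(
            (reaction.reactor, reaction.emoji, emoji_weight, role_weight, points)
        )
    total = sum(row[-1] for row in breakdown)
    return total, breakdown
```

Because the total is a plain sum of per-reaction rows, a reviewer can see which reactions produced a score, which is the kind of manual audit described above.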

After a few years of testing in production, we've seen that there is enough resilience to deter gamers, as there have been few real gaming attempts across the ecosystem. In the end, we see the improved auditability (both manual and programmatic) of the new algorithm as providing a net improvement to gaming detection/deterrence.

benoxmo commented 2 years ago

Also, maybe this is just my ignorance on the subject, but why is emojiWeight a criterion? Isn't that easy to game, especially if there's money to be made by gaming it?

1Hive experienced this downside to a large extent. Their community was expanding fast, and it was getting harder to get the "value creation tracking" right. But they managed to mitigate it:

a trust level exists in the implementation: https://sourcecred.io/docs/concepts/trust_levels/
you can then define a period of time during which people at a low trust level are unable to mint cred, though they can still receive it
you can turn specific channels on or off in the computation
a narrow range of weights for emojis gives scores a safer (less amplified) distribution
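As a rough sketch of how those mitigations could combine when scoring a single reaction, assuming a time-based waiting period, a channel allow-list, and a clamped emoji weight (all names and values below are hypothetical, not SourceCred's real settings):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical settings for illustration only.
MINT_WAIT = timedelta(days=14)       # how long a new member waits before minting cred
ENABLED_CHANNELS = {"general", "dev"}
EMOJI_WEIGHT_RANGE = (0.5, 2.0)      # narrow range limits amplification


def clamp_weight(weight):
    """Force an emoji weight into the configured narrow range."""
    low, high = EMOJI_WEIGHT_RANGE
    return max(low, min(high, weight))


def reaction_points(channel, reactor_joined_at, reacted_at, raw_emoji_weight):
    """Points minted by one reaction after applying the three mitigations."""
    if channel not in ENABLED_CHANNELS:
        return 0.0  # channel is switched off in the computation
    if reacted_at - reactor_joined_at < MINT_WAIT:
        return 0.0  # reactor is too new to mint cred
    return clamp_weight(raw_emoji_weight)
```

Only the reactor's tenure is checked, so a new member who posts something valuable can still receive cred from established members.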

stx-grant-bot[bot] commented 2 years ago

Congratulations. Your grant is now approved. Please complete the on-boarding link here: https://stacks-grant.netlify.app/onboard?q=d06ad10623fb1c5973d68b4b016ebc81

jennymith commented 2 years ago

Hey @blueridger any updates to share on the progress of this grant?

blueridger commented 2 years ago

@jennymith Yeah sure! We are collaboratively coding on this branch: https://github.com/sourcecred/sourcecred/pull/3273

The core algorithm is done, with even more features than anticipated. The unit tests for the core algorithm are in progress / in review. The Discord implementation probably should have been its own milestone, but it's pretty much code-complete and just needs manual testing.

I also have a meeting this week with our UX designer to prepare for M2 and M3.

jennymith commented 2 years ago

Gotcha, thanks for the update @blueridger! Let me know when the tests have been completed and you're ready to move on to M2.

blueridger commented 2 years ago

M1 is now complete. Took a little while because it was underscoped and competing for our attention against other organizational priorities.
https://github.com/sourcecred/sourcecred/pull/3273
https://github.com/sourcecred/sourcecred/pull/3295
https://github.com/sourcecred/sourcecred/pull/3299
https://github.com/sourcecred/sourcecred/pull/3303
https://github.com/sourcecred/sourcecred/pull/3310
https://github.com/sourcecred/sourcecred/pull/3312

Simulator tool using the alpha release as proof of function: https://observablehq.com/@sourcecred/credequate-simulator

blueridger commented 2 years ago

!m1_complete

stx-grant-bot[bot] commented 2 years ago

Thank you for completing M1. The grant committee will review and confirm completion or send feedback within a week

blueridger commented 2 years ago

We will not be proceeding to M2 at this time due to dev attrition. Thank you for supporting us to release M1.

stacksgov / grants-program

SourceCred algorithm improvement for Advocates STX distributions #215
