OriginTrail / OT-RFC-repository


Discussion for OT-RFC-21 Collective Neuro-Symbolic AI #47

Open branarakic opened 2 weeks ago

branarakic commented 2 weeks ago

2024-11-08 21 22 49

Hi everyone,

The latest OT-RFC-21 is out titled "Collective Neuro-Symbolic AI".

This is indeed one of the most consequential Requests for Comments in the long term, and it details:

Link to RFC PDF: https://github.com/OriginTrail/OT-RFC-repository/blob/main/RFCs/OT-RFC-21_Collective_Neuro-Symbolic_AI/OT-RFC-21%20Collective%20Neuro-Symbolic%20AI.pdf

Please use this Github issue to provide feedback.

Trace On!

hottogo commented 2 weeks ago

Great RFC with exciting developments to support the next growth stage.

Below I outline some risks that these changes bring, as well as a logical solution that accomplishes the same end result without those risks.

One key change was to the core node incentive system, with the addition of a publishing factor. As explained, "the more new knowledge has been published via a specific core node (measured in TRAC tokens), the higher the chance of rewards".

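My initial concerns about this are:

* Risk of centralization of control of nodes

* Risk of data concentration on too few nodes

* Unfair staking system that has higher performing nodes that are full so new/other stakers are disadvantaged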

It would be helpful to know the exact factor so we can comment more clearly on how much of a risk this poses. For example, if it only improves node performance by 5%, the impact is low. However, if it gives higher-publishing nodes a 30% performance boost, the impact will be very high.
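To make the 5% versus 30% comparison concrete, here is a minimal sketch. It assumes, purely for illustration, that a node's chance of reward is proportional to a score and that the publishing factor acts as a multiplicative boost on that score. The RFC does not state the actual weighting, and the function name and the ten-node network are hypothetical.

```python
def reward_share(boost: float, n_nodes: int = 10) -> float:
    """Share of rewards won by one boosted node among otherwise identical nodes.

    Assumptions (not from the RFC): reward chance is proportional to score,
    every node has a base score of 1.0, and the publishing factor multiplies
    one node's score by (1 + boost).
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    boosted = 1.0 + boost
    total = boosted + (n_nodes - 1) * 1.0
    return boosted / total


if __name__ == "__main__":
    for boost in (0.0, 0.05, 0.30):
        print(f"boost {boost:.0%}: share of rewards {reward_share(boost):.2%}")
```

Under these assumptions the boosted node's share grows with the boost, and stakers would be expected to follow the higher yield, which is the dynamic described in the next paragraphs.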

The risk is that all staked TRAC shifts to only the highest-publishing nodes. Those nodes quickly fill up and provide a much higher yield than other nodes. Stakers who are too slow or too new to the system don't have a level playing field, because the highest-publishing nodes are already full.

An unfair staking system in which a subset of stakers earns more than others does not align with OriginTrail's core values of Inclusiveness and Neutrality. It would create tension and conflict between those who get to stake on the highest-publishing nodes and those who missed out.

The centralization risk is that the entities with the greatest need for publishing also control the network, by attracting the majority of stake and therefore controlling the nodes that are winning the data. In a hypothetical situation, an entity with high publishing demand could split its publishing between several nodes and gain undue influence on the network. It could potentially manipulate data if it controls all 3 winning nodes, lower or raise the price of publishing data to the network, etc.

A fairer and more logical approach would be to provide a rebate/incentive to the node publisher, not to those who are staking against that node. Those stakers deserve no more TRAC reward than a person staking to another node with similar uptime and stake but less publishing activity.

That way publishers are still incentivized, and the mechanism works much like solar panels as referenced in the RFC, but the staking ecosystem remains fair and equal.

drMurlly commented 2 weeks ago

Firstly, I commend the OriginTrail Core Developers team for the thoughtful development of OT-RFC-21 and for outlining a clear and visionary roadmap for integrating collective neuro-symbolic AI with the Decentralized Knowledge Graph (DKG). The framework demonstrates a deep understanding of how AI and decentralized technology can collaborate to unlock new possibilities in knowledge management, and it’s exciting to see OriginTrail pioneering this area.

However, I have some reservations about the proposed horizontal scaling with DKG Core nodes through the publishing factor. While the intention is to incentivize more knowledge publishing, the current design could inadvertently centralize network activity around high-volume publishers, diminishing the decentralized ethos and network effects that OriginTrail aims to support. As proposed, the publishing factor would operate in a positive correlation, meaning that nodes publishing more knowledge (measured in TRAC tokens) would be increasingly rewarded. This could lead to an imbalanced reward structure, where newer or smaller delegators and node runners face diminished rewards simply due to their lower publishing volumes.

Concerns with the Publishing Factor and Centralization

The publishing factor, as it stands, risks creating a reward mechanism that disproportionately benefits nodes with higher publishing volumes. This approach has several implications:

1. Network Centralization: Rewarding nodes with higher publishing power can create a network where a few larger players dominate, leading to centralization of resources and knowledge control. This could make the network more vulnerable to single points of failure or influence, going against the decentralization principles of blockchain and DKG.

2. Reduced Accessibility for New Delegators: New and smaller delegators would find it increasingly challenging to achieve competitive staking rewards, as they lack the resources to publish knowledge at the same volume as established nodes. This could discourage new delegators from joining, reducing the overall diversity and robustness of the network.

3. Negative Impact on Network Effects: OriginTrail’s strength lies in its decentralized network effects. If the reward system disproportionately favors high-volume publishers, it could reduce overall collaboration, knowledge diversity, and incentivize behavior that prioritizes quantity over quality of knowledge published.

4. Compromising Network Integrity and Fairness: The reliance on publishing as a reward metric could also pressure nodes to publish lower-quality or redundant knowledge to increase rewards, potentially compromising the network’s integrity.

Suggestion: Incentivize High-Publishing Nodes through the Collective Programmatic Treasury (CPT)

A potential solution is to provide additional incentives to high-publishing nodes through the Collective Programmatic Treasury (CPT) program. By allocating rewards for high publishing volumes specifically from the CPT, rather than lowering APRs for all nodes, it would ensure that other nodes and delegators are not indirectly penalized. This approach would create a balanced incentive structure, encouraging more knowledge publishing while maintaining fairness and encouraging broader network participation.

In conclusion, OT-RFC-21 is a groundbreaking step in integrating neuro-symbolic AI with decentralized knowledge management, but the publishing factor should be carefully reassessed to avoid centralization and accessibility issues. By adjusting reward mechanisms, such as through CPT, OriginTrail can create a network where all nodes have equitable opportunities, strengthening the foundation of decentralized knowledge and reinforcing the collaborative spirit at the heart of the project.

DKGunit commented 2 weeks ago

I’ve been a TRAC holder since 2018, and I’m genuinely excited about the project’s direction. It’s been a long journey, but it feels like we’re on the brink of a major breakthrough—like Amazon going from selling books to selling everything! After digesting the information on the publishing factor, though, I do have some concerns as both a staker to a community node and a long-standing community member.

I’m not as technically knowledgeable as some, but I want to add my voice to the points others have raised. If the number of publishings a node handles significantly influences future asset rewards, this could create vulnerabilities, both in terms of centralization and potential manipulation.

I completely understand the goal of attracting major publishers and rewarding them. However, as with any incentive system, there’s a risk of bad actors exploiting it to maximize profits in ways that could harm the DKG. The aim is surely to have valuable, decentralized data flowing through the network, but the current publishing factor approach could encourage mass spamming of low-value data just to increase publishing numbers. It might also channel most traffic through the largest publishers’ nodes.

For instance, if an organization like BSI used the DKG to verify barcode information, it would involve a huge number of publishings. In this case, BSI could operate just a handful of nodes receiving the bulk of the traffic due to their publishing factor bonus. This scenario doesn’t seem to align with the decentralized goals of the network.

These are just my thoughts— I trust the expertise and dedication of the team, and I’m glad to have been along for the journey. Keep up the great work, and thank you! Peace.

SagiOT commented 1 week ago

First of all amazing work! I like the emphasis on builders!

I would go straight to the point and raise 3 concerns I have:

  1. As mentioned above, I see an issue with the updated Core node incentive system. As I understand it, the update tries to address 3 issues: A. The 'hash distance factor', which was raised by the community and is fixed by the update. B. The "publishing price per knowledge asset", which is addressed by scalability and, furthermore, by the 'Node fee' and partly by the 'Publishing factor'. C. 'Accessibility to publishing... without running a blockchain node'. I don't see how the update addresses the last part, but for the 'accessibility to publishing' part I can see how the Publishing factor helps reduce this barrier for a new node.

My concern, in addition to the issues raised in the comments above, is that this 'screws' the early adopters, who stepped up when needed and opened nodes without planning to publish, and who will now find themselves late to the party and outside the relevant nodes.

For this I would like to suggest that the Publishing factor in the formula be divided by an f'(stake size), in such a way that we get the following:

This way we get an incentive for new publishers to join, and we make sure that if a successful one joins, it makes all the sense in the world to add to that one if all the other huge ones are capped. Plus, in a possible scenario in which we have no new publishers for some time but there is demand for nodes, it will still make sense for a group to come together and open a non-publishing node in the hope of getting ahead fast enough before a new publisher comes along. So we improve the 'accessibility to publishing', we pay more to publishers as originally intended ('publishing price per knowledge asset'), and we keep the network decentralized.
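The formula that accompanied this suggestion is not reproduced in the text, so the following is only one possible reading of "divided by f'(stake size)", sketched under stated assumptions. The damping function `1 + stake / stake_scale`, the `stake_scale` value, and the function name are all hypothetical and are not taken from the comment or the RFC.

```python
def damped_publishing_factor(trac_published: float, stake: float,
                             stake_scale: float = 1_000_000.0) -> float:
    """Publishing factor divided by a function that grows with stake.

    Assumption: f(stake) = 1 + stake / stake_scale. The comment only says
    the factor should be divided by some function of stake size; this
    particular choice is illustrative.
    """
    if stake < 0 or stake_scale <= 0:
        raise ValueError("stake must be non-negative and stake_scale positive")
    return trac_published / (1.0 + stake / stake_scale)


if __name__ == "__main__":
    # Two nodes publishing the same amount, one lightly and one heavily staked.
    for stake in (200_000, 2_000_000):
        print(stake, round(damped_publishing_factor(10_000, stake), 2))
```

With any increasing damping function of this kind, the same publishing volume counts for less on a heavily staked node than on a lightly staked one, which is the effect the comment appears to aim for.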

2+3: Regarding the Collective Programmatic Treasury. First, I don't see why this needs to be only for NeuroWeb users, unless the aim is to slowly close down the 'blockchain agnostic' feature, and if that is the plan it should be made clear. Second, the halving mechanism: in the first 1-6 years I can see why this is an amazing incentive for builders to publish data, which is a crucial step. But once the rewards one gets back are less than, say, 5-10%, it is no longer a game changer and will not make a difference anymore. So that final 7.5M TRAC could be used for something with more impact, for example rewarding only new paranets, or rewarding 'readers', the ones who will build great apps that use the data that is already published.
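The "final 7.5M TRAC" figure can be checked with a short sketch of a halving emission. It assumes that each period releases half of whatever remains in the 60MM TRAC treasury; the length of a period is not modelled, and the function name is hypothetical.

```python
def halving_schedule(total: float, periods: int) -> list[tuple[int, float, float]]:
    """Return (period, released, remaining) rows for a halving emission.

    Assumption: every period releases half of what is still in the treasury.
    """
    rows = []
    remaining = total
    for period in range(1, periods + 1):
        released = remaining / 2
        remaining -= released
        rows.append((period, released, remaining))
    return rows


if __name__ == "__main__":
    for period, released, remaining in halving_schedule(60_000_000, 3):
        print(f"period {period}: released {released:,.0f}, remaining {remaining:,.0f}")
```

Under this assumption, three periods release 30M, 15M and 7.5M TRAC, leaving 7.5M in the treasury, which matches the amount the comment proposes to redirect.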

Thanks for asking for our input! Trace on!

haroldboom commented 1 week ago

Great RFC with exciting developments to support the next growth stage.

Below I outline some risks that these changes bring, as well as a logical solution that accomplishes the same end result without those risks.

One key change was to the core node incentive system, with the addition of a publishing factor. As explained, "the more new knowledge has been published via a specific core node (measured in TRAC tokens), the higher the chance of rewards".

My initial concerns about this are:

* Risk of centralization of control of nodes

* Risk of data concentration on too few nodes

* Unfair staking system that has higher performing nodes that are full so new/other stakers are disadvantaged

It would be helpful to know the exact factor so we can comment more clearly on how much of a risk this poses. For example, if it only improves node performance by 5%, the impact is low. However, if it gives higher-publishing nodes a 30% performance boost, the impact will be very high.

The risk is that all staked TRAC shifts to only the highest-publishing nodes. Those nodes quickly fill up and provide a much higher yield than other nodes. Stakers who are too slow or too new to the system don't have a level playing field, because the highest-publishing nodes are already full.

An unfair staking system in which a subset of stakers earns more than others does not align with OriginTrail's core values of Inclusiveness and Neutrality. It would create tension and conflict between those who get to stake on the highest-publishing nodes and those who missed out.

The centralization risk is that the entities with the greatest need for publishing also control the network, by attracting the majority of stake and therefore controlling the nodes that are winning the data. In a hypothetical situation, an entity with high publishing demand could split its publishing between several nodes and gain undue influence on the network. It could potentially manipulate data if it controls all 3 winning nodes, lower or raise the price of publishing data to the network, etc.

* Solution

A fairer and more logical approach would be to provide a rebate/incentive to the node publisher, not to those who are staking against that node. Those stakers deserve no more TRAC reward than a person staking to another node with similar uptime and stake but less publishing activity.

That way publishers are still incentivized, and the mechanism works much like solar panels as referenced in the RFC, but the staking ecosystem remains fair and equal.

As a community node runner I 100% support hottogo's solution as it achieves the goals without punishing community nodes. I think the community nodes are essential for the health of the project as they provide the D in DKG and also keep people engaged with the project especially the stakers.

Valcyclovir commented 1 week ago

It's been incredible to watch OriginTrail evolve over the years. RFC-21 could very well be the most defining moment for our ecosystem moving forward. Team OTHub is eager to dive into the details of this RFC and offer our thorough insights.

First, let's break down the RFC into more digestible pieces:

the DKG Edge Node Inception Program budget of 750k TRAC is dedicated to builders launching paranets on both the V6 and V8 mainnet, with up to 100k TRAC per builder available as reimbursement for TRAC used for publishing to a particular paranet.

This is a great program to incentivize builders launching new paranets. However, there is a small overlap between this program and the CPT distribution. We hope the team can provide more details and examples on how these 2 incentives play together, or perhaps re-evaluate the ratio between them.

the community of node operators has been indicating the hash distance factor as the most problematic one, causing randomization and impacting the system in an unpredictable and asymmetric way

As always, we truly appreciate how the core dev team responds to community feedback and understand our concerns. We are glad the distance factor will be deprecated on V8, which means we need a new score calculation.

Publishing factor, in positive correlation — the more new knowledge has been published via a specific core node (measured in TRAC tokens), the higher the chance of rewards,


This is a great replacement to the V6 incentive formula by removing the distance factor. We believe that publishing nodes should be rewarded, but we are not fully convinced that publishing factor should affect reward chances. If it is included, the impact should be less significant than what we are seeing with distance factor right now. In our view, the incentive formula should prioritize a fair, balanced distribution of rewards to all node operators, create a more predictable APR across the whole network and maintain stability and predictability for publishers, node runners and delegators.

To achieve that those who use the network have incentives to build it in the future, the future development fund will be deployed as a 60MM TRAC Collective Programmatic Treasury (CPT)

We are lacking details on how often the distribution will occur. Will it happen automatically on a fixed schedule (daily, weekly, monthly, yearly), or is it bound to a contract that each publisher needs to call?

The TRAC released from Collective Programmatic Treasury will be dedicated to (both conditions should be fulfilled) those who: use TRAC tokens for publishing knowledge (paranets spending the most TRAC for publishing knowledge), AND have been confirmed eligible for incentives by the community (paranets who have completed successful IPOs and are deployed on NeuroWeb).

Base chain builders are eligible to receive NEURO incentives should they initiate an IPO. Therefore, we also believe that all builders should benefit from the CPT program regardless of the chain selected. CPT should not be limited to NeuroWeb, as the dev fund was created to incentivize growth of the entire DKG ecosystem.

Not every paranet on NeuroWeb is by default eligible for the TRAC dev fund emissions. In order to achieve that status, a paranet must have been voted in via the IPO process, gaining support by the NeuroWebAI community through a NEURO on-chain governance vote.

We believe that the TRAC dev fund emissions should not be bound to NeuroWeb on-chain governance, and that they should be distributed to any and all builders who publish with TRAC on the DKG. NeuroWeb on-chain governance is still very young, and its purpose should be the emission of the NEURO token only, not the TRAC dev fund.

In conclusion, we truly appreciate the effort and commitment of the core dev team on this extensive RFC. We strongly encourage the team to host AMAs with the community to discuss this RFC at length and provide answers to our questions.

Regards, Team OTHub

Valcyclovir commented 1 week ago

I personally agree with the stance of OTHub above, but on a personal level, I would like the team to consider an extra parameter on the reward chance formula.

Node fee (formerly “ask”), in negative correlation — the nodes with lower fees are positively impacting the system scalability, and therefore have a higher chance of rewards.

I believe that this new node fee factor might trigger a race to the bottom too quickly, before the network is mature enough. I suggest adding a new parameter to the node reward chance formula: suggested node fee. The most important growth metric of the DKG is Total Network Revenue (in TRAC). Therefore, the suggested node fee should self-regulate and drop in a predictable manner as the network matures.

Example:

Tier 1: 0 - 100 million network revenue

Tier 2: 100 million - 1 billion network revenue

Tier 3: 1 billion - 1 trillion network revenue

As the total network revenue grows within the same tier, APR goes up, resulting in an increased number of nodes and expanded delegation. As the network revenue reaches the next tier, rewards drop to stimulate the next phase of growth, and so on. The new suggested reward chance formula is as follows:

reward_chance = uptime * (stake - abs(node_fee - suggested_node_fee))

This new formula favors cooperation between publishers and node runners in favor of predictability while penalizing outliers on both ends. This formula outlines an approach that considers total network revenue in the calculation of a suggested node fee.
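A minimal sketch of the formula above may help show how it penalizes outliers. The revenue boundaries follow the tier example; the fee attached to each tier, the normalised scale of `stake` and `node_fee`, and the function names are assumptions made for illustration, since the comment does not specify them.

```python
# Revenue boundaries follow the tier example above; the fee attached to each
# tier is a made-up, normalised value used only for illustration.
TIERS = [
    (100_000_000, 0.6),          # Tier 1: up to 100 million TRAC revenue
    (1_000_000_000, 0.4),        # Tier 2: up to 1 billion
    (1_000_000_000_000, 0.2),    # Tier 3: up to 1 trillion
]


def suggested_node_fee(network_revenue: float) -> float:
    """Return the (hypothetical) suggested fee for the current revenue tier."""
    for upper_bound, fee in TIERS:
        if network_revenue < upper_bound:
            return fee
    return TIERS[-1][1]  # beyond the last tier, keep the lowest fee


def reward_chance(uptime: float, stake: float, node_fee: float,
                  network_revenue: float) -> float:
    """reward_chance = uptime * (stake - abs(node_fee - suggested_node_fee))."""
    deviation = abs(node_fee - suggested_node_fee(network_revenue))
    return uptime * (stake - deviation)


if __name__ == "__main__":
    revenue = 500_000_000  # falls in Tier 2, where the suggested fee is 0.4
    for fee in (0.2, 0.4, 0.6):
        print(fee, round(reward_chance(1.0, 0.8, fee, revenue), 3))
```

In this sketch a node charging exactly the suggested fee scores highest, and nodes that undercut or overcharge by the same amount are penalized equally. For the subtraction to be meaningful, stake and fee would need to be expressed on comparable scales, which the formula as written leaves open.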

Regards, BRX

harveysimpsondata commented 5 days ago

OT-RFC-21

Firstly, I want to commend the OriginTrail Core Developers team for their hard work and visionary efforts in developing OT-RFC-21. The inclusion of the publishing factor and other enhancements clearly indicates a deep commitment to advancing the decentralized knowledge graph (DKG) and fostering growth. While I appreciate the intention behind these changes, I want to address some concerns about decentralization and propose adjustments to ensure the reward system remains equitable, inclusive, and aligned with OriginTrail’s core values.

Concerns with the Publishing Factor

The publishing factor, as currently proposed, introduces risks that may undermine the decentralized ethos of the network:

1. Centralization of Control:


Proposed Adjustments to the Formula

I propose modifications to the reward chance formula to address these concerns while preserving the incentive for publishing:

Stake Factor

$$ \mathit{nodeStake} = (\mathit{nodeStake})^\gamma $$

Publishing Factor with Diminishing Returns

$$ \mathit{publishingFactor} = \left(\frac{\mathit{nodeAvgTracSpent}}{\mathit{networkAvgTracSpent}}\right)^{\alpha_{PF}} + \log_2\left(1 + \frac{\mathit{assetSizeKB}}{\mathit{networkAvgSizeKB}}\right) $$

Revised Reward Chance Formula

$$ \mathit{rewardChance} = f_1(\mathit{uptime}, t) \cdot \left( f_2(\mathit{nodeStake}, t)^\gamma + f_3(\alpha\cdot\mathit{publishingFactor}, t) - f_4(\mathit{nodeFee}, t) \right) $$
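The formulas above can be sketched in code as follows. This is a simplified reading, not the implementation behind the simulation: it assumes the functions f_1 to f_4 are identities, ignores the time argument t, and uses placeholder exponents (`alpha_pf`, `gamma`) and weight (`alpha`), none of which are specified in the comment.

```python
import math


def publishing_factor(node_avg_trac_spent: float, network_avg_trac_spent: float,
                      asset_size_kb: float, network_avg_size_kb: float,
                      alpha_pf: float = 0.5) -> float:
    """Publishing factor with diminishing returns.

    With an exponent alpha_pf below 1 (placeholder value), extra TRAC spent
    raises the factor less and less.
    """
    spend_term = (node_avg_trac_spent / network_avg_trac_spent) ** alpha_pf
    size_term = math.log2(1.0 + asset_size_kb / network_avg_size_kb)
    return spend_term + size_term


def reward_chance(uptime: float, node_stake: float, pub_factor: float,
                  node_fee: float, gamma: float = 0.5, alpha: float = 1.0) -> float:
    """Revised reward chance, assuming f_1..f_4 are identity functions."""
    return uptime * (node_stake ** gamma + alpha * pub_factor - node_fee)


if __name__ == "__main__":
    # Inputs are expressed relative to network averages for illustration.
    average = publishing_factor(1.0, 1.0, 1.0, 1.0)
    heavy = publishing_factor(4.0, 1.0, 1.0, 1.0)
    print(average, heavy)
    print(reward_chance(1.0, 4.0, average, 0.0))
```

With these placeholder values, a node spending four times the network average sees its spend term double rather than quadruple, which is the diminishing-returns behaviour the exponent is meant to produce.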


Suggestions for Improvement

1. Incorporate Uptime and Node Age:

2. Publish Incentives via the Collective Programmatic Treasury (CPT):


Simulation

To better understand the effects of the revised reward chance formula, I implemented a simulation using constants derived from network averages available on othub.io/analytics. The goal of the simulation was to analyze how reward chances vary with different node stakes and TRAC spent values.

Constants Used:

Simulation Results

Below is the resulting plot from the simulation, showing how the reward chance changes with different node stakes and TRAC spent values.

Figure: Reward Chance vs. Node Stake for Different Node's Average TRAC Spent Values

Conclusion

These proposed adjustments to the reward chance formula are my suggestions for how decentralization might be further supported in OriginTrail's evolving ecosystem. While the current OT-RFC-21 provides a strong foundation for incentivizing publishing and staking activity, the exact formulas for each factor—such as the publishing factor and stake factor—aren’t explicitly detailed in the document. This leaves room for interpretation and experimentation to refine the balance between decentralization, fairness, and network growth.

I recognize that my suggestions are not perfect and may miss critical elements or introduce unintended consequences. Designing an equitable reward system for a decentralized network as complex as OriginTrail is a challenging task, and I fully acknowledge that there are likely gaps in my approach. Despite these limitations, my aim is to foster a discussion that explores how we can further enhance inclusivity and decentralization, ensuring that smaller or newer nodes remain competitive while rewarding meaningful contributions from all participants.

I also want to commend the OriginTrail team for their extraordinary efforts in preparing for V8. The thought and care put into this RFC demonstrate a clear commitment to the growth and success of the DKG ecosystem. Their work continues to push the boundaries of what decentralized knowledge networks can achieve, and I am excited to see how these proposed changes will shape the future of OriginTrail.

Ultimately, I hope these suggestions contribute to a constructive dialogue within the community and provide useful input for refining the reward system. I look forward to seeing the continued innovation and collaboration that have made OriginTrail such a transformative project.