masa-finance / masa-bittensor

Bittensor Subnet Config
https://masa.ai
MIT License

spike: Improve Bittensor Subnet Reward System #225

Closed teslashibe closed 2 weeks ago

teslashibe commented 3 weeks ago

Problem Statement

Currently, our Bittensor subnet's reward system for Twitter data scraping is based solely on whether the provided data is valid or not. This binary approach doesn't incentivize miners to continuously improve the quantity and quality of data they provide. As a result, we may be missing opportunities to gather more valuable data and encourage healthy competition among miners.

Proposed Solution

Implement a more sophisticated reward mechanism that takes into account multiple factors:

  1. Quantity of valid tweets (we should not verify every tweet)
  2. Quality of tweets (this needs thought)
  3. Continuous improvement over time
  4. A kurtosis-based distribution to reward top performers while maintaining broader participation

Goals

Key Considerations

Acceptance Criteria

Example Implementation

import numpy as np
from scipy.stats import kurtosis

class MinerData:
    def __init__(self, miner_id):
        self.miner_id = miner_id
        self.tweet_counts = []
        self.quality_scores = []

class IncentiveMechanism:
    def __init__(self, n_epochs=10, kurtosis_factor=2):
        self.miners = {}
        self.n_epochs = n_epochs
        self.kurtosis_factor = kurtosis_factor

    def add_miner_data(self, miner_id, tweet_count, quality_score):
        if miner_id not in self.miners:
            self.miners[miner_id] = MinerData(miner_id)

        miner = self.miners[miner_id]
        miner.tweet_counts.append(tweet_count)
        miner.quality_scores.append(quality_score)

        # Keep only the last n_epochs of data
        if len(miner.tweet_counts) > self.n_epochs:
            miner.tweet_counts = miner.tweet_counts[-self.n_epochs:]
            miner.quality_scores = miner.quality_scores[-self.n_epochs:]

    def calculate_miner_score(self, miner):
        avg_tweets = np.mean(miner.tweet_counts)
        avg_quality = np.mean(miner.quality_scores)

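        # Calculate improvement factor from the relative change between the
        # oldest and newest epoch in the window, so tweet counts (hundreds)
        # and quality scores (0-1) contribute on the same scale
        if len(miner.tweet_counts) > 1:
            tweet_improvement = (miner.tweet_counts[-1] - miner.tweet_counts[0]) / max(miner.tweet_counts[0], 1)
            quality_improvement = (miner.quality_scores[-1] - miner.quality_scores[0]) / max(miner.quality_scores[0], 1e-9)
            # Clamp at zero so a declining miner cannot end up with a negative score
            improvement_factor = max(0.0, 1 + (tweet_improvement + quality_improvement) / 2)
        else:
            improvement_factor = 1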

        return avg_tweets * avg_quality * improvement_factor

    def distribute_rewards(self, total_reward):
        scores = [self.calculate_miner_score(miner) for miner in self.miners.values()]

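        # Apply the "kurtosis" transformation: raising scores to a power > 1
        # gives top performers a disproportionately larger share
        transformed_scores = np.power(scores, self.kurtosis_factor)
        total_score = sum(transformed_scores)

        rewards = {}
        for miner, score in zip(self.miners.values(), transformed_scores):
            # Guard against division by zero when every score is zero
            rewards[miner.miner_id] = (score / total_score) * total_reward if total_score > 0 else 0.0

        return rewards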

# Example usage
def simulate_network(n_miners=5, n_epochs=15, total_reward=1000):
    mechanism = IncentiveMechanism(n_epochs=10, kurtosis_factor=2)

    for epoch in range(n_epochs):
        print(f"Epoch {epoch + 1}")
        for miner_id in range(n_miners):
            tweet_count = np.random.randint(100, 1000)
            quality_score = np.random.uniform(0.5, 1.0)
            mechanism.add_miner_data(miner_id, tweet_count, quality_score)
            print(f"  Miner {miner_id}: Tweets = {tweet_count}, Quality = {quality_score:.2f}")

        if epoch == n_epochs - 1:  # Distribute rewards on the last epoch
            rewards = mechanism.distribute_rewards(total_reward)
            print("\nFinal Rewards:")
            for miner_id, reward in rewards.items():
                print(f"  Miner {miner_id}: Reward = {reward:.2f}")

            reward_values = list(rewards.values())
            print(f"\nReward distribution kurtosis: {kurtosis(reward_values):.2f}")
            print(f"Reward distribution stats: Min = {min(reward_values):.2f}, Max = {max(reward_values):.2f}, Mean = {np.mean(reward_values):.2f}")

# Run simulation
simulate_network()
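
To see how the exponent shapes the payout, the sketch below applies the same power transformation to a fixed set of scores for a few exponents. The function name `reward_shares` and the sample scores are illustrative assumptions, not part of the subnet code.

```python
import numpy as np

def reward_shares(scores, exponent):
    """Return each miner's fraction of the total reward after a power transform."""
    powered = np.power(np.asarray(scores, dtype=float), exponent)
    total = powered.sum()
    if total == 0:
        # No miner scored anything, so nobody receives a share
        return np.zeros_like(powered)
    return powered / total

if __name__ == "__main__":
    sample_scores = [100.0, 200.0, 300.0, 400.0]  # hypothetical miner scores
    for exponent in (1, 2, 4):
        shares = reward_shares(sample_scores, exponent)
        print(f"exponent={exponent}: top share={shares[-1]:.3f}, bottom share={shares[0]:.3f}")
```

A larger exponent moves more of the reward to the highest-scoring miner, so the value chosen for `kurtosis_factor` is effectively the dial between rewarding top performers and keeping broad participation.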

Integration with Bittensor Subnet

Example usage:

  1. Implement quality scoring for tweets:
def calculate_tweet_quality(tweet):
    # Implement your quality metrics here
    # For example:
    relevance_score = assess_relevance(tweet)
    uniqueness_score = assess_uniqueness(tweet)
    engagement_score = assess_engagement_potential(tweet)

    return (relevance_score + uniqueness_score + engagement_score) / 3
  2. Modify forward function to use the new mechanism:
import numpy as np
import torch

class Subnet(torch.nn.Module):
    def forward(self, tweets):
        valid_tweets = self.validate_tweets(tweets)
        tweet_count = len(valid_tweets)
        # Average per-tweet quality; fall back to 0.0 when no tweets are valid
        quality_score = np.mean([calculate_tweet_quality(tweet) for tweet in valid_tweets]) if tweet_count else 0.0

        # Add data to the incentive mechanism
        self.incentive_mechanism.add_miner_data(self.miner_id, tweet_count, quality_score)
  3. Update reward distribution logic:
def distribute_rewards(self):
    total_stake = sum(self.stakes.values())
    rewards = self.incentive_mechanism.distribute_rewards(total_stake)

    for miner_id, reward in rewards.items():
        self.update_stake(miner_id, reward)

Future Ideas

This creates a more dynamic and competitive environment for miners in our Bittensor subnet. By considering the quantity and quality of tweets, as well as miners' improvement over time, we expect to see an increase in the overall value of data collected. The kurtosis-based distribution will reward top performers while still maintaining incentives for a broader range of participants.

teslashibe commented 3 weeks ago

We need to account for data quality without constraining the universe of possible outcomes or limiting and biasing selection. One approach is to keep data scraping open and use a proxy for quality, such as downloads from HF (Hugging Face).
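
One way to read the proxy idea, as a rough sketch only: turn per-dataset download counts into a 0-1 weight. The function name `quality_from_downloads`, the log scaling, and the input shape are all assumptions made for illustration. The counts are passed in directly, so nothing here depends on how they are fetched.

```python
import math

def quality_from_downloads(download_counts):
    """Map {dataset_name: downloads} to a 0-1 proxy quality weight.

    log1p dampens the scale, so a dataset with 10x the downloads
    does not receive 10x the weight.
    """
    if not download_counts:
        return {}
    logged = {name: math.log1p(max(count, 0)) for name, count in download_counts.items()}
    top = max(logged.values())
    if top == 0:
        # Every dataset has zero downloads, so the proxy carries no signal
        return {name: 0.0 for name in logged}
    return {name: value / top for name, value in logged.items()}
```

Because the weight is relative to the most-downloaded dataset, it ranks datasets against each other rather than judging the content of any tweet, which keeps the scraping itself unconstrained.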

teslashibe commented 3 weeks ago

Refined Initial Solution: v0.8.0 release

Implement a more sophisticated reward mechanism that takes multiple factors into account and rewards higher-performing miners:

  1. Quantity of valid tweets, defined as the count returned by a request to the GET Tweets/Posts endpoint.
  2. A validator continuously requests data from miners as synthetic request jobs, drawn from a centralized queue for now.
  3. Organic requests can still be sent to the network through a validator.
  4. Round-robin miner selection: if a miner returns 429 or reports no eligible workers, skip to the next miner. The drawback is that latency will increase; we will worry about latency later. Set a parameter or constant, which we can change, for the number of miners tried in the round robin before returning "no miners available to fulfill the request".
  5. Use a static, representative JSON comparison via the LLM, i.e. comparison > 0.50.
  6. A kurtosis-based distribution to reward top performers while maintaining broader participation.
  7. Generate a Jupyter notebook showing the curves for the new kurtosis-based distribution.
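
The round-robin rule in item 4 could be sketched roughly as follows. Everything named here is hypothetical: `RoundRobinSelector`, `MAX_MINERS_PER_REQUEST`, `NoMinersAvailable`, and the `send_request` callable, which is assumed to return a `(status_code, payload)` pair. Treating any non-200 status or empty payload as "skip this miner" is also an assumption.

```python
MAX_MINERS_PER_REQUEST = 3  # assumed tunable constant: miners tried before giving up

class NoMinersAvailable(Exception):
    """Raised when no miner in the attempted window can serve the request."""

class RoundRobinSelector:
    def __init__(self, miner_ids, max_attempts=MAX_MINERS_PER_REQUEST):
        self.miner_ids = list(miner_ids)
        self.max_attempts = max_attempts
        self._next = 0  # index of the miner to try next, kept across requests

    def dispatch(self, send_request):
        """Try miners in rotation; return (miner_id, payload) from the first success."""
        attempts = min(self.max_attempts, len(self.miner_ids))
        for _ in range(attempts):
            miner_id = self.miner_ids[self._next]
            self._next = (self._next + 1) % len(self.miner_ids)
            status, payload = send_request(miner_id)
            if status == 200 and payload:
                return miner_id, payload
            # 429, another error, or an empty payload: skip to the next miner
        raise NoMinersAvailable("no miners available to fulfill the request")
```

Each skipped miner adds one more request to the round trip, which is where the latency cost mentioned above comes from. The attempt limit bounds that cost.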

Next iteration: future release

  1. Think about how the similarity check can be compared to a source of truth for 'quality'
  2. Quality of tweets (this needs thought)
  3. Continuous improvement over time

How can miners self-optimize for higher rewards

Miners can work independently to set up private clusters/networks using the Masa Protocol, each configured as its own private network with its own set of Twitter Scraper Nodes. This lets miners optimize and grow their capacity on the network, which benefits the subnet because it rewards the miners that bring the greatest capacity. Validators harness this capacity to scrape static data sets, which are ranked by download volume on HF, and to serve organic requests that developers submit to validator API endpoints for real-time data access.

Luka-Loncar commented 3 weeks ago

@teslashibe can I close this spike now since we have a solution proposal and proceed with creation of follow up tasks?

teslashibe commented 3 weeks ago

@Luka-Loncar keep it open until we link the implementation ticket - I have moved it to in-review for now. We might want to move this to a v0.8.0 release on the roadmap and push back v1.0.0, lmk your thoughts. We can then cut a release with this and @hide-on-bush-x @grantdfoster's miner axon request bug fix ✋

Target for this release is next Monday - 2nd September

Thanks

Luka-Loncar commented 3 weeks ago

Ok, I will create v0.8 release card and add this and ping miner ticket into it. It can be small but meaningful release.

teslashibe commented 3 weeks ago

> Ok, I will create v0.8 release card and add this and ping miner ticket into it. It can be small but meaningful release.

Great, exactly - let's target Monday for cutting v0.8.0

grantdfoster commented 3 weeks ago

Adding PR / branch here for tracking #227

teslashibe commented 3 weeks ago

@grantdfoster moved this spike to done ✅