masa-finance / masa-oracle

Masa Oracle: Decentralized Data Protocol 🌐
https://developers.masa.ai/docs/masa-protocol/welcome
MIT License

feat(twitter): Enhanced Twitter Worker Selection Algorithm #591

teslashibe closed this 1 month ago

teslashibe commented 1 month ago

Description

This PR implements an improved worker selection algorithm for Twitter tasks in the Masa Oracle project. The goal is to prioritize high-performing workers while still distributing work fairly across them.

Key Changes

1. Modified GetEligibleWorkers function (pkg/workers/worker_selection.go)

2. New getTwitterWorkers function (pkg/workers/worker_selection.go)

3. New calculatePoolSize function (pkg/workers/worker_selection.go)

4. New SortNodesByTwitterReliability function (pkg/pubsub/node_event_tracker.go)

5. Updated NodeEventTracker.GetEligibleWorkerNodes (pkg/pubsub/node_event_tracker.go)

6. Enhanced logging

Implementation Details

Twitter Worker Selection Process

  1. Get eligible worker nodes from NodeTracker.GetEligibleWorkerNodes(category)
  2. For the Twitter category:
    a. Calculate pool size using calculatePoolSize
    b. Select top performers based on the calculated pool size
    c. Shuffle the selected top performers
    d. Create Worker objects from the shuffled pool, respecting the original limit
  3. For other categories:
    • Return all eligible workers without modification
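
The Twitter branch of the process above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the Node type, the selectTwitterWorkers name, and in particular the pool-size rule (three times the limit, capped at the number of eligible nodes) are hypothetical, since the description does not say how calculatePoolSize works. The input is assumed to be already sorted from most to least reliable.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Node is a hypothetical stand-in for an eligible worker node.
type Node struct {
	PeerID string
}

// calculatePoolSize uses an ASSUMED rule (not taken from the PR): three times
// the requested limit, capped at the number of eligible nodes.
func calculatePoolSize(total, limit int) int {
	if total <= 0 || limit <= 0 {
		return 0
	}
	size := limit * 3
	if size > total {
		size = total
	}
	return size
}

// selectTwitterWorkers keeps the top poolSize nodes of an already sorted
// slice, shuffles that pool, and returns at most limit nodes from it.
// If rng is nil, the package-level random source is used.
func selectTwitterWorkers(sorted []Node, limit int, rng *rand.Rand) []Node {
	if limit <= 0 {
		return nil
	}
	poolSize := calculatePoolSize(len(sorted), limit)
	pool := make([]Node, poolSize)
	copy(pool, sorted[:poolSize]) // copy so the caller's ordering is untouched

	shuffle := rand.Shuffle
	if rng != nil {
		shuffle = rng.Shuffle
	}
	shuffle(len(pool), func(i, j int) { pool[i], pool[j] = pool[j], pool[i] })

	if limit > len(pool) {
		limit = len(pool)
	}
	return pool[:limit]
}

func main() {
	nodes := make([]Node, 10)
	for i := range nodes {
		nodes[i] = Node{PeerID: fmt.Sprintf("peer-%02d", i)}
	}
	// With limit 2 the assumed pool holds the 6 best nodes; 2 are picked at random.
	for _, n := range selectTwitterWorkers(nodes, 2, rand.New(rand.NewSource(1))) {
		fmt.Println(n.PeerID)
	}
}
```

Shuffling only inside the top pool is what gives the trade-off described above: poorly ranked nodes are excluded, while the best nodes share the work instead of the single top node receiving every request.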

Node Sorting for Twitter Reliability

The SortNodesByTwitterReliability function uses a multi-criteria approach to rank nodes:

  1. Prioritizes nodes with a more recent last returned tweet
  2. Then prefers nodes with a higher number of returned tweets
  3. Then prefers nodes with a longer time since their last timeout
  4. Then prefers nodes with a lower number of timeouts
  5. Then deprioritizes nodes with a more recent last "not found" time
  6. Finally, sorts by PeerId for stability when no performance data is available
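
A multi-criteria ranking like the one listed above is commonly written as a chain of tie-breaking comparisons. The sketch below shows that pattern only and is not the PR's SortNodesByTwitterReliability: the NodeStats type and its field names are hypothetical, and it assumes that a zero timestamp means the event never happened.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// NodeStats is a hypothetical record of per-node Twitter performance data.
// A zero time.Time is assumed to mean "this event never happened".
type NodeStats struct {
	PeerID            string
	LastReturnedTweet time.Time
	ReturnedTweets    int
	LastTimeout       time.Time
	TimeoutCount      int
	LastNotFound      time.Time
}

// sortByTwitterReliability orders nodes from most to least reliable, applying
// each criterion only when all earlier criteria are tied.
func sortByTwitterReliability(nodes []NodeStats) {
	sort.SliceStable(nodes, func(i, j int) bool {
		a, b := nodes[i], nodes[j]
		// 1. More recent last returned tweet first.
		if !a.LastReturnedTweet.Equal(b.LastReturnedTweet) {
			return a.LastReturnedTweet.After(b.LastReturnedTweet)
		}
		// 2. More returned tweets first.
		if a.ReturnedTweets != b.ReturnedTweets {
			return a.ReturnedTweets > b.ReturnedTweets
		}
		// 3. Older (or no) last timeout first: longer since a timeout is better.
		if !a.LastTimeout.Equal(b.LastTimeout) {
			return a.LastTimeout.Before(b.LastTimeout)
		}
		// 4. Fewer timeouts first.
		if a.TimeoutCount != b.TimeoutCount {
			return a.TimeoutCount < b.TimeoutCount
		}
		// 5. Older (or no) last "not found" first.
		if !a.LastNotFound.Equal(b.LastNotFound) {
			return a.LastNotFound.Before(b.LastNotFound)
		}
		// 6. Fall back to the peer ID so the order is deterministic.
		return a.PeerID < b.PeerID
	})
}

func main() {
	now := time.Now()
	nodes := []NodeStats{
		{PeerID: "c"}, // no performance data yet
		{PeerID: "b", LastReturnedTweet: now.Add(-time.Hour), ReturnedTweets: 5},
		{PeerID: "a", LastReturnedTweet: now.Add(-time.Minute), ReturnedTweets: 1},
		{PeerID: "d", LastReturnedTweet: now.Add(-time.Hour), ReturnedTweets: 9},
	}
	sortByTwitterReliability(nodes)
	for _, n := range nodes {
		fmt.Println(n.PeerID)
	}
}
```

Because Go's zero time.Time precedes any real timestamp, a node that has never timed out ranks as if its last timeout were infinitely long ago, and a node that has never returned a tweet ranks last on the first criterion.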

Benefits

TODO

Testing