Closed mudler closed 2 months ago
@restevens402 I moved this here for branch: https://github.com/masa-finance/masa-oracle/pull/504
The goal of this spike is to replace the existing actor framework for work distribution with a more streamlined and efficient libp2p-based worker distribution system. This transition aims to simplify the codebase, reduce overhead, and leverage the capabilities of libp2p for peer-to-peer communication.
The current system uses an actor framework to manage and distribute work among various workers. This approach, while functional, introduces complexity and overhead that can be mitigated by using libp2p for direct peer-to-peer communication. The new system will utilize the worker_manager.go file to handle work distribution, execution, and response collection.
1. WorkHandlerManager:
Manages the registration and execution of work handlers.
Tracks execution metrics such as call count and total runtime.
Distributes work to eligible workers, either remote or local.
2. WorkRequest and WorkResponse:
Defines the structure for work requests and responses.
Includes fields for work type, request ID, data, and error handling.
3. Libp2p Integration:
Uses libp2p streams for communication between nodes.
Handles connection establishment, request sending, and response reading.
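As a rough illustration of the request and response structures listed above, here is a minimal sketch. The field names, JSON tags, and the choice of JSON as the wire encoding are assumptions made for this example; the actual definitions live in the PR and may differ. `roundTrip` is a hypothetical helper that only shows the encode and decode halves together.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WorkRequest describes one unit of work. Field names and JSON tags are
// assumptions for illustration, not the project's actual definitions.
type WorkRequest struct {
	WorkType  string          `json:"work_type"`
	RequestID string          `json:"request_id"`
	Data      json.RawMessage `json:"data"`
}

// WorkResponse carries either result data or an error message.
type WorkResponse struct {
	WorkType  string          `json:"work_type"`
	RequestID string          `json:"request_id"`
	Data      json.RawMessage `json:"data,omitempty"`
	Error     string          `json:"error,omitempty"`
}

// roundTrip encodes a request to JSON and decodes it again, mirroring what
// the sending node and the receiving node would each do with one half.
func roundTrip(req WorkRequest) (WorkRequest, error) {
	raw, err := json.Marshal(req)
	if err != nil {
		return WorkRequest{}, err
	}
	var out WorkRequest
	err = json.Unmarshal(raw, &out)
	return out, err
}

func main() {
	req := WorkRequest{
		WorkType:  "web",
		RequestID: "req-1",
		Data:      json.RawMessage(`{"url":"https://example.com"}`),
	}
	got, err := roundTrip(req)
	fmt.Println(got.WorkType, got.RequestID, string(got.Data), err)
}
```

Keeping `Data` as raw JSON lets each work type define its own payload without the envelope needing to know about it.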
1. WorkHandlerManager Setup:
Registers work handlers based on configuration settings.
Uses a map to store handlers and their associated metrics.
2. Work Distribution:
DistributeWork method selects eligible workers and attempts to send work to remote workers first.
Falls back to local execution if no remote workers are available or successful.
3. Work Execution:
ExecuteWork method finds the appropriate handler and executes the work.
Uses a context with a timeout to ensure timely execution.
Tracks execution metrics and handles errors.
4. Libp2p Communication:
sendWorkToWorker method establishes a connection to a remote worker and sends the work request.
Reads the response from the stream and unmarshals it into a WorkResponse.
5. Stream Handling:
HandleWorkerStream method reads work requests from the stream, executes the work, and writes the response back to the stream.
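The execution and stream-handling steps above can be sketched with the standard library alone. This is not the code from the PR: `Manager`, `handlerEntry`, `HandlerFunc`, and the JSON encoding are assumptions, and the stream is modelled as a plain `io.ReadWriter` so the example runs without a libp2p host. A libp2p stream could be passed in its place, since it is also readable and writable.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"strings"
	"sync"
	"time"
)

type WorkRequest struct {
	WorkType  string          `json:"work_type"`
	RequestID string          `json:"request_id"`
	Data      json.RawMessage `json:"data"`
}

type WorkResponse struct {
	RequestID string          `json:"request_id"`
	Data      json.RawMessage `json:"data,omitempty"`
	Error     string          `json:"error,omitempty"`
}

// HandlerFunc is a hypothetical handler signature. Handlers are expected to
// return promptly once ctx is cancelled.
type HandlerFunc func(ctx context.Context, data json.RawMessage) (json.RawMessage, error)

type handlerEntry struct {
	fn           HandlerFunc
	callCount    int64
	totalRuntime time.Duration
}

// Manager maps work types to handlers and their metrics (names are assumed).
type Manager struct {
	mu       sync.Mutex
	handlers map[string]*handlerEntry
	timeout  time.Duration
}

// ExecuteWork looks up the handler, runs it under a timeout, and records metrics.
func (m *Manager) ExecuteWork(ctx context.Context, req WorkRequest) WorkResponse {
	m.mu.Lock()
	h, ok := m.handlers[req.WorkType]
	m.mu.Unlock()
	if !ok {
		return WorkResponse{RequestID: req.RequestID, Error: "no handler for " + req.WorkType}
	}
	ctx, cancel := context.WithTimeout(ctx, m.timeout)
	defer cancel()
	start := time.Now()
	out, err := h.fn(ctx, req.Data)
	m.mu.Lock()
	h.callCount++
	h.totalRuntime += time.Since(start)
	m.mu.Unlock()
	resp := WorkResponse{RequestID: req.RequestID, Data: out}
	if err != nil {
		resp.Error = err.Error()
	}
	return resp
}

// HandleWorkerStream reads one request, executes it, and writes the response.
func (m *Manager) HandleWorkerStream(ctx context.Context, s io.ReadWriter) error {
	var req WorkRequest
	if err := json.NewDecoder(s).Decode(&req); err != nil {
		return fmt.Errorf("read request: %w", err)
	}
	return json.NewEncoder(s).Encode(m.ExecuteWork(ctx, req))
}

// demo feeds one request through an in-memory stream and returns the decoded
// response together with the handler's call count.
func demo() (WorkResponse, int64, error) {
	echo := func(_ context.Context, d json.RawMessage) (json.RawMessage, error) { return d, nil }
	m := &Manager{timeout: 2 * time.Second, handlers: map[string]*handlerEntry{"echo": {fn: echo}}}
	in := strings.NewReader(`{"work_type":"echo","request_id":"r1","data":{"msg":"hi"}}`)
	var out bytes.Buffer
	stream := struct {
		io.Reader
		io.Writer
	}{in, &out}
	if err := m.HandleWorkerStream(context.Background(), stream); err != nil {
		return WorkResponse{}, 0, err
	}
	var resp WorkResponse
	err := json.Unmarshal(out.Bytes(), &resp)
	return resp, m.handlers["echo"].callCount, err
}

func main() {
	resp, calls, err := demo()
	fmt.Println(resp.RequestID, string(resp.Data), calls, err)
}
```

Note that the timeout here only takes effect if the handler honours the context; a handler that ignores `ctx` would still run to completion.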
Steps:
1. Configure Multiple Nodes Locally:
- Set up multiple instances of your application on different ports or machines within your local network.
- Ensure each node is configured to recognize the others as peers.
2. Distribute Work:
- Use the DistributeWork function to send work requests from one node to another.
- Verify that the work is correctly received and processed, and that the response is sent back.
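Before wiring up real nodes, the routing decision these steps exercise (try remote workers first, then fall back to local execution) can be checked in isolation. The sketch below is an assumption-heavy stand-in: `Worker` and its `Do` field are hypothetical, and the real `DistributeWork` in worker_manager.go may select and contact workers differently.

```go
package main

import (
	"errors"
	"fmt"
)

type WorkRequest struct {
	WorkType  string
	RequestID string
	Data      []byte
}

type WorkResponse struct {
	RequestID string
	Data      []byte
	Error     string
}

// Worker is a hypothetical stand-in for an eligible peer. Do would wrap a
// libp2p stream for remote workers and a direct call for the local node.
type Worker struct {
	ID      string
	IsLocal bool
	Do      func(WorkRequest) (WorkResponse, error)
}

// DistributeWork tries each remote worker in order and falls back to the
// local worker only if no remote worker returns a successful response.
func DistributeWork(workers []Worker, req WorkRequest) (WorkResponse, error) {
	var local *Worker
	lastErr := errors.New("no eligible workers")
	for i := range workers {
		w := &workers[i]
		if w.IsLocal {
			local = w
			continue
		}
		resp, err := w.Do(req)
		if err == nil && resp.Error == "" {
			return resp, nil
		}
		if err == nil {
			err = errors.New(resp.Error)
		}
		lastErr = fmt.Errorf("worker %s: %w", w.ID, err)
	}
	if local != nil {
		return local.Do(req)
	}
	return WorkResponse{}, lastErr
}

// demo simulates one unreachable remote worker and one local worker.
func demo() (WorkResponse, error) {
	workers := []Worker{
		{ID: "remote-1", Do: func(WorkRequest) (WorkResponse, error) {
			return WorkResponse{}, errors.New("connection refused")
		}},
		{ID: "local", IsLocal: true, Do: func(r WorkRequest) (WorkResponse, error) {
			return WorkResponse{RequestID: r.RequestID, Data: []byte("done locally")}, nil
		}},
	}
	return DistributeWork(workers, WorkRequest{WorkType: "web", RequestID: "r1"})
}

func main() {
	resp, err := demo()
	fmt.Println(resp.RequestID, string(resp.Data), err)
}
```

In a multi-node setup, the same check applies: stop the remote node and confirm the request is still answered by the local one.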
This card is going to be closed by https://github.com/masa-finance/masa-oracle/pull/504
https://github.com/masa-finance/masa-oracle/pull/504 is merged - closing this one
Problem
To have the oracle functional, a user must:
- Have port 4001 open
These limitations are introduced because we currently use the actor framework to dispatch jobs to the workers, and as an overlay for implementing protocol logic.
This is also problematic in terms of security: access to workers should be gated by authentication to avoid abuse of the workers in the network.
From a code perspective, this is quite limiting too, as it basically requires dialing workers directly without libp2p, undermining the user experience of the oracle by imposing strict connection requirements.
Proposed solution
We can use libp2p to dial workers directly, and implement the protocol as a set of stream handlers on top of libp2p.
Acceptance criteria
- There is an interface (Protocol) which is consumed by the oracle during startup to register stream handling (the interface should at least need Register(node libp2p.Node))
- Having port 4001 open is not needed anymore
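One possible shape for the Protocol interface from the acceptance criteria is sketched below. Only the names `Protocol` and `Register` come from this issue. `Node`, `StreamHandler`, `WorkerProtocol`, `registerAll`, and the protocol ID are hypothetical stand-ins so the example compiles without libp2p; in go-libp2p, the host's `SetStreamHandler` method would play the role of `Node.SetStreamHandler` here.

```go
package main

import (
	"fmt"
	"io"
)

// StreamHandler and Node are hypothetical stand-ins for the libp2p stream
// handler type and the node type named in the issue.
type StreamHandler func(s io.ReadWriteCloser)

type Node interface {
	SetStreamHandler(protocolID string, h StreamHandler)
}

// Protocol is the interface the oracle would consume at startup.
type Protocol interface {
	Register(node Node)
}

// workerProtocolID is an illustrative protocol ID, not the project's.
const workerProtocolID = "/example/worker/0.0.1"

// WorkerProtocol registers the stream handler for worker requests.
type WorkerProtocol struct{}

func (WorkerProtocol) Register(node Node) {
	node.SetStreamHandler(workerProtocolID, func(s io.ReadWriteCloser) {
		defer s.Close()
		// A real handler would decode a WorkRequest from s, execute it,
		// and write a WorkResponse back to s.
	})
}

// registerAll is what the oracle's startup code might do with its protocols.
func registerAll(node Node, protocols ...Protocol) {
	for _, p := range protocols {
		p.Register(node)
	}
}

// fakeNode records registered handlers so the wiring can be inspected.
type fakeNode struct{ handlers map[string]StreamHandler }

func (n *fakeNode) SetStreamHandler(id string, h StreamHandler) { n.handlers[id] = h }

// demo registers one protocol and returns the IDs that ended up on the node.
func demo() []string {
	n := &fakeNode{handlers: map[string]StreamHandler{}}
	registerAll(n, WorkerProtocol{})
	ids := make([]string, 0, len(n.handlers))
	for id := range n.handlers {
		ids = append(ids, id)
	}
	return ids
}

func main() {
	fmt.Println(demo())
}
```

With this shape, adding a new protocol means implementing `Register` and passing the value to the startup call, without touching the dispatch code.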