fiatrete / OpenDAN-Personal-AI-OS

OpenDAN is an open source Personal AI OS, which consolidates various AI modules in one place for your personal use.
https://opendan.ai
MIT License

Computing resource scheduling. #36

Open streetycat opened 10 months ago

streetycat commented 10 months ago

I have put together a rough definition and design of the computing resource module, and I hope you will join the discussion so we can reach a consensus on it.

Compute Node

A computing resource node in the system. It should provide the following functions (see the sketch after this list):

  1. Install and start the services that support computing

  2. Accept computation tasks submitted by users and execute them

  3. Schedule these tasks (tasks may run in parallel or be queued)

  4. Support some preset standard task types, plus task types customized by developers

  5. Expose some computing resources publicly, while others may require authorization
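
To make these responsibilities concrete, here is a minimal Python sketch of such a node. Every name in it (ComputeNode, install_service, post_task, ...) is an assumption for illustration, not the actual OpenDAN API.

```python
import queue
import threading
from typing import Any, Callable, Dict


class ComputeNode:
    """Illustrative compute node: installs services, then accepts and schedules tasks."""

    def __init__(self, node_id: str, node_entry: str, public: bool = True):
        self.node_id = node_id
        self.node_entry = node_entry
        self.public = public                                   # 5. public vs. authorization-required
        self.services: Dict[str, Callable[[dict], Any]] = {}   # 1./4. task type -> installed service entry
        self._tasks: queue.Queue = queue.Queue()               # 3. queued tasks (FIFO)
        self._busy = False

    def install_service(self, task_type: str, service_entry: Callable[[dict], Any]) -> None:
        # 1. Install/start a service; 4. the type may be preset or developer-defined.
        self.services[task_type] = service_entry

    def is_busy(self) -> bool:
        # Used by the scheduler (see the flowchart below) to skip overloaded nodes.
        return self._busy

    def post_task(self, task_type: str, params: dict, on_result: Callable[[Any], None]) -> None:
        # 2./3. Accept a task, queue it, and let a worker thread execute it.
        self._tasks.put((task_type, params, on_result))
        threading.Thread(target=self._run_next, daemon=True).start()

    def _run_next(self) -> None:
        task_type, params, on_result = self._tasks.get()
        self._busy = True
        try:
            on_result(self.services[task_type](params))
        finally:
            self._busy = False
```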

Compute Task Manager

The singleton component responsible for managing computing resources in the system. It should provide the following functions (see the sketch after this list):

  1. Accept registration of Compute Nodes

  2. Accept computation tasks submitted by users and select appropriate nodes to execute them

  3. Maintain load balancing among the computing nodes
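
A matching sketch of the manager, building on the ComputeNode sketch above. Again, the names are assumptions, and the round-robin selection is only a stand-in for real load balancing.

```python
import itertools
import threading
from typing import Dict, List, Optional


class ComputeTaskManager:
    """Illustrative singleton manager: registers nodes and dispatches tasks to them."""

    def __init__(self):
        self.nodes: Dict[str, ComputeNode] = {}        # node_id -> node
        self.services: Dict[str, List[str]] = {}       # task type -> node_ids supporting it
        self._cursors = {}                             # task type -> round-robin cursor over node_ids

    def register_node(self, node: ComputeNode) -> None:
        # 1. Accept registration of a Compute Node and index its services by type.
        self.nodes[node.node_id] = node
        for task_type in node.services:
            self.services.setdefault(task_type, []).append(node.node_id)
            self._cursors[task_type] = itertools.cycle(self.services[task_type])

    def _select_node(self, task_type: str) -> Optional[ComputeNode]:
        # 3. Naive load balancing: rotate through capable nodes, preferring idle ones.
        candidates = self.services.get(task_type, [])
        for _ in range(len(candidates)):
            node = self.nodes[next(self._cursors[task_type])]
            if not node.is_busy():
                return node
        return self.nodes[candidates[0]] if candidates else None

    def run(self, task_type: str, params: dict, node_id: Optional[str] = None):
        # 2. Accept a task; use the caller-specified node if given, otherwise pick one.
        node = self.nodes[node_id] if node_id else self._select_node(task_type)
        if node is None:
            raise RuntimeError(f"no compute node supports task type '{task_type}'")
        done, holder = threading.Event(), {}
        node.post_task(task_type, params, lambda result: (holder.update(result=result), done.set()))
        done.wait()                                    # WaitResult(): block until PostResult arrives
        return holder["result"]
```

In a real implementation the wait would be asynchronous and results would carry task ids, but the control flow matches the diagrams below.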

Flowchart

  1. Start up

    graph TB
    subgraph ComputeNode["ComputeNode(node_id, node_entry)"]
        InstallService["InstallService(type, service_entry)"]-->StartService["StartService(type, service_entry)"]-->ServiceList["Services{type, service_entry}"]
    end
    
    ServiceList-.->RegisterNode
    
    subgraph ComputeTaskManager
        RegisterNode["RegisterNode(node_id, node_entry)"]-->Nodes["Nodes{node_id, node_entry}, Services{type, node_id[]}"]
    end
  2. Execute task

    graph TB
    subgraph ComputeTaskManager
        RunTask["Run(type, params, [node])"]-->SpecifyNode{"if (node)"}
        PostTask["PostTask(type, params, node)"]
        SpecifyNode--yes-->PostTask
        SpecifyNode--No-->FilterNode["nodes=Services(type)"]-->NextNode["node = nodes.next()"]
        WaitResult["result=WaitResult()"]
    end
    
    NextNode-.->IsBusy-.yes.->NextNode
    IsBusy-.no.->PostTask
    PostTask-.->ExecuteTask
    
    subgraph "ComputeNode(Any)"
        IsBusy{"is busy"}
    end
    
    subgraph "ComputeNode(Selected)"
        ExecuteTask["result=Execute(type, params)"]-->PostResult["PostResult(result)"]
    end
    
    PostResult-.->WaitResult
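
Read together, the two diagrams correspond roughly to the following call sequence, expressed with the illustrative classes sketched above (the node id, address, and task type here are made up):

```python
# Start up: a node installs/starts its services, then registers with the manager.
manager = ComputeTaskManager()

node = ComputeNode(node_id="workstation-1", node_entry="tcp://192.168.1.10:9000")
node.install_service("text_generation", lambda params: f"echo: {params['prompt']}")
manager.register_node(node)

# Execute task: Run(type, params, [node]) either posts to the specified node, or
# filters nodes by service type, skips busy ones, posts, and waits for the result.
print(manager.run("text_generation", {"prompt": "hello"}))                        # node selected by manager
print(manager.run("text_generation", {"prompt": "hi"}, node_id="workstation-1"))  # node specified by caller
```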

I think we can first design a universal task scheduling framework, and then support various execution environments (e.g. Docker) and preset different task types within this framework.
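
One possible shape for that (purely a sketch; the TaskExecutor interface and the docker invocation below are assumptions, not existing OpenDAN code) is an abstract executor per execution environment, which a node installs per task type:

```python
import subprocess
from abc import ABC, abstractmethod


class TaskExecutor(ABC):
    """One implementation per execution environment (local process, Docker, cloud, ...)."""

    @abstractmethod
    def execute(self, params: dict) -> str: ...


class LocalExecutor(TaskExecutor):
    def execute(self, params: dict) -> str:
        # Run the task directly on the host; suitable for trusted, preset task types.
        return subprocess.run(params["cmd"], shell=True, capture_output=True, text=True).stdout


class DockerExecutor(TaskExecutor):
    def __init__(self, image: str):
        self.image = image  # an image bundling the runtime/model for this task type

    def execute(self, params: dict) -> str:
        # Isolate developer-customized task types inside a container.
        cmd = ["docker", "run", "--rm", self.image] + params.get("args", [])
        return subprocess.run(cmd, capture_output=True, text=True).stdout


# A node would then install one executor per task type, e.g.:
# node.install_service("stable_diffusion", DockerExecutor("sd-runner:latest").execute)
```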

waterflier commented 10 months ago

I am delighted to read your design and suggestions. They are insightful and show some understanding of the system. I am also very excited about the potential that your participation could bring to OpenDAN.

You can read https://github.com/fiatrete/OpenDAN-Personal-AI-OS/blob/MVP/doc/mvp/compute_task.drawio for more details on the compute kernel. I am writing an article about workflow now. The purpose of designing the compute_kernel subsystem is to enable our users to use their computational resources more efficiently. These computational resources can come from devices they own (such as their workstations and gaming laptops), as well as from cloud computing and decentralized computing networks.
