thehunmonkgroup closed this 9 months ago
Just looked at a few open source solutions, such as Apache Helix and Mesos, which may be of use. I think leveraging an existing solution rather than rolling your own might be the better approach. Docker or Kubernetes may also be useful if you abstract the layers into individual virtual nodes.
I like this, and I think it should be integrated into the OOB "security overlay", e.g. a "System Integrity" layer responsible for security, operational up status, and so on. That's my opinion; there are plenty of ways to implement it. I'll create an updated diagram to represent what I mean.
@thehunmonkgroup thanks for all your thoughts on this. I have revamped the "security" layer into the "system integrity" layer and added plenty of insights: https://github.com/daveshap/ACE_Framework/blob/main/ACE_Framework.md#system-integrity
Great discussions.
@rburgmann thanks for the suggestions!
You may want to have a look at https://github.com/daveshap/ACE_Framework/blob/main/agile.md -- knowing that will help guide your thinking in these earlier, MVP stages of the project. Right now, anything that's not Python, and/or not dead easy to install and set up is probably going to be avoided.
Long term, or for someone taking a run at a serious production ACE now, I definitely think a robust resource manager like the ones you mentioned is essential. My personal favorite is Corosync/Pacemaker.
We're not using the PITCH structure anymore, closing.
PITCH: Resource Manager for ACE Framework
Problem
The Autonomous Cognitive Entity (ACE) framework is a collection of resources designed to function together. However, the framework specification provides no specific implementation for how these resources are managed. This includes starting the ACE, monitoring the components for failure, recovering from failures, and stopping the ACE.
Appetite
The team is prepared to invest approximately two weeks and 20-30 total man hours to address this problem.
Solution
The proposed solution involves creating a 'Resource Manager', a long-running Python process. This process will treat ACE components as 'resources'. Each resource will be managed by a Python agent that operates in a manner similar to the Open Cluster Framework (OCF), specifically implementing its core 'start', 'stop', and 'monitor' methods. The operations are as follows:
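As a rough illustration of the idea above, here is a minimal sketch of what an OCF-style agent and a manager loop could look like in plain Python. Everything in it is an assumption made for illustration: the class names (`ResourceAgent`, `InMemoryAgent`, `ResourceManager`), the `check_once` and `run` methods, and the choice to have `monitor` return a boolean rather than OCF-style exit codes. It is not taken from the ACE codebase or from the OCF specification.

```python
import abc
import time


class ResourceAgent(abc.ABC):
    """Hypothetical OCF-like agent: one instance manages one ACE component."""

    def __init__(self, name):
        self.name = name

    @abc.abstractmethod
    def start(self):
        """Bring the resource up."""

    @abc.abstractmethod
    def stop(self):
        """Bring the resource down."""

    @abc.abstractmethod
    def monitor(self):
        """Return True if the resource is healthy (simplified from OCF exit codes)."""


class InMemoryAgent(ResourceAgent):
    """Toy agent that only tracks state in memory, standing in for a real component."""

    def __init__(self, name):
        super().__init__(name)
        self.running = False
        self.start_count = 0

    def start(self):
        self.running = True
        self.start_count += 1

    def stop(self):
        self.running = False

    def monitor(self):
        return self.running


class ResourceManager:
    """Starts, monitors, recovers, and stops a list of agents."""

    def __init__(self, agents):
        self.agents = list(agents)

    def start_all(self):
        for agent in self.agents:
            agent.start()

    def stop_all(self):
        # Stop in reverse order so later resources go down first.
        for agent in reversed(self.agents):
            agent.stop()

    def check_once(self):
        """Run one monitor pass, restart failed resources, return their names."""
        restarted = []
        for agent in self.agents:
            if not agent.monitor():
                agent.stop()   # make sure any leftover state is cleaned up
                agent.start()
                restarted.append(agent.name)
        return restarted

    def run(self, interval=5.0, cycles=None):
        """Long-running loop; `cycles=None` means run until interrupted."""
        self.start_all()
        try:
            count = 0
            while cycles is None or count < cycles:
                self.check_once()
                time.sleep(interval)
                count += 1
        finally:
            self.stop_all()
```

A real agent would replace `InMemoryAgent` with something that launches and probes an actual ACE component (a subprocess, a queue consumer, and so on). Keeping recovery in `check_once` as a simple stop-then-start keeps the sketch small; anything like backoff or restart limits is left out on purpose.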
Rabbit Holes
Given that the initial ACE is an MVP, the team should not be overly concerned with issues of scaling and performance, beyond those necessary for an individual user of the MVP to have a good experience.
No-gos
In order to maintain focus on the core problem and keep the solution manageable, the following aspects will not be included in this initial implementation: