"Architecture as Code" (AasC) aims to devise and manage software architecture via a machine readable and version-controlled codebase, fostering a robust understanding, efficient development, and seamless maintenance of complex software architectures
Develop CALM to capture those aspects of resiliency that are decided or influenced by architecture design choices.
Description of Problem:
Designing systems for resiliency is a complex endeavour.
While it is easy to find literature on resiliency techniques and considerations, there seem to be few practical ways of applying those considerations effectively to architecture designs.
Potential Solutions:
CALM offers the opportunity to capture and persist resiliency considerations as part of system architecture designs and their subsequent implementations. Over time this could develop into:
• A structured, practical, and scalable guide for resiliency design.
• Templated resiliency design options.
Leading to better resiliency capabilities:
• Compare & contrast different resiliency design choices.
• Development and identification of resiliency design patterns.
• Improved resiliency measures.
• Targeted resiliency testing.
Next Steps
Create a Framework to articulate Resiliency Requirements
Before a system can be declared "resilient", there needs to be an understanding of what the benchmark is - ideally expressed as a clear set of requirements that need to be met.
Here is an outline of a potential framework to capture resiliency requirements:
Definitions
Resiliency - The ability of a system to maintain an "acceptable level of service" in the face of adverse conditions.
Acceptable level of service - The level of service agreed with the stakeholders to whom the system provides services. It may show some degradation compared to desired performance but is still accepted as performing its function (e.g. the service is not perceived to have stopped or halted, to suffer downtime, or to require human intervention such as manual restore or recovery operations).
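One way to make this definition checkable is to express the agreed minimum as numeric thresholds and compare observed behaviour against them. The following is a minimal Python sketch under stated assumptions: the `ServiceLevel` type, the two metrics chosen, and all threshold values are hypothetical and would in practice be agreed with stakeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    """Hypothetical, simplified description of a level of service."""
    availability_pct: float  # share of successful requests, in percent
    p99_latency_ms: float    # 99th percentile latency, in milliseconds


def is_acceptable(observed: ServiceLevel, minimum: ServiceLevel) -> bool:
    """Return True if the observed service meets the agreed minimum."""
    return (
        observed.availability_pct >= minimum.availability_pct
        and observed.p99_latency_ms <= minimum.p99_latency_ms
    )


# Hypothetical stakeholder-agreed minimum (illustrative values only).
MINIMUM = ServiceLevel(availability_pct=99.0, p99_latency_ms=800.0)

# Degraded compared to desired performance, but still acceptable.
degraded = ServiceLevel(availability_pct=99.5, p99_latency_ms=650.0)

# Below the agreed minimum: the system is no longer "resilient" here.
failing = ServiceLevel(availability_pct=92.0, p99_latency_ms=650.0)

print(is_acceptable(degraded, MINIMUM))
print(is_acceptable(failing, MINIMUM))
```

The point of the sketch is only that "acceptable" becomes a comparison against agreed numbers rather than a judgement call; which metrics to include is a design decision for each system.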
Scope
System
System's end of any relationships (communications, dependencies)
Architecture of the System
Resiliency
Recovery
Taxonomy
System Rating - custom defined - e.g. 1, 2, 3
Requirement Applicability - custom defined - e.g. Must Have, Optional
Requirement Type - custom defined - e.g. Policy, Implementation
Requirement - custom defined.
Resiliency Requirements Framework (Example)
| Requirement Type | Requirement | System Rating 1 | System Rating 2 | System Rating 3 |
| --- | --- | --- | --- | --- |
| Policy | The system has a clear (stakeholder agreed) definition of the minimum acceptable level of service. | Must Have | Must Have | Optional |
| Implementation | The acceptable level of service definition is expressed quantifiably in terms of availability, latency, performance and integrity requirements. | Must Have | Must Have | Optional |
| Policy | The system has a clear (stakeholder agreed) definition of RPO. | Must Have | Must Have | Optional |
| Policy | The system has a clear (stakeholder agreed) definition of RTO. | Must Have | Must Have | Optional |
| Policy | The system is portable to run on different platforms and vendor services. | Must Have | Optional | Optional |
| Policy | The system maintains back-ups of all critical data points. | Must Have | Must Have | Must Have |
| Implementation | Data back-ups are taken every X hrs. | Must Have (X=2) | Must Have (X<8) | Must Have (X<400) |
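The example framework can also be held as data, so that the requirements applying to a given system rating can be looked up mechanically. The following is a minimal Python sketch and is not part of CALM: the class names, the `must_haves` helper and the shortened requirement texts are assumptions made for illustration, and only three of the example requirements are included.

```python
from dataclasses import dataclass
from enum import Enum


class RequirementType(Enum):
    POLICY = "Policy"
    IMPLEMENTATION = "Implementation"


class Applicability(Enum):
    MUST_HAVE = "Must Have"
    OPTIONAL = "Optional"


@dataclass(frozen=True)
class Requirement:
    req_type: RequirementType
    text: str
    # Applicability keyed by system rating (1, 2 or 3).
    applicability: dict[int, Applicability]


MH, OPT = Applicability.MUST_HAVE, Applicability.OPTIONAL

# Three requirements taken from the example framework (texts shortened).
FRAMEWORK = [
    Requirement(
        RequirementType.POLICY,
        "Clear definition of the minimum acceptable level of service",
        {1: MH, 2: MH, 3: OPT},
    ),
    Requirement(
        RequirementType.POLICY,
        "Portable to run on different platforms and vendor services",
        {1: MH, 2: OPT, 3: OPT},
    ),
    Requirement(
        RequirementType.POLICY,
        "Maintains back-ups of all critical data points",
        {1: MH, 2: MH, 3: MH},
    ),
]


def must_haves(rating: int) -> list[Requirement]:
    """Return the requirements that are 'Must Have' for a system rating."""
    return [r for r in FRAMEWORK if r.applicability[rating] is MH]


for rating in (1, 2, 3):
    print(rating, [r.text for r in must_haves(rating)])
```

Keeping the taxonomy values as enumerations mirrors the "custom defined" nature of the taxonomy: an organisation could swap in its own ratings or applicability levels without changing the lookup.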
Propose a standard set of Resiliency Requirements Definitions
This working group could/should propose standard resiliency requirement definitions to pick and choose from.