Open Bolodya1997 opened 3 years ago
From my point of view, there is a huge problem with healing: it is not fully covered by tests, so it can be difficult to rework the existing solution into a new one without introducing new hidden bugs.
We already have:
connect.Server
It is pretty clear that we need to rework both the Networkservice and Registry healing, because it is not good to use server subchains inside the client chain. But if we start this rework right now, we will still have the problem that there are no well-working tests covering it.
I suppose that it would be better to do all of this in the following way:
connect.Server - further it will be reworked with the connect.Client at the same time as the Networkservice healing.
@edwarnicke @denis-tingaikin Thoughts?
@Bolodya1997 I presume you are suggesting:
- Create a simple solution for the Client healing and cover it with tests - these tests will be reused in further heal client rework.
- Finish NS, NSE registry healing based on connect.Server - further it will be reworked with the connect.Client at the same time as the Networkservice healing.
- Increase coverage by sandbox tests, k8s tests.
For the 1.0 Release and
- Rework connect server + client.
- Rework heal server + client.
For post NSM 1.0?
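To make the "tests will be reused in further heal client rework" point concrete, here is a minimal sketch of a contract check written against an interface rather than against one implementation. Everything in it is hypothetical: `Conn`, `Healer`, `simpleHealer`, `brokenHealer` and `CheckHealContract` are illustrative names only, and the real heal client is a chain element with a different shape.

```go
package main

import (
	"errors"
	"fmt"
)

// Conn is a hypothetical stand-in for a connection that can break.
type Conn struct {
	ID      string
	Healthy bool
}

// Healer is a hypothetical interface used only for this sketch.
type Healer interface {
	// Heal tries to restore conn and reports failure as an error.
	Heal(conn *Conn) error
}

// simpleHealer plays the role of the "simple solution":
// it just marks the connection healthy again.
type simpleHealer struct{}

func (simpleHealer) Heal(conn *Conn) error {
	if conn == nil {
		return errors.New("nil connection")
	}
	conn.Healthy = true
	return nil
}

// brokenHealer is deliberately wrong; it shows that the contract
// catches an implementation that reports success without healing.
type brokenHealer struct{}

func (brokenHealer) Heal(_ *Conn) error { return nil }

// CheckHealContract holds the behaviour every Healer must satisfy.
// It depends only on the interface, so it can be run unchanged
// against a reworked implementation later.
func CheckHealContract(h Healer) error {
	conn := &Conn{ID: "conn-1", Healthy: false}
	if err := h.Heal(conn); err != nil {
		return fmt.Errorf("heal of broken connection failed: %w", err)
	}
	if !conn.Healthy {
		return errors.New("connection still unhealthy after Heal")
	}
	if err := h.Heal(nil); err == nil {
		return errors.New("Heal(nil) should return an error")
	}
	return nil
}

func main() {
	for name, h := range map[string]Healer{
		"simple": simpleHealer{},
		"broken": brokenHealer{},
	} {
		if err := CheckHealContract(h); err != nil {
			fmt.Printf("%s healer violates the contract: %v\n", name, err)
			continue
		}
		fmt.Printf("%s healer satisfies the contract\n", name)
	}
}
```

In a real test file the same check would be called from a `testing` test for each implementation, so the post-1.0 rework only has to plug in the new type.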
@Bolodya1997 One other thing. At this stage, start with system-level tests on K8s before you add more sandbox tests. It will help you uncover systemic issues. Those systemic issues may require changes that either:
@denis-tingaikin @edwarnicke I have updated the healing scheme, according to the last pre-release changes and ideas. Please feel free to share your thoughts and comment on this :)
@denis-tingaikin @edwarnicke A few schemes showing how it should work in the local case:
Update: There was a mistake in the old scheme: on receiving a monitor break, the heal client / heal server didn't close the gRPC connection. Fixed.
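The fixed ordering (monitor break, then close the connection, then heal) can be sketched roughly as below. This is only an illustration under assumptions: `closer`, `fakeConn` and `handleMonitorBreak` are hypothetical names, the monitor stream is modelled as a plain channel whose closing stands for the break, and the only gRPC fact relied on is that `*grpc.ClientConn` has a `Close() error` method.

```go
package main

import "fmt"

// closer is the small part of a client connection this sketch needs.
// *grpc.ClientConn satisfies it through its Close() error method.
type closer interface {
	Close() error
}

// fakeConn is a hypothetical connection that records whether it was closed.
type fakeConn struct{ closed bool }

func (c *fakeConn) Close() error {
	c.closed = true
	return nil
}

// handleMonitorBreak drains monitor events until the channel is closed
// (standing in for a monitor break), then closes the connection and
// only after that starts healing.
func handleMonitorBreak(monitorEvents <-chan string, cc closer, heal func() error) error {
	for range monitorEvents {
		// Regular monitor events are ignored in this sketch.
	}
	// The channel is closed: treat it as a monitor break.
	if err := cc.Close(); err != nil {
		return fmt.Errorf("close connection: %w", err)
	}
	return heal()
}

func main() {
	events := make(chan string, 1)
	events <- "update"
	close(events) // simulate the monitor stream breaking

	cc := &fakeConn{}
	err := handleMonitorBreak(events, cc, func() error {
		fmt.Println("healing, connection closed:", cc.closed)
		return nil
	})
	fmt.Println("error:", err)
}
```

Closing first means the heal step never reuses a connection whose monitor stream has already failed.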
Structure
1. Healing overview
There are 2 types of healing we will cover in this issue:
And 2 models of application we will cover in this issue:
2. Chains
2.1. Networkservice healing
We have different cases for networkservice healing:
2.1.1. Connect
2.1.2. Monitor
2.1.3. Heal
2.2. NS, NSE registry healing
We have different cases for NS, NSE registry healing:
2.2.1. Connect
2.2.2. Heal