openbaton / NFVO

Repository containing the source code of the NFVO
Apache License 2.0

Can I restart (respawn) a VNF/VM instance if it's not reachable? #175

Closed ashish235 closed 6 years ago

ashish235 commented 7 years ago

Hi Team,

As part of my fault management rules, I want to restart my application if the process dies, and if the VNF (VM) IP becomes unreachable, the VNF itself (the VM, not the application) should be restarted.

The issue here is that my EMS is inside my VM/VNF, and if that becomes unreachable I can't take any healing action from my VNFM. A feature or a way to invoke a soft/hard reboot of the instance would be a nice way to handle VM-level failures.

PS: I don't want an ACTIVE_STANDBY configuration here due to an application limitation.

lorenzotomasini commented 7 years ago

@flaviomu @ogozman isn't the EMS a service that starts at boot?

@ashish235 the VNFM receives the VIM instances where the VNFs are deployed, so it is possible to invoke a soft reboot this way.

flaviomu commented 7 years ago

Yes, the EMS should start at system boot.

lorenzotomasini commented 7 years ago

Hi @ashish235

If I understood correctly, you are using your EMS and your VNFM. Reading the issue again, I am not able to understand what you are referring to:

... I want to restart my application if the process dies and if the VNF (VM) Ip goes unreachable ...

You can implement this in your VNFM

The issue here is that my EMS is inside my VM/VNF, and if that becomes unreachable I can't take any healing action from my VNFM.

Then I suggest you either implement your solution (VNF and EMS) differently or invoke the soft reboot, as you suggested in the first part of the description.

If there are no further comments, I will close the issue.

ashish235 commented 7 years ago

@flaviomu, this is not about starting the EMS but about the application running inside the VNF, which I want to monitor.

@lorenzotomasini, I'm using the generic EMS and VNFM supplied with Openbaton. I don't intend to create my own VNFM but rather to use what's already available :).

Basically, your understanding is correct for the second part: I want to invoke a soft reboot of my VNF when a nodata() trigger fires on the Zabbix server. So here the FMS should call an API on the OpenStack VIM to do the soft reboot.
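As a rough illustration of that last step: the OpenStack Compute API reboots a server through `POST /servers/{server_id}/action` with a `reboot` body. The sketch below only builds that request in Python and does not send it. The endpoint URL, server ID, and token are placeholders, and `build_reboot_request` is a hypothetical helper, not part of Open Baton or its FMS.

```python
import json
import urllib.request


def build_reboot_request(compute_url, server_id, token, reboot_type="SOFT"):
    """Build (but do not send) a Compute API reboot request."""
    if reboot_type not in ("SOFT", "HARD"):
        raise ValueError("reboot_type must be 'SOFT' or 'HARD'")
    body = json.dumps({"reboot": {"type": reboot_type}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{compute_url.rstrip('/')}/servers/{server_id}/action",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )


# Placeholder values; a real caller would take these from the VIM instance.
req = build_reboot_request("http://controller:8774/v2.1", "SERVER_ID", "TOKEN")
# urllib.request.urlopen(req) would send it; that call is omitted here.
```

Passing `reboot_type="HARD"` would request a hard reboot instead, which may be the better choice when the guest OS itself is unresponsive.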

Regarding your suggestion to implement my solution (VNF and EMS) differently: my VNF is my own, but the EMS I'm using is yours. The VNFM is also yours, the generic VNFM.

lorenzotomasini commented 7 years ago

The easiest way to reach this goal, from my point of view, is to extend the GenericVNFM (if all its other features are enough for your purposes) so that it tries the heal first but, if the EMS is not reachable, invokes the soft reboot.
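The control flow described here, heal first and reboot only when the EMS cannot be reached, might look roughly like the following. This is a language-neutral sketch written in Python, not GenericVNFM code: `heal_via_ems` and `soft_reboot_vm` are hypothetical callables, and signalling an unreachable EMS with `ConnectionError` is an assumption made for the example.

```python
def heal_or_reboot(heal_via_ems, soft_reboot_vm):
    """Try the normal heal first; fall back to a VM soft reboot.

    Both arguments are hypothetical callables supplied by the caller.
    heal_via_ems is assumed to raise ConnectionError when the EMS
    cannot be reached.
    """
    try:
        heal_via_ems()
        return "healed"
    except ConnectionError:
        soft_reboot_vm()
        return "rebooted"


def unreachable_ems():
    # Simulates an EMS that does not answer.
    raise ConnectionError("EMS did not answer")


actions = []
result = heal_or_reboot(unreachable_ems, lambda: actions.append("soft-reboot"))
```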

I'm using the generic EMS and VNFM supplied with Openbaton.

How come the EMS is not reachable, but after a reboot it will be reachable again? Which scenario are you addressing?

ashish235 commented 7 years ago

@lorenzotomasini, I'm trying to simulate a scenario where my VM instance itself has died or become unresponsive, so I would like to take a healing action by restarting the VM instance.

To give you some background, I was trying Tacker (an official OpenStack project) as my generic VNFM, but because it lacks a lot of features I switched to Openbaton, and so far I'm impressed with the features and with how generic Openbaton is.

My monitoring plan for a VNF covers mainly two kinds of failure.

  1. Application down - Invoke the heal script inside the lifecycle.
  2. Host down or unreachable - I'm referring to the host being unreachable, not the EMS (although that also means the EMS is unreachable). In that case, respawn the host by talking directly to OpenStack rather than to the EMS; once the host has rebooted, the EMS is very likely to be reachable again. This worked quite well with Tacker, which pinged the host IP and restarted the VM once it became unreachable, using Heat to orchestrate the restart request. I'm not sure how I can achieve this with Openbaton.

You mentioned extending the VNFM features; is that the only way to do it?
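The two-case plan above can be written down as a small decision function. This is only a sketch of the intended policy: the `Action` names and the two boolean inputs are assumptions made for illustration, and how reachability and process state are actually measured (ping, Zabbix items, and so on) is left out.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"        # everything is healthy
    HEAL = "heal"        # run the heal lifecycle script through the EMS
    RESPAWN = "respawn"  # ask the VIM to restart the VM


def choose_action(host_reachable, app_running):
    """Map the two monitored conditions to a healing action."""
    # An unreachable host also means an unreachable EMS, so check it first.
    if not host_reachable:
        return Action.RESPAWN
    if not app_running:
        return Action.HEAL
    return Action.NONE
```

Checking the host before the application matters: when the host is down the application state is unknown, and a heal through the EMS could not be delivered anyway.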

Regards, Ashish

lorenzotomasini commented 7 years ago

We have a Fault Management System that addresses exactly these 2 issues.

The FMS recognises if the VNF service is down (using a monitoring system such as Zabbix) and triggers the heal through the NFVO API (this addresses issue 1).

The FMS is also able to recognise if the host is down, using a ping agent. In this case the action is different from what you are looking for: it swaps the running (unreachable) VM with another, preconfigured one by switching the state of the VNFComponent from standby to active. I believe this solution is the best one for limiting downtime, but it also "occupies" one VM (a virtual resource) that is not used unless a failure occurs.

If resources are a constraint for you, the only way to do a soft reboot is to implement this feature yourself. There are many ways to do that, depending on the use case and on how reusable the feature should be.

ashish235 commented 7 years ago

@lorenzotomasini, yes, resources are a constraint for me, as I cannot have VMs running just for failover. That is why I mentioned in my first post that ACTIVE_STANDBY sadly won't work for me. I need 6 active servers (separate VNFs), and to enable failover I would need to keep 6 standby servers, which is too much idle resource for my use case.

So I think I need to implement this feature myself. Can you advise on the best place to put it? The only option I can think of is extending the generic VNFM. I will fork and see.

Thanks.

marcellom commented 7 years ago

Following the ETSI specification, the management of virtual resources is a task of the VIM. So, in Open Baton the reboot function should be exposed by the VIM driver. Services (NFVO, VNFM, FMS, ...) could then trigger the reboot; for example, the FMS could trigger it upon an alarm. Another solution would be to perform the reboot from the VNFM, but if you wish to use the generic VNFM then it must be a generic solution (not specific to OpenStack). How to implement it is up to you, but if you want us to merge your feature, which would be great, please follow these guidelines.
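To make the layering concrete, here is a minimal sketch of a VIM-agnostic reboot operation that a service could call when an alarm arrives. It is written in Python for brevity and does not reflect Open Baton's actual driver interface: `VimDriver`, `reboot_server`, `RecordingDriver`, `on_alarm`, and the alarm fields are all hypothetical names.

```python
from abc import ABC, abstractmethod


class VimDriver(ABC):
    """Hypothetical VIM-agnostic driver interface."""

    @abstractmethod
    def reboot_server(self, server_id, hard=False):
        """Reboot a server; each VIM-specific driver implements this."""


class RecordingDriver(VimDriver):
    """Stand-in driver that records calls instead of contacting a VIM."""

    def __init__(self):
        self.calls = []

    def reboot_server(self, server_id, hard=False):
        self.calls.append((server_id, "HARD" if hard else "SOFT"))


def on_alarm(alarm, driver):
    """What an FMS-like service might do when an alarm fires."""
    if alarm["type"] == "HOST_UNREACHABLE":
        driver.reboot_server(alarm["server_id"])
        return True
    return False


driver = RecordingDriver()
handled = on_alarm({"type": "HOST_UNREACHABLE", "server_id": "vm-1"}, driver)
```

Because the caller depends only on the abstract interface, an OpenStack driver, or a driver for any other VIM, could be swapped in without changing the alarm-handling code.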

ashish235 commented 7 years ago

@marcellom, cool, thanks for the guidelines. I will study this a bit and see how it can be implemented in the most generic, reusable way.

Thanks.

gc4rella commented 6 years ago

Closing, as this is supported by the FMS.