wasmCloud / wasmcloud-otp

wasmCloud host runtime that leverages Elixir/OTP and Rust to provide simple, secure, distributed application development using the actor model
Apache License 2.0

[BUG] `wash up` ensure that the wasmcloud-host can be stopped gracefully #643

Closed Iceber closed 1 year ago

Iceber commented 1 year ago

Is your feature request related to a problem? Please describe.

I started two wasmCloud hosts with `wash up`, both connected to the same NATS server. When I stop one of the `wash up` processes (the one not running NATS) with ctrl-c, the stopped host's information still shows up in the washboard of the other host.

After some troubleshooting:

  1. I found that wasmcloud-host caches hosts, and updates and deletes them based on events: https://github.com/wasmCloud/wasmcloud-otp/blob/1f276d070d57366135485a0b367e9620eadf172e/wasmcloud_host/lib/wasmcloud_host/lattice/state_monitor.ex#L30-L33

  2. The event type that deletes a host is com.wasmcloud.lattice.host_stopped: https://github.com/wasmCloud/wasmcloud-otp/blob/1f276d070d57366135485a0b367e9620eadf172e/wasmcloud_host/lib/wasmcloud_host/lattice/state_monitor.ex#L325

  3. VirtualHost.publish_host_stopped is called to send the stop event only when the wasmCloud host application is stopped gracefully: https://github.com/wasmCloud/wasmcloud-otp/blob/1f276d070d57366135485a0b367e9620eadf172e/host_core/lib/host_core/vhost/virtual_host.ex#L435

  4. After receiving ctrl-c, `wash up` kills wasmcloud-host directly and does not stop it gracefully: https://github.com/wasmCloud/wash/blob/389a7023b9a6c584d27e2b48573f21e7b09c41ba/src/up/mod.rs#L638-L645
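
To make the failure mode in the four steps above concrete, here is a minimal Rust sketch of an event-driven host cache. It is not the actual `state_monitor.ex` logic: `HostCache` is a hypothetical stand-in, and only the `host_stopped` event type is taken from this issue; the `host_started` and `host_heartbeat` names are assumptions for illustration.

```rust
use std::collections::HashSet;

/// Event type named in this issue.
pub const HOST_STOPPED: &str = "com.wasmcloud.lattice.host_stopped";
/// Assumed event type names, used here only for illustration.
pub const HOST_STARTED: &str = "com.wasmcloud.lattice.host_started";
pub const HOST_HEARTBEAT: &str = "com.wasmcloud.lattice.host_heartbeat";

/// Hypothetical stand-in for the host cache kept by the state monitor.
#[derive(Default)]
pub struct HostCache {
    hosts: HashSet<String>,
}

impl HostCache {
    /// Apply one lattice event to the cache.
    pub fn apply(&mut self, event_type: &str, host_id: &str) {
        match event_type {
            // Any sign of life adds (or keeps) the host.
            HOST_STARTED | HOST_HEARTBEAT => {
                self.hosts.insert(host_id.to_string());
            }
            // This is the only path that removes a host.
            HOST_STOPPED => {
                self.hosts.remove(host_id);
            }
            // Other event types do not affect the host list.
            _ => {}
        }
    }

    /// True if the host is still listed.
    pub fn contains(&self, host_id: &str) -> bool {
        self.hosts.contains(host_id)
    }
}
```

With a cache like this, a host that is killed outright never emits `host_stopped`, so `contains` keeps returning true for it indefinitely, which matches the stale entry seen in the washboard.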

Describe the solution you'd like

I want `wash up` to gracefully stop wasmcloud-host when it receives ctrl-c, for example by sending it SIGTERM.

This allows the host to broadcast that it has stopped, and ensures that wasmCloud stops gracefully.
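
As a rough illustration of the proposal, the sketch below sends SIGTERM to a child process, waits for a grace period, and only then falls back to a hard kill. It is not the code in `wash`: `stop_gracefully` and the grace period are hypothetical, it uses only the Rust standard library, and it is Unix-only because it shells out to the `kill` utility.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Ask a child process to exit with SIGTERM, then fall back to a hard kill
/// if it has not exited within `grace`. Unix only: shells out to `kill`.
pub fn stop_gracefully(child: &mut Child, grace: Duration) -> io::Result<ExitStatus> {
    // SIGTERM lets the process run its own shutdown logic
    // (for a host, that would include announcing that it has stopped).
    Command::new("kill")
        .arg("-TERM")
        .arg(child.id().to_string())
        .status()?;

    // Poll until the child exits or the grace period runs out.
    let deadline = Instant::now() + grace;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(status);
        }
        if Instant::now() >= deadline {
            break;
        }
        thread::sleep(Duration::from_millis(50));
    }

    // Grace period elapsed: force termination. The result is ignored
    // because the child may have exited between the last poll and this call.
    let _ = child.kill();
    child.wait()
}
```

A caller would invoke something like `stop_gracefully(&mut host_process, Duration::from_secs(5))` from its ctrl-c handler instead of killing the child immediately. The right grace period depends on how long the host needs to shut down, which this issue does not specify.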

brooksmtownsend commented 1 year ago

@Iceber great catch here, this is in fact a bug in wasmcloud-otp where hosts are not removed from the washboard. Going to transfer that issue to that repository, for clarity.

For anyone looking to take on this issue, the host should be removed if:

  1. The host_stopped event is received
  2. The host hasn't been heard from in 2 or 3 heartbeat intervals
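
The two removal conditions above could be sketched as follows. This is a minimal Rust illustration, not the wasmcloud-otp implementation: `HostTracker`, its method names, and the interval values are all hypothetical, and time is passed in explicitly so the logic is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical tracker that remembers when each host was last heard from.
pub struct HostTracker {
    last_seen: HashMap<String, Instant>,
    heartbeat_interval: Duration,
    missed_allowed: u32,
}

impl HostTracker {
    pub fn new(heartbeat_interval: Duration, missed_allowed: u32) -> Self {
        Self {
            last_seen: HashMap::new(),
            heartbeat_interval,
            missed_allowed,
        }
    }

    /// Record that a host was heard from at `now`.
    pub fn record_heartbeat(&mut self, host_id: &str, now: Instant) {
        self.last_seen.insert(host_id.to_string(), now);
    }

    /// Condition 1: a host_stopped event removes the host immediately.
    pub fn record_stopped(&mut self, host_id: &str) {
        self.last_seen.remove(host_id);
    }

    /// Condition 2: drop hosts that have been silent for more than
    /// `missed_allowed` heartbeat intervals. Returns the removed ids, sorted.
    pub fn prune(&mut self, now: Instant) -> Vec<String> {
        let cutoff = self.heartbeat_interval * self.missed_allowed;
        let mut removed: Vec<String> = self
            .last_seen
            .iter()
            .filter(|(_, seen)| now.saturating_duration_since(**seen) > cutoff)
            .map(|(id, _)| id.clone())
            .collect();
        removed.sort();
        for id in &removed {
            self.last_seen.remove(id);
        }
        removed
    }

    /// Ids of the hosts currently considered alive, sorted.
    pub fn hosts(&self) -> Vec<String> {
        let mut ids: Vec<String> = self.last_seen.keys().cloned().collect();
        ids.sort();
        ids
    }
}
```

Calling `prune` on a timer (for example once per heartbeat interval) would clear out a host that was killed without sending `host_stopped`, at the cost of the entry lingering for a few intervals before it disappears.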

Iceber commented 1 year ago

> @Iceber great catch here, this is in fact a bug in wasmcloud-otp where hosts are not removed from the washboard. Going to transfer that issue to that repository, for clarity.
>
> For anyone looking to take on this issue, the host should be removed if:
>
>   1. The host_stopped event is received
>   2. The host hasn't been heard from in 2 or 3 heartbeat intervals

Seems reasonable. Can I fix this bug?

brooksmtownsend commented 1 year ago

@Iceber if you'd like to, that would be great! Let me know if I can help point in the right direction or give assistance, and feel free to open up a draft PR with a WIP

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this has been closed too eagerly, please feel free to tag a maintainer so we can keep working on the issue. Thank you for contributing to wasmCloud!