zephyriot / zephyr-issues

Supervisory and Monitoring Task #908

Open nashif opened 8 years ago

nashif commented 8 years ago

Reported by Tushar Patel:

If there are free-running threads, there should be a supervision task that monitors the state of each task. The concept is very similar to the Actor Model, and it has been used in many coherent and fault-tolerant systems such as [Erlang|https://www.erlang.org/] and [Akka|http://doc.akka.io/docs/akka/snapshot/general/supervision.html].

When a free-running task (or thread) detects a failure (i.e. throws an exception), it suspends itself and sends a message to its supervisor, signaling the failure. Depending on the nature of the failure, the supervisor can do one of the following:

1) resume the task, keeping its associated data intact
2) restart the task and clear its associated data
3) block the task permanently
4) escalate the failure to the master supervisor and stop itself
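
The four options above can be sketched in plain C as a small decision routine. This is only an illustration under stated assumptions, not a Zephyr API and not a proposed implementation: the failure classes (`FAIL_TRANSIENT` and so on), the `struct task` and `struct supervisor` layouts, the mapping from failure class to action, and every function name are hypothetical, since the description does not define any of them. Real code would suspend and resume actual kernel threads and pass the failure through a message queue rather than a direct function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical failure classes; the issue does not define any. */
enum failure_kind { FAIL_TRANSIENT, FAIL_CORRUPT_STATE, FAIL_FATAL, FAIL_UNKNOWN };

/* The four supervisor actions listed in the description. */
enum directive { DIR_RESUME, DIR_RESTART, DIR_BLOCK, DIR_ESCALATE };

enum task_state { TASK_RUNNING, TASK_SUSPENDED, TASK_BLOCKED };

struct task {
    enum task_state state;
    int data[4];        /* stands in for the task's "associated data" */
    unsigned restarts;  /* number of times the task has been restarted */
};

struct supervisor {
    bool stopped;          /* set once the supervisor has stopped itself */
    unsigned escalations;  /* failures handed to the master supervisor */
};

/* Task side: on failure the task suspends itself before notifying. */
void task_fail(struct task *t)
{
    assert(t != NULL);
    t->state = TASK_SUSPENDED;
}

/* Assumed policy: map a failure class to one of the four actions. */
enum directive decide(enum failure_kind kind)
{
    switch (kind) {
    case FAIL_TRANSIENT:     return DIR_RESUME;
    case FAIL_CORRUPT_STATE: return DIR_RESTART;
    case FAIL_FATAL:         return DIR_BLOCK;
    default:                 return DIR_ESCALATE;
    }
}

/* Supervisor side: apply the chosen action to a suspended task. */
enum directive supervise(struct supervisor *sup, struct task *t,
                         enum failure_kind kind)
{
    assert(sup != NULL && t != NULL);
    assert(t->state == TASK_SUSPENDED);

    enum directive d = decide(kind);

    switch (d) {
    case DIR_RESUME:    /* 1) resume, data left intact */
        t->state = TASK_RUNNING;
        break;
    case DIR_RESTART:   /* 2) restart, data cleared */
        memset(t->data, 0, sizeof t->data);
        t->restarts++;
        t->state = TASK_RUNNING;
        break;
    case DIR_BLOCK:     /* 3) block permanently */
        t->state = TASK_BLOCKED;
        break;
    case DIR_ESCALATE:  /* 4) hand over to the master supervisor and stop */
        sup->escalations++;
        sup->stopped = true;
        break;
    }
    return d;
}
```

A caller would mark the task as failed with `task_fail(&t)` and then call `supervise(&sup, &t, kind)`. In the escalation case the task is left suspended, on the assumption that the master supervisor decides what happens to it next.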

(Imported from Jira ZEP-965)

nashif commented 8 years ago

by Benjamin Walsh:

This is an enhancement: why is this set to the highest priority level? It should probably be medium, high at most.

nashif commented 8 years ago

by Benjamin Walsh:

Also, what do you mean by a "free" thread?

nashif commented 8 years ago

by Benjamin Walsh:

Also, this should have been created as a user story, not a task.

nashif commented 8 years ago

by Tushar Patel:

Sorry Benjamin, my mistake. I have updated it to a Story. I consider this a useful feature that could be added to the future roadmap.

nashif commented 8 years ago

by Benjamin Walsh:

Tushar Patel, no worries. This sounds like something that could easily be added to the error-catching mechanism.

Can you comment on what exactly you mean by a "free" thread?

nashif commented 8 years ago

by Tushar Patel:

Benjamin Walsh, basically a thread is a task that runs on its own and is scheduled by the scheduler. The scheduler cannot do anything if a task hits an exception, stops, or blocks forever, whereas a supervisor can take any of the actions listed in the description to handle the failure.

nashif commented 7 years ago

by Benjamin Walsh:

I'm still unsure how a "free thread" differs from just a "thread".

nashif commented 7 years ago

by Andy Ross:

FWIW: this sounds more like an application design paradigm than a kernel feature. There's nothing preventing an app from implementing something like this if it wants, but I'm not at all sold on the idea that Zephyr should enforce such a framework (which requires a whole extra thread to be allocated!) either.

Do we really intend to commit to delivering this as a kernel API in 1.8?

nashif commented 7 years ago

by Sharron LIU:

Moving back to "NEW" status for requirement clarification. Tushar Patel, could you respond to the above questions from Benjamin Walsh and Andy Ross? Thanks.