Open nashif opened 8 years ago
by Benjamin Walsh:
This is an enhancement, so why is it set to the highest priority level? It should probably be medium, high at most.
by Benjamin Walsh:
Also, what do you mean by a "free" thread?
by Benjamin Walsh:
Also, this should have been created as a user-story, not a task.
by Tushar Patel:
Sorry Benjamin, my mistake. I have updated it to a Story. I consider this a useful feature that could be added to the future roadmap.
by Benjamin Walsh:
Tushar Patel, no worries. This sounds like something that could easily be added to the error-catching mechanism.
Can you comment on what exactly you mean by a "free" thread?
by Tushar Patel:
Benjamin Walsh, basically a thread is a task that runs on its own and is scheduled by the scheduler. The scheduler cannot do anything if a task hits an exception, stops, or blocks forever, whereas a supervisor can take the actions listed in the description to deal with this failure.
by Benjamin Walsh:
Still unsure how a "free thread" differs from just "threads".
by Andy Ross:
FWIW: this sounds more like an application design paradigm than a kernel feature. There's nothing preventing an app from implementing something like this if it wants, but I'm not at all sold on the idea that Zephyr should enforce such a framework (which requires a whole extra thread to be allocated!) either.
Do we really intend to commit to delivering this as a kernel API in 1.8?
by Sharron LIU:
Moving back to "NEW" status for requirement clarification. Tushar Patel, could you respond to the above questions from Benjamin Walsh and Andy Ross? Thanks.
Reported by Tushar Patel:
If free threads are running, there should be a supervision task that monitors the state of each task. The concept is very similar to the Actor Model, and it has been used in many coherent and fault-tolerant systems like [Erlang|https://www.erlang.org/] and [Akka|http://doc.akka.io/docs/akka/snapshot/general/supervision.html].
When a free-running task (or thread) detects a failure (i.e. throws an exception), it suspends itself and sends a message to its supervisor, signaling failure. Depending on the nature of the failure, the supervisor can do one of the following:
1) resume the task, keeping its associated data intact
2) restart the task and clear its associated data
3) block the task permanently
4) escalate the failure to the master supervisor and stop itself
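
To make the four actions concrete, here is a minimal sketch in plain C. It is not a Zephyr API and not a proposed implementation: every type and function name is hypothetical, the mapping from failure kind to action is an assumed policy, and the "message" to the supervisor is modeled as a direct function call rather than a real kernel message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; none of them are Zephyr kernel APIs. */
enum task_state { TASK_RUNNING, TASK_SUSPENDED, TASK_BLOCKED };

enum failure_kind { FAILURE_TRANSIENT, FAILURE_BAD_DATA, FAILURE_FATAL, FAILURE_UNKNOWN };

enum directive { DIRECTIVE_RESUME, DIRECTIVE_RESTART, DIRECTIVE_BLOCK, DIRECTIVE_ESCALATE };

struct task {
    enum task_state state;
    int data[4];                 /* stands in for the task's "associated data" */
};

struct supervisor {
    struct supervisor *master;   /* NULL for the master supervisor itself */
    bool stopped;
    bool has_escalation;         /* set when a child supervisor escalates */
    enum failure_kind escalation;
};

/* Assumed policy: which failure leads to which action is illustrative only. */
enum directive choose_directive(enum failure_kind kind)
{
    switch (kind) {
    case FAILURE_TRANSIENT: return DIRECTIVE_RESUME;
    case FAILURE_BAD_DATA:  return DIRECTIVE_RESTART;
    case FAILURE_FATAL:     return DIRECTIVE_BLOCK;
    default:                return DIRECTIVE_ESCALATE;
    }
}

/* Supervisor side: apply one of the four actions to a suspended task. */
enum directive supervisor_on_failure(struct supervisor *sup, struct task *t,
                                     enum failure_kind kind)
{
    assert(sup != NULL && t != NULL);
    enum directive d = choose_directive(kind);

    switch (d) {
    case DIRECTIVE_RESUME:       /* 1) resume, data left intact */
        t->state = TASK_RUNNING;
        break;
    case DIRECTIVE_RESTART:      /* 2) restart, data cleared */
        memset(t->data, 0, sizeof(t->data));
        t->state = TASK_RUNNING;
        break;
    case DIRECTIVE_BLOCK:        /* 3) block permanently */
        t->state = TASK_BLOCKED;
        break;
    case DIRECTIVE_ESCALATE:     /* 4) hand over to the master, then stop */
        if (sup->master != NULL) {
            sup->master->has_escalation = true;
            sup->master->escalation = kind;
        }
        sup->stopped = true;     /* the task stays suspended */
        break;
    }
    return d;
}

/* Task side: suspend itself, then signal the failure to its supervisor. */
enum directive task_report_failure(struct task *t, struct supervisor *sup,
                                   enum failure_kind kind)
{
    t->state = TASK_SUSPENDED;
    return supervisor_on_failure(sup, t, kind);
}
```

In a real system the supervisor would run in its own thread and receive failures through some messaging primitive, which is the extra thread cost raised in the comments; the sketch only shows the decision logic.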
(Imported from Jira ZEP-965)