What if the problem losing weight could be cast as a problem of finding the best policy? Given the context (day eating habits, moving etc), the policy (to be learned) should output an action (what to do in that day). Am I correctly understanding the learning problem? This would be fun to explore.
What if the problem losing weight could be cast as a problem of finding the best policy? Given the context (day eating habits, moving etc), the policy (to be learned) should output an action (what to do in that day). Am I correctly understanding the learning problem? This would be fun to explore.