MDP of tutorial cards - Githubissues

Firstly, let's recall what's the MDP: A reinforcement learning task that satisfies the Markov property is called a Markov decision process, or MDP. If the state and action spaces are finite, then it is called a finite Markov decision process (finite MDP).

In Wayne's thesis:
- Goal: performance of the learner
- Agent: the control system or the app
- Action: easiness: E = f(E', q)
- Reward: the extent of change of grade
- State: lapse: forget:
  - acq_reps_since_lapse: repetion times of acquisition since lapse
  - ret_reps_since_lapse:
  - scheduled_interval:
  - actual_interval:

FrancisLeon / Reinforement-Learning-

MDP of tutorial cards #4