ariasanovsky / azdopt

An implementation of Alpha Zero for discrete optimization problems.

Terminology should use the language of `Deterministic Finite Automata` and `Semiautomata` #75

Open ariasanovsky opened 10 months ago

ariasanovsky commented 10 months ago

Markov Decision Process

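A Markov Decision Process is a 4-tuple $(\mathcal{S}, \mathcal{A}, \mathcal{P}, \mathcal{R})$ consisting of a state space $\mathcal{S}$, an action space $\mathcal{A}$, transition probabilities $\mathcal{P}_a(s, s')$, and a reward function $\mathcal{R}$.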

We don't quite need the generality of an MDP because our transition probabilities are deterministic. I.e., $\mathcal{P}_a(s, \cdot) = \delta_{s'}$ for some $s'\in\mathcal{S}$.
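
To make the point-mass statement concrete, here is a minimal Rust sketch. The counter-modulo-5 state space and the names `step` and `transition_probability` are hypothetical and chosen only for illustration; they are not taken from azdopt.

```rust
/// Hypothetical toy state: a counter modulo 5.
type State = u8;
/// Hypothetical toy action: an amount to add to the counter.
type Action = u8;

const MODULUS: u8 = 5;

/// Deterministic transition: each (state, action) pair has exactly one successor.
fn step(s: State, a: Action) -> State {
    (s % MODULUS + a % MODULUS) % MODULUS
}

/// The MDP-style kernel P_a(s, s') induced by `step`:
/// a point mass at the unique successor `step(s, a)`.
fn transition_probability(s: State, a: Action, s_next: State) -> f64 {
    if step(s, a) == s_next {
        1.0
    } else {
        0.0
    }
}
```

For any fixed `s` and `a`, summing `transition_probability(s, a, t)` over all `t` in `0..MODULUS` gives exactly `1.0`, so the kernel carries no more information than `step` itself.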

Deterministic Finite Automaton

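A Deterministic Finite Automaton consists of a finite set of states $\mathcal{Q}$, a finite input alphabet $\Sigma$, a transition function $T:\mathcal{Q}\times\Sigma\to\mathcal{Q}$, an initial state $q_0\in\mathcal{Q}$, and a set of accepting states $F\subseteq\mathcal{Q}$.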

This is closer to our formulation. Note that:

Every MCTS run shares the same transition dynamics, which are described by a common semiautomaton.

Semiautomaton

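A Semiautomaton has a set of states $\mathcal{Q}$, an input alphabet $\Sigma$, and a transition function $T:\mathcal{Q}\times\Sigma\to\mathcal{Q}$. Unlike a DFA, it has no initial state and no accepting states.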
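
As a rough illustration of how this terminology could map onto code, here is a sketch of a semiautomaton as a Rust trait. The trait name, its associated types, and the `BitFlips` example are all assumptions made for this sketch, not azdopt's actual API.

```rust
/// A semiautomaton: states, an input alphabet, and a total transition function.
/// There is no initial state and no set of accepting states.
trait Semiautomaton {
    type State;
    type Symbol;

    /// The transition function T: Q x Sigma -> Q.
    fn transition(&self, state: &Self::State, symbol: &Self::Symbol) -> Self::State;

    /// Extend T to words by folding over the symbols from left to right.
    fn run(&self, start: Self::State, word: &[Self::Symbol]) -> Self::State {
        word.iter().fold(start, |q, x| self.transition(&q, x))
    }
}

/// Hypothetical example: states are bitmasks, and each symbol toggles one bit.
/// `width` is assumed to be between 1 and 32.
struct BitFlips {
    width: u32,
}

impl Semiautomaton for BitFlips {
    type State = u32;
    type Symbol = u32;

    fn transition(&self, state: &u32, symbol: &u32) -> u32 {
        // Toggle the bit selected by the symbol, wrapping around at `width`.
        *state ^ (1u32 << (*symbol % self.width))
    }
}
```

With `BitFlips { width: 4 }`, calling `run(0b0000, &[0, 1, 0])` toggles bit 0, then bit 1, then bit 0 again, ending in state `0b0010`. Because the trait fixes no initial state, the same value can serve many searches that start from different roots.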

As before, we may accommodate partial transition functions $T:\mathcal{Q}\times \Sigma\to\mathcal{Q}$ with the canonical monadic extension.
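
One possible reading of the monadic extension, sketched in Rust under the assumption that the monad in question is `Option`: a partial transition returns `None` where it is undefined, and running a word threads the state through `Option` so that the first undefined step makes the whole run undefined. The four-state example and the function names are hypothetical.

```rust
/// Hypothetical partial transition on states 0..=3: the symbol `true` moves up
/// by one, `false` moves down by one, and any move that would leave the range
/// is undefined (`None`).
fn partial_transition(state: u8, up: bool) -> Option<u8> {
    match (state, up) {
        (s, true) if s < 3 => Some(s + 1),
        (s, false) if s > 0 && s <= 3 => Some(s - 1),
        _ => None,
    }
}

/// Monadic extension to words: thread the state through `Option`,
/// stopping at the first undefined transition.
fn run_partial(start: u8, word: &[bool]) -> Option<u8> {
    word.iter().try_fold(start, |q, &x| partial_transition(q, x))
}
```

Here `run_partial(0, &[true, true, false])` is `Some(1)`, while `run_partial(0, &[true, false, false])` is `None` because the last step tries to move below state 0.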