TikhonJelvis / RL-book


Check formula for calculating the new state in Process1 #313

Open bhardwajshivam opened 11 months ago

bhardwajshivam commented 11 months ago

The code mentions: return Process1.State(price=state.price + up_move * 2 - 1), whereas it's a logistic function of $(L - X_t)$. Why is up_move multiplied by 2 and then reduced by 1?

AI-Ahmed commented 10 months ago

Hello there @bhardwajshivam. $\mathbb{P}[X_{t+1} = X_t + 1]$ is the probability of moving up, and $\mathbb{P}[X_{t+1} = X_t - 1]$ is the probability of moving down. Hence, $\mathbb{P}[X_{t+1} = X_t - 1] = 1 - \mathbb{P}[X_{t+1} = X_t + 1]$ by the complement rule.

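If you rearrange the formula, you get $\mathbb{P}[X_{t+1} = X_t + 1] + \mathbb{P}[X_{t+1} = X_t - 1] = 1$, so the expected one-step change is $\mathbb{P}[X_{t+1} = X_t + 1] - \mathbb{P}[X_{t+1} = X_t - 1] = 2\mathbb{P}[X_{t+1} = X_t + 1] - 1$. The code applies the same $2x - 1$ map to the sampled move rather than to the probability: assuming up_move is a 0/1 draw that equals 1 with the up-move probability, up_move * 2 - 1 is $+1$ for an up move and $-1$ for a down move.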

This is similar to the code you mentioned:

Process1.State(price=state.price + 2 * up_move - 1)
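
To see the mapping concretely, here is a minimal standalone sketch, not the repository's code. It assumes up_move is a 0/1 Bernoulli draw whose success probability is a logistic function of $(L - X_t)$, as the question describes. The names up_prob and next_price, the steepness parameter alpha, and its default value are hypothetical choices made for illustration.

```python
import math
import random


def up_prob(price: int, level: int, alpha: float = 0.25) -> float:
    """Probability of an up move: logistic function of (level - price).

    alpha is an assumed steepness parameter, chosen only for this sketch.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (level - price)))


def next_price(price: int, level: int, rng: random.Random) -> int:
    """Move the price up or down by exactly 1."""
    # Bernoulli draw: up_move is 1 with probability up_prob, otherwise 0.
    up_move = 1 if rng.random() < up_prob(price, level) else 0
    # 2 * up_move - 1 maps 0 -> -1 (down move) and 1 -> +1 (up move).
    return price + 2 * up_move - 1


rng = random.Random(0)
# One-step changes starting from price == level, where up_prob is 0.5.
steps = [next_price(100, 100, rng) - 100 for _ in range(10_000)]
mean_step = sum(steps) / len(steps)
```

Every entry of steps is either -1 or +1, and mean_step estimates the expected change $2\mathbb{P}[X_{t+1} = X_t + 1] - 1$, which is 0 when the price sits at the level $L$.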