awjuliani / successor_examples

Tutorials on learning and using successor representations.
MIT License

SxS vs SAxS ? #3

Closed: vsraptor closed this issue 4 years ago

vsraptor commented 4 years ago

From what I understand, you use a StateAction x State map? But you also have a third dimension. Is it SxAxS or SxSxA? What is the representation?

Can you elaborate on how you manage and update the SAxS map? How does the SAS scenario work?

PS: From the article, it seemed to be about an SxS map.

awjuliani commented 4 years ago

Hi @vsraptor

The map is AxSxS. It is updated the same way an SxS map would be, except that the SxS map to update is selected by the agent's chosen action.
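For readers following along, here is a minimal sketch of that kind of update. It assumes a tabular map `M` of shape `(A, S, S)` and bootstrapping from the next action. The function name `update_sr`, the parameters `gamma` and `lr`, and the toy sizes are illustrative assumptions, not code taken from the repository, and terminal transitions are left out for brevity.

```python
import numpy as np

def update_sr(M, s, a, s_next, a_next, gamma=0.95, lr=0.1):
    """One TD update of an action-conditioned successor map M[a, s, s'].

    Illustrative sketch only: names and defaults are assumptions.
    """
    n_states = M.shape[1]
    I = np.eye(n_states)[s]                       # one-hot vector for the current state
    target = I + gamma * M[a_next, s_next, :]     # bootstrapped occupancy target
    td_error = target - M[a, s, :]                # vector of length S
    M[a, s, :] += lr * td_error                   # only the slice for action a changes
    return td_error

n_actions, n_states = 4, 5
M = np.zeros((n_actions, n_states, n_states))
td = update_sr(M, s=0, a=2, s_next=1, a_next=3)
```

The first index picks the SxS map for the chosen action, the second picks the row for the current state, and the last axis runs over all successor states.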

vsraptor commented 4 years ago

Thanks. Hmm, I got confused because S is the first argument but the second index ;\

    experiences.append([state, action, state_next, reward, done])
    s = current_exp[0]
    s_a = current_exp[1]
    s_1 = current_exp[2]
    td_error = (I + self.gamma * self.M[s_a_1, s_1, :] - self.M[s_a, s, :])

Why do you multiply the whole Z axis? Shouldn't you do a single cell? Or does it have something to do with the observation, which is probably a vector?
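One way to look at the whole-axis question: in a tabular successor representation, the row `M[a, s, :]` holds one expected discounted occupancy per successor state, so the TD error is a vector of length S and not a single number. The sketch below, with made-up sizes and hypothetical variable names, compares the sliced update with an explicit cell-by-cell loop over the last axis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, n_states, gamma, lr = 2, 4, 0.9, 0.5   # toy values, chosen arbitrarily
M = rng.random((n_actions, n_states, n_states))
s, a, s_next, a_next = 1, 0, 2, 1

# Vectorized: update the whole row M[a, s, :] at once
I = np.eye(n_states)[s]
M_vec = M.copy()
M_vec[a, s, :] += lr * (I + gamma * M[a_next, s_next, :] - M[a, s, :])

# Cell by cell: one scalar TD error per successor state j
M_loop = M.copy()
for j in range(n_states):
    indicator = 1.0 if j == s else 0.0
    td_j = indicator + gamma * M[a_next, s_next, j] - M[a, s, j]
    M_loop[a, s, j] += lr * td_j

same = np.allclose(M_vec, M_loop)
```

Under these assumptions the two versions give the same result, so the slice over the last axis is a compact way of doing S single-cell updates in one step.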