awjuliani / successor_examples

Tutorials on learning and using successor representations.
MIT License
50 stars 14 forks

Any hints on Deep Successor Representation (DSR) #5

Open SaifAlDilaimi opened 3 years ago

SaifAlDilaimi commented 3 years ago

Hey Arthur, I hope you're doing fine! I'm currently implementing a TensorFlow version of the paper "Deep Successor Reinforcement Learning" for my master's thesis. Somehow the learning is really unstable, even on a simple 3D gridworld environment. Have you worked with that paper?

awjuliani commented 3 years ago

Hi @SaifAlDilaimi,

Unfortunately, I have not worked with that architecture before, although I am familiar with it. When it comes to deep reinforcement learning models, many of the theoretical guarantees regarding convergence are no longer applicable. As such, I am not sure what you should expect from that model on your task. One thing I would recommend is to look at other open-source implementations of the same model and see if there are any significant discrepancies between what you are doing and what they did.

Sorry I can't be of more help.

SaifAlDilaimi commented 3 years ago

Hi Arthur,

thank you for answering! Really appreciate it!

I have looked at the few implementations that refer to it. All of them use a DQN that is solely responsible for the Q-values. However, in the mentioned paper, they compute the Q-values by taking the dot product of the successor features and the reward weight vector. They also compute a target action a' in the loss function, which makes no sense to me.
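For reference, here is a minimal NumPy sketch of the computation described above, as I read it from the paper. It is not the paper's code or any library's API. The function names, array shapes, and terminal-state handling are assumptions made for illustration. In this reading, Q(s, a) is the dot product of the successor features for (s, a) with the reward weights w, and a' is the greedy action in the next state, used to pick which successor features go into the bootstrap target.

```python
import numpy as np

def q_values(successor_features, reward_weights):
    """Q(s, a) for all actions.

    successor_features: array of shape (n_actions, feature_dim), one row per action.
    reward_weights: array of shape (feature_dim,), the learned reward vector w.
    """
    return successor_features @ reward_weights

def sr_target(phi_s, next_successor_features, reward_weights, gamma=0.99, done=False):
    """TD target for the successor features of the action taken in state s.

    phi_s: state features of s, shape (feature_dim,).
    next_successor_features: successor features of s' for every action,
        shape (n_actions, feature_dim), typically from a target network.
    """
    if done:
        # Assumption: no bootstrapping past a terminal state.
        return phi_s
    # a' is the greedy action in s' under the current Q estimate.
    a_prime = int(np.argmax(q_values(next_successor_features, reward_weights)))
    return phi_s + gamma * next_successor_features[a_prime]

# Toy usage with hypothetical sizes: 4 actions, 8-dimensional features.
rng = np.random.default_rng(0)
w = rng.normal(size=8)
phi_s = rng.normal(size=8)
m_next = rng.normal(size=(4, 8))
target = sr_target(phi_s, m_next, w, gamma=0.9)
```

Under this reading, a' plays the same role as the max over next actions in a DQN target: the successor-feature branch regresses toward `target`, while w is fit separately by regressing the dot product of the state features and w onto the observed reward.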

Maybe it's too much to ask, but would you mind taking a few minutes to look at my implementation? You can find it in the following GitHub repo:

https://github.com/SaifAlDilaimi/OpenAIGym-DSR/blob/main/dsr_example.py

Any help, no matter how small, is greatly appreciated.

PS: No worries if that's too much. Thank you anyway!