functionsoft opened this issue 9 years ago
Hi, both of those agents are using the same algorithm: DQN, but yes the implementation is different on the level of details. I'd use the REINFORCEjs one, it's more recent and complete.
On Fri, Oct 23, 2015 at 9:29 AM, functionsoft notifications@github.com wrote:
Hi,
I'm looking at http://cs.stanford.edu/people/karpathy/convnetjs/demo/rldemo.html
and comparing the agent there with the one at
http://cs.stanford.edu/people/karpathy/reinforcejs/waterworld.html
They are acting in very similar environments, but have different AI implementations.
My question is: which is the more advanced and complete AI agent of the two versions?
What are the differences in the neural network implementations, and which is the more intelligent agent?
Thanks,
Mike
— Reply to this email directly or view it on GitHub https://github.com/karpathy/reinforcejs/issues/8#issuecomment-150626036.
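[Editor's note: since both demos implement DQN, the underlying update rule may be worth spelling out. The sketch below is a tabular Q-learning update for illustration only; both libraries approximate Q with a small neural net trained against this same target, so this shows the update rule, not the actual library code.]

```javascript
// One Q-learning update on a single transition (s0, a0, r0, s1):
// move Q[s0][a0] toward the target r0 + gamma * max_a Q[s1][a].
// Tabular here for brevity; DQN replaces the table with a network.
function qUpdate(Q, s0, a0, r0, s1, gamma, alpha) {
  const target = r0 + gamma * Math.max(...Q[s1]);
  Q[s0][a0] += alpha * (target - Q[s0][a0]);
  return Q;
}

// Tiny example: 2 states, 2 actions.
const Q = [[0, 0], [1, 2]];
qUpdate(Q, 0, 0, 1.0, 1, 0.9, 0.5);
// Q[0][0] moves halfway toward 1.0 + 0.9 * 2 = 2.8.
```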
Hi,
Thanks for getting back to me. I’m glad you said that, because that’s the library I chose out of the two to work with and understand.
In the learn function of the DQNAgent there is a comment about priority sweeps in the replay memory. How could this be simply implemented with the current code? I assume it involves marking the experience memory with some value that represents good versus bad experience, so that the best memories are replayed?
Also, what type of neural network is implemented in this agent? Is it a simple multilayer perceptron? Would the agent benefit from more hidden layers?
Any ideas or suggestions greatly appreciated.
Thanks and Regards,
Mike
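[Editor's note: on the network question, the REINFORCEjs DQN agent uses a simple one-hidden-layer perceptron (tanh nonlinearity) mapping the state vector to one Q-value per action. The forward pass can be sketched self-contained as below; the weights and layer sizes here are illustrative placeholders, not values from the library.]

```javascript
// Forward pass of a one-hidden-layer Q-network:
// hidden = tanh(W1 * state + b1), q = W2 * hidden + b2.
function qForward(state, W1, b1, W2, b2) {
  const hidden = W1.map((row, i) =>
    Math.tanh(row.reduce((s, w, j) => s + w * state[j], b1[i]))
  );
  return W2.map((row, k) =>
    row.reduce((s, w, i) => s + w * hidden[i], b2[k])
  );
}

// Tiny example: 2-d state, 3 hidden units, 2 actions.
const W1 = [[0.1, -0.2], [0.3, 0.0], [-0.1, 0.2]];
const b1 = [0, 0, 0];
const W2 = [[0.5, -0.5, 0.1], [0.2, 0.4, -0.3]];
const b2 = [0.01, -0.01];

const q = qForward([1.0, 0.5], W1, b1, W2, b2);
// The greedy policy picks the index of the largest Q-value.
const action = q.indexOf(Math.max(...q));
```

Adding more hidden layers would mean extending this forward pass (and its backward pass) with extra weight matrices; whether it helps depends on the complexity of the state-to-value mapping.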
priority sweeps, how could this be simply implemented with the current code? I assume it involves marking the experience memory with some value that represents good experience vs bad experience? So that the best memories are played back?
I've seen it done that way in a paper somewhere (can't find it); they added an extra property to the experience objects with a value that was then used to prune experiences.
Hi mryellow, I am very interested in the prioritized-sweeping-with-experience-replay paper you mention. Can you recall anything related to it that I can use to google it?
Not sure I have it saved here; I think it may have been an incomplete draft, and not that interesting otherwise.
They were using ReinforceJS and had modified this bit https://github.com/karpathy/reinforcejs/blob/0b9315a69c55f7d66a9d3839a0a90dd067be45db/lib/rl.js#L1091 to include some kind of crude threshold on a score. I believe it was effectively only looking for actions with a non-zero reward.
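[Editor's note: a minimal sketch of that kind of modification, assuming the experience tuples are tagged with an extra score (here the absolute reward) that replay then thresholds on. All names here are illustrative, not the actual ReinforceJS code or the paper's.]

```javascript
// Store an experience tuple with an appended score; the choice of
// |reward| as the score is an assumption, matching the observation
// that the modification mostly surfaced non-zero-reward actions.
function storeExperience(exp, s0, a0, r0, s1, a1, maxSize) {
  exp.push([s0, a0, r0, s1, a1, Math.abs(r0)]);
  if (exp.length > maxSize) exp.shift(); // drop the oldest tuple
  return exp;
}

// Crude prioritized replay: sample only tuples whose score clears
// the threshold, falling back to the full memory if none do.
function sampleForReplay(exp, threshold) {
  const interesting = exp.filter(e => e[5] >= threshold);
  const pool = interesting.length > 0 ? interesting : exp;
  return pool[Math.floor(Math.random() * pool.length)];
}
```

A fuller prioritized-replay scheme would sample in proportion to TD error rather than hard-thresholding, but the filter above matches the "crude threshold on a score" described.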
One bit that sticks in my head: they were using a Greek letter, rho or psi or something, and the in-line comments showed it properly encoded rather than as LaTeX or a simple substitute character.
google: "this.learnFromTuple(e[0], e[1], e[2], e[3], e[4], e[5])"
On Learning Coordination Among Soccer Agents
http://robocup.csu.edu.cn/web/wp-content/uploads/2012/12/data/pdfs/robio12-116.pdf
Hang on, it's the only result, but it's not it, although I've seen this paper before... and I don't think it passed in the score, but checked it before firing learnFromTuple.
... So that's a wild goose chase, sorry for the noise.
Thanks, mryellow
There is a new paper on deep reinforcement learning in continuous action spaces by DeepMind: "Continuous control with deep reinforcement learning". Are there plans to add this in code form? Many thanks, Andrew.
I'm also curious about DeepMind's learnings :D