Open smilesun opened 6 years ago
Hi, not sure why there is no feedback. I think we have the following aims
Based on those goals
we should push the package to CRAN as soon as possible, since it always makes sense to be the first one, even if we cannot make it perfect. I think it would be a big loss for us if it is not the first R package that can do RL. And the previous version from Markus's Master's thesis is already good enough to push to CRAN.
We need to use at least the design Markus presented last week for the publication; that way I am confident there would not be too much criticism.
For future use, I would prefer something where I could affect the design of the software, but this has nothing to do with our current goal, and I think it should not slow down the first two goals, right? Maybe at some point I could merge my branch into Markus's branch, or we just keep it as it is.
I need some feedback from you two.
@berndbischl @markdumke
Good points! I thought about merging the stuff from the thesis over into the new design I presented to you and then pushing to CRAN. I probably need about two weeks to do this (because I cannot work full-time on this). But of course we can also push the master thesis version to CRAN right now.
Waiting a few weeks is OK, but we should really target January for a first upload.
Thanks to both of you for the replies :) I am not sure if the following is a good idea.
Basically my point is that maybe we could produce the above-mentioned outputs (CRAN, OO design, benchmark, paper) stagewise, so we are safe to deliver them in time. Otherwise it may become too complex for us to manage this project, and we might be disappointed at the end if it lasts too long.
@markdumke @berndbischl, what do you think? This way we could join forces and not waste time doing repetitive work.
I think it's a good plan. The master thesis version works, so I can push it to CRAN anytime, that is no problem.
And the user API could stay mostly the same even though the internals become object-oriented.
I think there will also be a lot of changes to the user API. But that doesn't matter too much if it is an improvement in usability.
So should I push to CRAN now?
@berndbischl, what do you think?
@smilesun @berndbischl I have an improved draft for the user interface of the first version online in the reinforcelearn work branch. I would suggest that I finish it this week with at least some basic functionality (Q-learning with or without experience replay, table and Keras neural network) and then push this to CRAN as version 0.1.0.
For the second version we can then include a better internal OO structure based on Xudong's ideas and add more algorithms etc., but we probably won't have to change too much of the user interface.
# qlearning eligibility traces
env = makeEnvironment("windy.gridworld")
val = makeValueFunction("table", n.states = env$n.states, n.actions = env$n.actions)
policy = makePolicy("epsilon.greedy", epsilon = 0.1)
alg = makeAlgorithm("qlearning", lambda = 0.9, traces = "accumulate")
agent = makeAgent(policy, val, alg)
interact(env, agent, n.episodes = 100L)
# character arguments
env = makeEnvironment("windy.gridworld")
agent = makeAgent("softmax", "table", "qlearning")
interact(env, agent, n.steps = 10L)
@markdumke, there is nothing in the work branch now except for the Rxd folder I pushed. Did you push to the right branch?
Ah, yes I've pushed to my own repo: https://github.com/markdumke/reinforcelearn/tree/work
I think that is a good plan
Very good, I will upload tomorrow then
@markusdumke Great work! So what is your plan about the paper now?
Thanks :)
I've just started working in a new job, so sadly there's not too much time I can spend on this project right now.
But I think it would be great to merge our code together at some point, so that we have a maintainable and extendable package. So maybe you can make a list of changes you'd like to make to the current code at https://github.com/markusdumke/reinforcelearn, so that I can review these, maybe on the weekend? And then we can merge and submit to JOSS?
@berndbischl @markdumke I think we do not have enough time to wait for a perfect design. I would suggest
What do you two think?
We could start next week and have some form as soon .