Closed. JobJob closed this pull request 5 years ago.
(cc myself) I will find free time to review this.
Ok thanks - I am just doing a minor cleanup of the speed PRs that follow this.
I want to drop 0.6 support, just go toward v0.7/1.0. Update REQUIRE as well?
Could you enable this feature, so I can push some minor changes directly? https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/
Yeah "Allow edits from maintainers." has always been checked
TODO
:zzz:
:exclamation: No coverage uploaded for pull request base (master@09e1e31). Click here to learn what that means. The diff coverage is 55.55%.

@@            Coverage Diff            @@
##             master      #10   +/-   ##
=========================================
  Coverage          ?   51.28%
=========================================
  Files             ?        1
  Lines             ?       39
  Branches          ?        0
=========================================
  Hits              ?       20
  Misses            ?       19
  Partials          ?        0

Impacted Files | Coverage Δ |
---|---|
src/OpenAIGym.jl | 51.28% <55.55%> (ø) |

Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 09e1e31...af6d63c. Read the comment docs.
I read the code in Reinforce.jl again and found that there is a field `total_reward` already available in `Reinforce.Episode`, so why did we implement similar state in the environment?
I don't really like the Episode abstraction; it was slow. That may be fixable - there were some type instabilities AFAIR - but I'm not sure it's really providing that much benefit. I'm open to being convinced, though.
Actually, in general I'm of the view that this package should drop the dependency on Reinforce, and that we create a separate package that wraps GymEnv as a Reinforce environment by defining the relevant Reinforce.xxxx methods and forwarding them, something like:
```julia
using Reinforce
using OpenAIGym: OpenAIGym, GymEnv

struct ReinforceGymEnv <: AbstractEnvironment
    gymenv::GymEnv
end
ReinforceGymEnv(args...; kwargs...) = ReinforceGymEnv(GymEnv(args...; kwargs...))

Reinforce.step!(env::ReinforceGymEnv, a) = OpenAIGym.step!(env.gymenv, a)
Reinforce.reset!(env::ReinforceGymEnv) = OpenAIGym.reset!(env.gymenv)
# ...
```
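To see the forwarding pattern in isolation (without Gym or Reinforce installed), here is a self-contained toy sketch; all names here (`ToyEnv`, `WrappedEnv`) are hypothetical stand-ins, not the actual OpenAIGym/Reinforce API:

```julia
# Stand-in for the abstract type a framework like Reinforce would provide.
abstract type AbstractEnvironment end

# Stand-in for the underlying (non-Reinforce) environment type.
mutable struct ToyEnv
    state::Int
end
step!(env::ToyEnv, a) = (env.state += a; env.state)
reset!(env::ToyEnv) = (env.state = 0; env.state)

# Thin wrapper that subtypes the framework's abstract type and forwards
# each method call to the wrapped environment.
struct WrappedEnv <: AbstractEnvironment
    inner::ToyEnv
end
step!(env::WrappedEnv, a) = step!(env.inner, a)
reset!(env::WrappedEnv) = reset!(env.inner)

env = WrappedEnv(ToyEnv(0))
reset!(env)
step!(env, 3)   # forwarded to the inner environment
```

Since the forwarding methods are trivial one-liners, Julia will typically inline them, so the wrapper adds no runtime overhead over calling the inner environment directly.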
The other thing about Episode, and about wrapping things in iterators in general, is that they sometimes make error messages slightly more obscure. Things are more transparent, and less magical, if you just have a loop.
Anyway, I'm not wedded to the idea of total_reward being in the env, but it's convenient IMO, and low cost.
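The "just a loop" style can be sketched as follows; the environment here is a made-up dummy (the names `reset!`/`step!` and the `(reward, done)` return convention are assumptions for illustration, not the actual package API):

```julia
# Toy environment that finishes after 5 steps, each yielding reward 1.0.
mutable struct DummyEnv
    t::Int
end
reset!(env::DummyEnv) = (env.t = 0; nothing)
step!(env::DummyEnv, a) = (env.t += 1; (1.0, env.t >= 5))  # (reward, done)

# A plain episode loop: no iterator abstraction, the reward total lives
# in the loop itself (or, as suggested above, could live in the env).
function run_episode!(env)
    reset!(env)
    total_reward = 0.0
    done = false
    while !done
        r, done = step!(env, :noop)
        total_reward += r
    end
    return total_reward
end

run_episode!(DummyEnv(0))   # → 5.0
```

A stack trace from an error inside this loop points straight at the user's code, whereas with an iterator it would pass through the iterator's internals first.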
> I don't really like the Episode abstraction, it was slow - that may be fixable, there were some type instabilities AFAIR, but I'm not sure it's really providing that much benefit. I'm open to being convinced though.
> The other thing about Episode, and having things in iterators in general, is that they sometimes make error messages slightly more obscure. Things are more transparent, and less magical, if you just have a loop.
Well, I want an abstraction. Actually, I want an iteration protocol for episodic tasks.
I know Reinforce.Episode is slow and type unstable, but that's old, unmaintained code.
Maybe we can draw up a new blueprint for an episodic iteration protocol and a concrete implementation. I want this implementation to be generic and versatile: everyone just adds features to it, and other packages gain the new features.
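One way such a protocol could look, as a hypothetical sketch built on Julia's `Base.iterate` interface (the env methods, the `maxn` step limit, and the fixed `:noop` action are illustrative assumptions; a real version would take a policy):

```julia
# Toy environment that finishes after 10 steps, each yielding reward 1.0.
mutable struct ToyEnv
    t::Int
end
reset!(env::ToyEnv) = (env.t = 0; nothing)
step!(env::ToyEnv, a) = (env.t += 1; (1.0, env.t >= 10))  # (reward, done)

# Generic episode iterator: works for any env implementing reset!/step!,
# caps the episode at `maxn` steps, and accumulates total_reward.
mutable struct Episode{E}
    env::E
    maxn::Int            # step limit
    total_reward::Float64
end
Episode(env; maxn = typemax(Int)) = Episode(env, maxn, 0.0)

function Base.iterate(ep::Episode, state = (0, false))
    n, done = state
    n == 0 && reset!(ep.env)                 # reset on first step
    (done || n >= ep.maxn) && return nothing # stop when done or capped
    r, d = step!(ep.env, :noop)              # real version: query a policy
    ep.total_reward += r
    return ((n + 1, r), (n + 1, d))
end

ep = Episode(ToyEnv(0); maxn = 3)
for (n, r) in ep
    # runs at most maxn steps
end
ep.total_reward   # → 3.0
```

Because the iterator is parametric in the environment type and the state tuple `(n, done)` is concretely typed, this style avoids the kind of type instability the old `Reinforce.Episode` suffered from.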
I read the code of Episode again; its current feature set is barren: just maxn to set a step limit.

> Actually in general I'm of the view that this package should drop the dependency on Reinforce, and we create a separate package that wraps this package as a Reinforce environment, by just forwarding the relevant methods.
About the package design, I think we can do this trade-off:
Let's put efforts on the primary goal first.
Given that we've clarified the primary goal, I can accept you dropping the Reinforce stuff at first, even creating a GymEnv that doesn't subtype AbstractEnvironment.
Agreed re primary goal, this package should provide a fast way to access Gym environments from Julia via Python.
There is some interesting work going on for all the other stuff at https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl I haven't taken it all in yet, but the design looks quite nice.
So... do you want to do some rework on this PR, or can I merge it now?
Merge it! We can do other stuff later. 🙏
What :interrobang: I didn't know about this repo before... https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl
Haha, yeah I had the same reaction when I first stumbled upon it a month or so ago
Start of the changes to this and Reinforce that @tejank10 and I discussed at JuliaCon.
Edit: this requires ~~some changes to Reinforce - changed to WIP for now~~ https://github.com/JuliaML/Reinforce.jl/pull/23
Coming soon: speedups - Gym environments in Julia as fast as from Python.