JuliaReinforcementLearning / ReinforcementLearning.jl

A reinforcement learning package for Julia
https://juliareinforcementlearning.org

Overspecific types in built-in policies and algorithms #1066

Open dharux opened 7 months ago

dharux commented 7 months ago

The types in built-in policies and algorithms such as QBasedPolicy and TDLearner are overly specific and prevent users from extending the existing code to new algorithms; instead, users are forced to rewrite large chunks of code.

For example, QBasedPolicy is defined as struct QBasedPolicy{L<:TDLearner,E<:AbstractExplorer} <: AbstractPolicy, and all of its methods are constrained in the same way. As a result, I cannot write a new learner and use it in a QBasedPolicy, even though all the methods appear to be completely general.
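
To make this concrete, here is a minimal sketch contrasting the current constraint with the relaxation I have in mind. The placeholder abstract types, field names, and the Current/Relaxed suffixes are only there to keep the snippet self-contained and runnable; they are not the package source.

```julia
# Placeholder abstract types standing in for those exported by the package.
abstract type AbstractPolicy end
abstract type AbstractLearner end
abstract type AbstractExplorer end

struct TDLearner{M,A} <: AbstractLearner end  # stand-in for the real TDLearner

# Current constraint: the learner slot only accepts a TDLearner.
struct QBasedPolicyCurrent{L<:TDLearner,E<:AbstractExplorer} <: AbstractPolicy
    learner::L
    explorer::E
end

# Proposed relaxation: any AbstractLearner is accepted, so a user-defined
# learner can reuse the existing QBasedPolicy methods unchanged.
struct QBasedPolicyRelaxed{L<:AbstractLearner,E<:AbstractExplorer} <: AbstractPolicy
    learner::L
    explorer::E
end

struct MyCustomLearner <: AbstractLearner end      # hypothetical user-defined learner
struct GreedyExplorerStub <: AbstractExplorer end  # stand-in explorer

QBasedPolicyRelaxed(MyCustomLearner(), GreedyExplorerStub())   # works
# QBasedPolicyCurrent(MyCustomLearner(), GreedyExplorerStub()) # MethodError: L must be <: TDLearner
```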

Another example is TDLearner, which is defined as Base.@kwdef mutable struct TDLearner{M,A} <: AbstractLearner where {A<:TabularApproximator,M<:Symbol}. However, its constructor only allows M = :SARS. This forces me to rewrite the whole struct if I want to implement a new TD learning algorithm or use a different kind of approximation (e.g. linear).
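
As a sketch of what I mean (illustrative only, not the exact source; the field names and the constructor signature are my assumption of how the :SARS restriction plays out):

```julia
abstract type AbstractLearner end
struct TabularApproximator end  # placeholder for the real approximator type

Base.@kwdef mutable struct TDLearner{M,A} <: AbstractLearner
    approximator::A
    γ::Float64 = 1.0
end

# Sketch of the constructor-level restriction: the method symbol is checked
# against a hard-coded value, so only :SARS instances can ever be built.
function TDLearner(approximator::A, method::Symbol; γ = 1.0) where {A<:TabularApproximator}
    method === :SARS || throw(ArgumentError("TDLearner only supports :SARS"))
    return TDLearner{method,A}(approximator, γ)
end

TDLearner(TabularApproximator(), :SARS)             # ok
# TDLearner(TabularApproximator(), :ExpectedSARSA)  # throws ArgumentError
```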

In my opinion, these restrictions should be removed and replaced with more general abstract types such as AbstractLearner.

jeremiahpslewis commented 7 months ago

Hi @dharux,

Thanks for your feedback and clear explanation of your concerns.

Two points:

  1. Looking at the code for q_based_policy.jl, there is no reason it can only support TDLearner, so please feel free to submit an appropriate pull request and I’ll review it. :)

  2. TDLearner is not an abstract type, and ensuring it supports arbitrary approximators is beyond scope. That said, earlier versions of TDLearner supported algorithms beyond :SARS and supported LinearApproximators here and here; it would be great to have these re-included in this package ecosystem. The best place at present would be the ReinforcementLearningFarm package (part of this repo). If you want to pursue adding them back, let me know and I can give you some pointers.

dharux commented 6 months ago

Thanks for the reply.

  1. I have submitted a pull request for this issue.

  2. I would be interested in adding the algorithms back into the package. Some guidelines would definitely be useful.