Closed kevinbradner closed 5 months ago
It seems like R(s) is being used in the transition function here, which causes the failure. When you remove the default missing argument for the action in the reward function, you're also removing the single-argument reward method.
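To illustrate the point about default arguments, here is a minimal sketch in plain Julia. The names R_default and R_strict and the reward values are made up for illustration and are not taken from the notebook.

```julia
# Hypothetical stand-ins for the notebook's reward function; the values are invented.

# A default argument makes Julia generate two methods:
# R_default(s) and R_default(s, a).
R_default(s, a=missing) = s == 1 ? 10.0 : 0.0

println(length(methods(R_default)))          # 2
println(hasmethod(R_default, Tuple{Int}))    # true

# Without the default, only the two-argument method exists.
R_strict(s, a) = s == 1 ? 10.0 : 0.0

println(hasmethod(R_strict, Tuple{Int}))     # false

# Any code that still calls the one-argument form now fails with a MethodError.
try
    R_strict(1)
catch err
    println(err isa MethodError)             # true
end
```

So any caller that still uses R(s), such as the transition function mentioned above, breaks once the default is removed.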
Thanks, that certainly shows one area where my signature would cause an issue. Since R seems to be called with more arguments in other places, do you know the expected semantics for R(s)?
With that said, I'm still confused about the stack trace described in my earlier message. It's not necessarily an issue for this repo, but it would be helpful for my future Julia work if anyone can help explain that.
Sure thing, we set up some default fallbacks for reward, but it seems that we don't include reward(m, s) as a fallback.
In terms of the stack trace, I can see how it would be confusing that reward(::P,::S,::A,::S) seems to be a required method. However, because of these fallbacks, whenever you define reward(m,s,a), reward(m,s,a,sp) is also automatically defined. So you don't need to manually define the extra method.
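A rough sketch of how such a fallback can work, using only multiple dispatch in plain Julia. The names AbstractToyMDP, ToyMDP, and my_reward are hypothetical and chosen for illustration; this is not the actual POMDPs.jl source.

```julia
# Hypothetical types and function names; this only mimics the fallback idea.
abstract type AbstractToyMDP end
struct ToyMDP <: AbstractToyMDP end

# Generic fallback: the four-argument form forwards to the three-argument
# form and ignores the next state sp.
my_reward(m::AbstractToyMDP, s, a, sp) = my_reward(m, s, a)

# The user defines only the three-argument form for their own model.
my_reward(m::ToyMDP, s, a) = s == 1 ? 10.0 : 0.0

m = ToyMDP()

# The four-argument call works through the fallback.
println(my_reward(m, 1, :up, 2))                                  # 10.0
println(hasmethod(my_reward, Tuple{ToyMDP, Int, Symbol, Int}))    # true

# There is no fallback in the other direction, so a state-only call is not defined.
println(hasmethod(my_reward, Tuple{ToyMDP, Int}))                 # false
```

Under these assumptions, a solver that asks for the (s, a, sp) form is satisfied by a user-defined (s, a) method, while a state-only reward has to be written explicitly.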
Hope this helps
Ok sure, thanks again for the information. I took the time to edit my notebook and test it a moment ago, and the DiscreteValueIteration code runs. I'll go ahead and close the issue.
I am trying to make some modifications to the MDP used in the Julia Academy tutorial on MDPs. I get an error when using the vanilla DiscreteValueIteration solve function after redefining my MDP's reward function R to have the signature function R(s, a) instead of function R(s, a=missing).
I started from the original notebook and made some minimal changes to recreate the error. My version of the notebook is here.
@req reward(::P,::S,::A,::S) is a line from the requirements section of the vanilla solve function in this repo. Based on that line, it looks like I need to define a reward that takes an (S, A, S') triple, but this does not seem to be the case when using the function.
The details are in my notebook linked above, but when I run the following lines:
I get this error:
The error suggests that the issue has something to do with the requirements macro. The requirements list as well as the rest of the code in vanilla.jl makes it look like rewards with larger parameter lists should work here. I'm pretty new to Julia, so I may just be misunderstanding something here. If anyone can tell me whether I am missing something, it would be greatly appreciated.