cexuzheng opened this issue 1 year ago
Hi, I found your work in the Model-Based Reinforcement Learning book, along with this Python toolbox, very interesting. However, while running the code I found a few minor issues:
- In the `MBRLtool.py` file, in the `Objective` class, the stage cost function is missing `self.Q` and `self.R`.
- In the `SysID` class, the update function for RLS calls `utils.RLS` instead of `utils.identification_RLS`.
- In `utils.identification_RLS`, the equations did not work quite right, so I made some modifications (a short usage sketch follows after this list):

```python
import numpy as np

def identification_RLS(W, S, dXdt, Theta, forget_factor=0.9):
    # Gain vector: g = S*Theta / (lambda + Theta^T * S * Theta)
    g = np.matmul(S, Theta) / (forget_factor + np.matmul(np.matmul(Theta.T, S), Theta))
    # Covariance update with forgetting factor lambda = forget_factor
    S = (1 / forget_factor) * (S - np.matmul(g.reshape(S.shape[0], 1),
                                             np.matmul(Theta.T, S).reshape(1, S.shape[0])))
    # Parameter update driven by the prediction error (dXdt - W*Theta)
    W = W + np.matmul((dXdt - np.matmul(W, Theta)).reshape(len(dXdt), 1),
                      np.matmul(Theta.T, S).reshape(1, len(Theta)))
    return [W, S]
```
- I would also suggest improving the `SysID` class so that the desired `lam` factor for the SINDy algorithm can be passed in as a parameter.
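For reference, here is a minimal usage sketch of the function above. The dimensions, initialization, and synthetic data below are made up purely for illustration and are not taken from the toolbox:

```python
import numpy as np

# Hypothetical sizes: n states, p regressor (library) features.
n, p = 2, 5
rng = np.random.default_rng(0)

W = np.zeros((n, p))   # parameter estimate, so that dXdt ~= W @ Theta
S = 1e3 * np.eye(p)    # covariance-like matrix; a large initial value means low confidence

W_true = rng.standard_normal((n, p))   # synthetic ground-truth dynamics
for _ in range(200):
    Theta = rng.standard_normal(p)     # regressor vector at this sample
    dXdt = W_true @ Theta              # noiseless synthetic derivative measurement
    W, S = identification_RLS(W, S, dXdt, Theta, forget_factor=0.99)

print(np.linalg.norm(W - W_true))      # estimation error; should shrink toward zero
```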
I hope this helps, and thanks for sharing your excellent work.
I also had some questions about the quadrotor example: could you share the parameters you used to produce the results in the book? I found them very interesting, but I could not get my runs to converge as quickly or as stably.
Thanks a lot,
Ce Xu
Hi Ce Xu,
Thank you for your interest and feedback.
The Q and R values are designed specifically for each system, so they are taken as inputs to the cost function. You can find their numerical values for each example in the main.py file in its folder.
Regarding utils, there might be some inconsistency, as you mentioned. There used to be a separate utils file for each folder; I unified them when uploading, but I may not have checked all the examples. Thank you for flagging this. When I get a chance, I will update them and make sure they run without errors, taking into account the points you mentioned.
RLS in utils is taken from some other source, as far as I remember. Thank you for revising the code. I will double-check it for all the examples and update them.
Regarding the SINDy algorithm, yes, depending on the system dynamics, lam can be further tuned for better results. The choice of lam is crucial, as it is used as a threshold to filter out the coefficients to get sparse dynamics.
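To make the role of `lam` concrete, below is a minimal sketch of the sequentially thresholded least-squares iteration at the heart of SINDy. This is an illustrative implementation only, not the exact code in the toolbox:

```python
import numpy as np

def stlsq(Theta, dXdt, lam=0.1, n_iters=10):
    """Sequentially thresholded least squares (the SINDy core loop).

    Theta : (m, p) library matrix evaluated at m samples
    dXdt  : (m, n) measured state derivatives
    lam   : threshold; coefficients smaller than lam are pruned for sparsity
    """
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]   # initial dense fit, shape (p, n)
    for _ in range(n_iters):
        small = np.abs(Xi) < lam        # terms to prune on this pass
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):  # refit each state on the surviving terms
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], dXdt[:, k], rcond=None)[0]
    return Xi
```

If `lam` is too small, spurious library terms survive the thresholding; if it is too large, genuine dynamics terms get pruned away, which is why exposing it as a `SysID` parameter, as you suggested, is useful.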
For the quadrotor, all the parameters can be found in different files within its folder. If you are looking for a specific parameter, I can help you with that. Please note that, as mentioned in the book, there are a few pre-runs involved in the learning process. This helps the system identifier collect some useful data and converge faster. After the pre-runs, we proceed with closed-loop learning and control over a few more episodes. Overall, the provided code for this example should converge within 7–10 episodes as-is. If you have changed the parameters and/or the RLS implementation, some further tuning may be needed.
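As a structural illustration of that procedure, here is a toy example on a hypothetical scalar linear system: a few exploratory pre-runs collect data for the identifier, then closed-loop episodes refit the model and control with it. None of the names or numbers below come from the toolbox:

```python
import numpy as np

rng = np.random.default_rng(1)
a_true, b_true = 0.9, 0.5          # unknown scalar dynamics: x+ = a*x + b*u

def run_episode(policy, x0=1.0, steps=50):
    X, U, Xn = [], [], []
    x = x0
    for _ in range(steps):
        u = policy(x)
        x_next = a_true * x + b_true * u
        X.append(x); U.append(u); Xn.append(x_next)
        x = x_next
    return np.array(X), np.array(U), np.array(Xn)

data_X, data_U, data_Xn = [], [], []

# Pre-runs: random excitation so the identifier sees informative data.
for _ in range(3):
    X, U, Xn = run_episode(lambda x: rng.uniform(-1, 1))
    data_X.append(X); data_U.append(U); data_Xn.append(Xn)

# Closed-loop learning and control episodes.
for episode in range(7):
    # Fit x+ = a*x + b*u by least squares on all data collected so far.
    Phi = np.column_stack([np.concatenate(data_X), np.concatenate(data_U)])
    y = np.concatenate(data_Xn)
    a_hat, b_hat = np.linalg.lstsq(Phi, y, rcond=None)[0]
    # Deadbeat-style controller derived from the current model estimate.
    K = a_hat / b_hat
    X, U, Xn = run_episode(lambda x: -K * x)
    data_X.append(X); data_U.append(U); data_Xn.append(Xn)
    print(f"episode {episode}: |x_final| = {abs(Xn[-1]):.2e}")
```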
I hope this helps to answer your questions, and again, thanks for your comments.
Milad