Closed q138ben closed 3 years ago
You can in theory, but you need new control parameters for your new target speed. For instance, the current simulation result (and thus the speed) comes from running the controller with the control parameters in ./osim/control/params_2D.txt (https://github.com/stanfordnmbl/osim-rl/blob/610b95cf0c4484f1acecd31187736b0113dcfb73/examples/sim_L2M2019_controller1.py#L29). The original paper (https://physoc.onlinelibrary.wiley.com/doi/full/10.1113/JP270228) describes how to find new control parameters, which involves parameter optimization. Sorry that I'm not giving a direct solution, but I hope this helps.
Hi, thanks for the information. I have read the paper and found that you used a cost function J to optimize the control parameters, but the cost function does not seem to include the 82 control parameters from the paper (7 for the reactive foot placement, 40 for stance reflexes, 31 for swing reflexes, and 4 for the modulation in the control transition). Could you please elaborate on this optimization process? Many thanks.
Hi again. I would like to compare the muscle forces at different walking speeds in the reflex-based musculoskeletal model, but I found it very difficult to change the walking speed in the model. Can you elaborate a bit more on how to get the control parameters for different walking speeds? Or, if you happen to have the control parameters for other walking speeds, would you mind sharing them? Many thanks.
Hi @q138ben. Sorry for the delay. I do not have the parameter sets for different speeds for this model. You can find new parameter sets using parameter optimization (CMA-ES: https://github.com/CMA-ES/pycma). You can set the cost function as f = (v - v_tgt)^2 + integration(act^2).
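A minimal sketch of how this cost could be evaluated from one simulated episode. The names `walking_cost`, `pelvis_vel`, `activations`, and `n_tail` are illustrative and not part of osim-rl, and averaging the velocity over the tail of the episode is only one possible definition of v.

```python
def walking_cost(pelvis_vel, activations, v_tgt, dt=0.01, n_tail=100):
    """f = (v - v_tgt)^2 + integral(act^2) for one episode.

    pelvis_vel  : forward pelvis velocity at each time step (m/s)
    activations : one list of muscle activations per time step
    n_tail      : number of final samples averaged to obtain v (an assumption)
    """
    tail = pelvis_vel[-n_tail:]
    v = sum(tail) / len(tail)
    # Rectangle-rule approximation of the time integral of squared activations.
    effort = sum(a * a for step in activations for a in step) * dt
    return (v - v_tgt) ** 2 + effort


# Toy data: two time steps, two muscles.
cost = walking_cost([1.0, 1.4], [[0.5, 0.5], [1.0, 0.0]], v_tgt=1.2, dt=0.01, n_tail=2)
print(cost)
```

In a real run, `pelvis_vel` and `activations` would be collected inside the simulation loop, and the returned value would be handed to the optimizer as the fitness of one parameter set.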
Hi @smsong, thanks for the reply. I have tried CMA-ES but still have some doubts:
1. How can I set the target forward velocity in order to optimize the cost function f = (v - v_tgt)^2 + integration(act^2)?
2. I printed out the current forward velocity, state_desc['body_vel']['pelvis'][0], while running the controller in sim_L2M2019_controller1.py (output below). Why isn't it constant? Shouldn't it always stay close to the target velocity under the same control parameters?
    pelvis_vel_0 = 1.7509731410931813
    pelvis_vel_0 = 1.7424195369839097
    pelvis_vel_0 = 1.6673182753131177
    pelvis_vel_0 = 1.5969578108378377
    pelvis_vel_0 = 1.5223251638958755
    pelvis_vel_0 = 1.4391747108420285
    pelvis_vel_0 = 1.356229169471096
    pelvis_vel_0 = 1.2825135982282048
    pelvis_vel_0 = 1.224548516264549
    pelvis_vel_0 = 1.1894554492900906
    pelvis_vel_0 = 1.1642403650695012
    pelvis_vel_0 = 1.163538658607405
    pelvis_vel_0 = 1.1851502985553861
    pelvis_vel_0 = 1.2223575020637276
    pelvis_vel_0 = 1.2734911564459244
    pelvis_vel_0 = 1.321781894640012
    pelvis_vel_0 = 1.3525218024187569
    pelvis_vel_0 = 1.3662382456478055
    pelvis_vel_0 = 1.366698989574385
3. How can I tell whether the CMA-ES optimization is heading in the right direction? I ran the optimization for an hour and still did not get a solution. Also, the cost did not seem to become more stable over time.
1. In f = (v - v_tgt)^2 + integration(act^2), v_tgt is the target velocity you want, so you assign it a constant (e.g. 1.5 m/s). v is whatever you define as your walking velocity; it can be the average of pelvis_vel_0 over the last three steps of the simulation.
2. The human model is thrown into the simulation with an initial velocity of 1.699999999999999956e+00 and then reaches steady walking over time. Even in steady walking, pelvis_vel_0 fluctuates within the gait cycle, as in real humans.
3. You may need to play around with CMA-ES to get a sense of what population size and initial sigma value would give you the desired result. With a large sigma value (>0.1), the model would fall a lot in the early generations, so it may need about 400 generations with a population size of 16 to converge, which can take more than a day to run on a modern desktop machine. I would recommend running a CMA-ES trial with sigma=0.01 and population size (lambda)=16 for 100 generations with v_tgt=1.8 and seeing if the model gets to walk faster.
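The suggested trial could be set up roughly as follows with pycma's ask/tell interface. This is a sketch under assumptions: `run_cmaes` and `toy_cost` are illustrative names, and `toy_cost` is only a placeholder for a function that runs one walking simulation and returns its cost.

```python
def run_cmaes(cost_fn, x0, sigma0=0.01, popsize=16, n_gen=100):
    """Minimize cost_fn with pycma using the settings suggested above."""
    import cma  # pip install cma; imported lazily so toy_cost runs without it

    es = cma.CMAEvolutionStrategy(x0, sigma0, {'popsize': popsize, 'maxiter': n_gen})
    while not es.stop():
        candidates = es.ask()                     # popsize candidate parameter vectors
        costs = [cost_fn(x) for x in candidates]  # one evaluation per candidate
        es.tell(candidates, costs)                # update mean, step size, covariance
        es.disp()                                 # print progress for this generation
    return es


def toy_cost(x):
    """Placeholder for a walking simulation: a simple quadratic bowl."""
    return sum((xi - 1.0) ** 2 for xi in x)


print(toy_cost([0.0, 2.0]))  # 2.0
```

Calling `run_cmaes(toy_cost, [0.0] * 5)` exercises the loop quickly. Swapping in a simulation-based cost makes each generation take as long as 16 simulations, which is why the number of generations dominates the run time. The per-generation output of `es.disp()` is one way to watch whether the cost is trending down.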
Hi @smsong, I was actually wondering the same thing for the 3D model. I just want the model to walk at a normal constant speed (1.5 m/s), but the provided 'params_3D_init.txt' did not work well when I used 'sim_L2M2019_controller1.py' in mode '3d' and difficulty '0'. Do you have any parameters available for this case, or should I also use CMA-ES to optimize the parameters? Thank you.
Hi @smsong. I implemented the cost function with the following code and ran a CMA-ES trial with sigma=0.01 and population size (lambda)=16 for 400 generations with v_tgt=1.8. I took the average cost over the simulation time t of the episode. But I had no luck (the optimization problem was not solved in an 18-hour run). Please correct me if I did something wrong.
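    import numpy as np

    def episode_cost(params, obs_dict, v_tgt=1.8, sim_dt=0.01):
        # env and locoCtrl are created beforehand; obs_dict is the observation from env.reset()
        total_cost = 0
        t = 0
        while True:
            t += sim_dt
            locoCtrl.set_control_params(params)
            action = locoCtrl.update(obs_dict)
            obs_dict, reward, done, info = env.step(action, project=True, obs_as_dict=True)
            # squared velocity error plus squared muscle activations scaled by the time step
            cost = (obs_dict['pelvis']['vel'][0] - v_tgt)**2 + np.sum(np.square(action)*sim_dt, axis=0)
            total_cost += cost
            if done:
                break
        # average cost over the simulated time
        return total_cost/t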
@jegyeong-r Yes, unfortunately, I do not have any parameter set ready for the 3D model for the Learn to move environment.
If you use Matlab, checking out the original 3D model may be useful to you: http://seungmoon.com/nmsModel/nmsModel.html. (FYI, this model uses First Generation SimMechanics, so it does not work in recent releases of Matlab. I think it works in R2018a but am not sure about more recent versions.)
@q138ben What do the final solutions look like? Does the human model fall down? To prevent that, you should formulate the cost function so that it gives high penalties for undesired behaviors (e.g. falling), as shown in equation 1 of this paper: https://physoc.onlinelibrary.wiley.com/doi/full/10.1113/JP270228.
I think the easiest way to set up such a cost is to run the Learn to Move environment with difficulty=0; model='2D' and then set the cost to something like cost=-total_reward. You would also need to set ver['ver00']['n_new_target'] = 1.8 in https://github.com/stanfordnmbl/osim-rl/blob/610b95cf0c4484f1acecd31187736b0113dcfb73/envs/target/v_tgt_field.py#L24
I would recommend first checking out get_reward_1() to see if it makes sense to you before running CMA-ES: https://github.com/stanfordnmbl/osim-rl/blob/610b95cf0c4484f1acecd31187736b0113dcfb73/osim/env/osim.py#L766-L823
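A sketch of what "cost = -total_reward" could look like as a function that an optimizer can call. The `step()`, `set_control_params()`, and `update()` calls mirror the snippet posted earlier in this thread; the `reset()` arguments are an assumption, and `ToyEnv` and `ToyController` are stand-ins that only exist to make the example runnable without osim-rl.

```python
def episode_cost(env, controller, params):
    """Run one episode with the given control parameters; return -total_reward."""
    controller.set_control_params(params)
    obs_dict = env.reset(project=True, obs_as_dict=True)  # assumed reset signature
    total_reward = 0.0
    done = False
    while not done:
        action = controller.update(obs_dict)
        obs_dict, reward, done, info = env.step(action, project=True, obs_as_dict=True)
        total_reward += reward
    return -total_reward


class ToyEnv:
    """Stand-in environment: pays a reward of 1.0 per step for three steps."""

    def reset(self, project=True, obs_as_dict=True):
        self.t = 0
        return {}

    def step(self, action, project=True, obs_as_dict=True):
        self.t += 1
        return {}, 1.0, self.t >= 3, {}


class ToyController:
    """Stand-in controller that stores its parameters and returns no action."""

    def set_control_params(self, params):
        self.params = params

    def update(self, obs_dict):
        return []


print(episode_cost(ToyEnv(), ToyController(), params=[1.0]))  # -3.0
```

With the real environment, the intent described above is that an episode ending early in a fall should come out with a higher cost than one that keeps walking; reading get_reward_1() is the way to confirm that the reward actually behaves this way.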
@smsong Thanks for your kind explanation! I'll check the Matlab model!!
(October 19th) I've been working on the Matlab model, but I found that some parameters are missing and some have been added. I wonder if there is a reason for this change in parameters.
Also, at first I thought that just plugging in the parameters would make the osim-rl model walk in 3D. However, it turned out that I have to do more work to find the missing parameters. In this case, would you recommend the Matlab model over the model in osim-rl?
Hi @smsong. Thanks for the suggestion. The optimization now goes more smoothly, but I still have several questions.
1. I saved the control parameters with the highest reward as I ran the optimization; say I got a best total_reward=200 before the model fell down. Then I used the saved control parameters as the initial parameters to start another optimization. Yet the optimization fluctuated a lot, with many simulations ending with a total_reward of around 50 to 100 and the model falling within 3 s. So does the CMA-ES algorithm update the parameters stochastically, rather than based on the previous best params?
2. I tested CMA-ES on the Rosenbrock function, which was solved in a few seconds. How does CMA-ES go about solving the optimization problem of finding the control parameters for the target velocity in my case?
3. Running the optimization for a day now gets me a total reward of 229 and a simulation time of 11 s before the model falls down. But the improvement is tiny now. I assume that I might also need to adjust the init_pose. If so, do you have any suggestions for how to change the init_pose? If not, can you point out how I should investigate further?
A: Now I can see progress after solving my 4th question. But I am still wondering if I should look into changing the init_pose.
4. UnboundLocalError: local variable 'reward_footstep_0' referenced before assignment
The aforementioned error always happened after hundreds of simulations, so I had to manually restart the optimization each time.
A: The problem was solved once I dug into the code a bit more.
5. What does ver['ver00']['n_new_target'] exactly mean? My assumption is that the control parameters for different walking velocities would also be different. I just realized that I can successfully run the model with the default params_2D even if I set ver['ver00']['n_new_target'] = 100. This result made no sense to me.

1. This depends on how you set up your CMA-ES. If you set it up correctly, it should be using the parameters you set at init. Regardless, I would recommend initializing with the "next mean" of the previous CMA-ES trial instead of the one that had the best reward (let's call it "bestever"), because "bestever" may be a local minimum.
2. I do not understand your question. This article might help you understand how CMA-ES works (if that is your question): https://en.wikipedia.org/wiki/CMA-ES
3. If you are using the same init_pose as the one used for successful walking and the cost is set up as I suggested, then it should not be an init_pose problem, because the model clearly can walk without falling with the given init_pose, and walking without falling should give lower costs than falling after 11 s. Make sure you set up the cost as I suggested. It should be something like the cost presented in this paper https://physoc.onlinelibrary.wiley.com/doi/full/10.1113/JP270228, where walking without falling has clearly lower costs than falling (no matter what the average walking speeds were).
4~5. I recommend that you first make sure you understand the cost and the simulation so that you can track down an issue when it occurs. The reflex-based controller does not take the target velocity as input, so it will work exactly the same for different target velocities. It will just give a lower reward for ver['ver00']['n_new_target'] = 100 because the walking velocity is far from the target velocity.
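To make the "next mean" versus "bestever" distinction concrete, here is a small sketch. In pycma, `es.mean` holds the current distribution mean and `es.result.xbest` the best single sample evaluated so far; `restart_point` is an illustrative helper, and `fake_es` merely imitates those two attributes so the example runs without the library.

```python
from types import SimpleNamespace


def restart_point(es, use_mean=True):
    """Pick the start vector for a follow-up CMA-ES run.

    use_mean=True  -> distribution mean of the finished run ("next mean")
    use_mean=False -> best single sample seen so far ("bestever")
    """
    return list(es.mean) if use_mean else list(es.result.xbest)


# Stand-in exposing the same attributes as a pycma CMAEvolutionStrategy object.
fake_es = SimpleNamespace(mean=[0.9, 1.1], result=SimpleNamespace(xbest=[0.5, 1.6]))

x0_next = restart_point(fake_es)
print(x0_next)  # [0.9, 1.1]
```

The follow-up run would then start from `x0_next`. What initial sigma to use for that run is a separate choice that this sketch leaves open.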
Hi @smsong. I have read through the paper https://physoc.onlinelibrary.wiley.com/doi/full/10.1113/JP270228 and I understand the cost function. The paper mentions that "With different target speeds in eqn (1)c, the control network further generates walking at speeds ranging from 0.8 m s−1 to 1.8 m s−1".
If changing the target velocity does not make the model walk differently, how can I make the model walk faster?
@q138ben The reflex-based controller does not include the capability of switching its behavior, so you need a "higher-layer controller" that switches the control parameters based on the behaviors you want. For example, the speed transition from 0.8 to 1.8 m/s in the paper is achieved with three sets of control parameters: one for walking at 0.8 m/s, another for transitioning from 0.8 to 1.8 m/s, and the last for walking at 1.8 m/s. Another way to achieve this sort of behavior control is to train a DNN that regulates the control parameters of the reflex-based controller, which I have not tried but am really interested in pursuing. I hope this clarifies the reflex-based controller.
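As an illustration of the "higher-layer controller" idea, the sketch below picks one of several pre-optimized parameter sets depending on the simulation time. Everything here is hypothetical: the schedule, the placeholder strings standing in for parameter vectors, and the choice to switch on time rather than on a gait event.

```python
def select_params(t, schedule):
    """Return the parameter set that is active at time t (seconds).

    schedule: list of (start_time, params) pairs sorted by start_time.
    """
    active = schedule[0][1]
    for start, params in schedule:
        if t >= start:
            active = params
    return active


# Hypothetical switching times; the strings stand in for real parameter vectors.
schedule = [(0.0, 'params_0.8'), (10.0, 'params_transition'), (12.0, 'params_1.8')]

print(select_params(11.0, schedule))  # params_transition
```

Inside a simulation loop this would amount to calling `locoCtrl.set_control_params(select_params(t, schedule))` before each controller update, with each entry replaced by a parameter set found through its own optimization.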
@smsong . Thanks for the reply. I understand that I will need a supraspinal layer to modulate the speed transition. But there should be somewhere coded in the reflex model or the control parameters that can alter the walking speed, right? For example, muscle activation? isometric force? or any other?
For a 2d model, there are 37 control parameters which are not explained explicitly. Can you elaborate a bit?
    # control parameters
    cp_keys = [
        'theta_tgt', 'c0', 'cv', 'alpha_delta',
        'knee_sw_tgt', 'knee_tgt', 'knee_off_st', 'ankle_tgt',
        'HFL_3_PG', 'HFL_3_DG', 'HFL_6_PG', 'HFL_6_DG', 'HFL_10_PG',
        'GLU_3_PG', 'GLU_3_DG', 'GLU_6_PG', 'GLU_6_DG', 'GLU_10_PG',
        'HAM_3_GLU', 'HAM_9_PG',
        'RF_1_FG', 'RF_8_DG_knee',
        'VAS_1_FG', 'VAS_2_PG', 'VAS_10_PG',
        'BFSH_2_PG', 'BFSH_7_DG_alpha', 'BFSH_7_PG', 'BFSH_8_DG', 'BFSH_8_PG',
        'BFSH_9_G_HAM', 'BFSH_9_HAM0', 'BFSH_10_PG',
        'GAS_2_FG', 'SOL_1_FG',
        'TA_5_PG', 'TA_5_G_SOL',
        'theta_tgt_f', 'c0_f', 'cv_f',
        'HAB_3_PG', 'HAB_3_DG', 'HAB_6_PG',
        'HAD_3_PG', 'HAD_3_DG', 'HAD_6_PG'
    ]
@q138ben It is not simple to explain all the parameters, but most of them correspond to the parameters explained in the original paper. As a quick note, PG: P gain; DG: D gain; FG: (positive) force feedback gain; and HFL, GLU, HAM, ... indicate the muscles.
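The naming convention described above can be applied mechanically to the keys. The sketch below is only a reading aid: `describe_key` is an illustrative helper, the muscle set is taken from the prefixes that appear in `cp_keys`, and the number in the middle of each key is deliberately left uninterpreted.

```python
GAIN_TYPES = {'PG': 'P gain', 'DG': 'D gain', 'FG': 'positive force feedback gain'}
MUSCLES = {'HFL', 'GLU', 'HAM', 'RF', 'VAS', 'BFSH', 'GAS', 'SOL', 'TA', 'HAB', 'HAD'}


def describe_key(key):
    """Split a control-parameter key into (muscle, gain type).

    Either element is None when the key does not follow the pattern,
    e.g. target angles such as 'theta_tgt'.
    """
    tokens = key.split('_')
    muscle = tokens[0] if tokens[0] in MUSCLES else None
    gain = next((GAIN_TYPES[t] for t in tokens[1:] if t in GAIN_TYPES), None)
    return muscle, gain


for key in ['HFL_3_PG', 'RF_8_DG_knee', 'SOL_1_FG', 'theta_tgt']:
    print(key, describe_key(key))
```

Keys such as 'HAM_3_GLU' or 'BFSH_9_G_HAM' name a muscle but carry none of the three gain suffixes, so they come back with a gain of None and need the paper for interpretation.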
Regarding which parameters you should change to change speed... I would change all the parameters. It would be possible to change fewer parameters to change speed, but the resulting gait may not be human-like. Identifying the minimum number of parameters to change speed while maintaining human-like gait would be an interesting study, which I have done before with a different control model: https://ieeexplore.ieee.org/abstract/document/6225307.
Hi @smsong. Thanks again for the recommended paper. It is really interesting to see how you managed to perform speed transitions.
1. I realized that I may not have made myself clear in my previous question. I am not looking into speed transitions, e.g. from 0.8 m/s to 1.0 m/s during walking. Instead, I just want to find another set of control parameters through CMA-ES optimization that makes the model walk at another target speed in steady walking.
2. Using the cost function you suggested, CMA-ES should minimize the cost to find a set of control parameters that makes the model walk close to the target velocity while minimizing the muscle activation, etc. But why do I always get the same walking speed and the same muscle activation when the optimization finishes, no matter what target velocity I set?
@q138ben Please find my answers below:
@smsong Let me try to elaborate on my situation. Running the optimized parameters, let's say I get an average speed of 1.4 m/s in a 100 s simulation, and it is always 1.4 m/s whenever the model can perform steady walking after optimization. Why do I always get 1.4 m/s? If I would like an average walking velocity of 1.8 m/s, how can I achieve that?
@q138ben You need a new set of parameters that is optimized for 1.8 m/s. I thought that was what you were asking in Q1.
@smsong Yes, exactly. But I have already obtained a new set of parameters by running the optimization. My initial values for the control parameters were all ones or random values, and I got my updated parameters after the optimization. But why is my optimized result almost the same as the result of running the controller with the control parameters in ./osim/control/params_2D.txt? Also, what do you mean by saying that I need a new set of parameters that is optimized for 1.8 m/s? From my point of view, the control parameters are a result of the CMA-ES optimization and I have no influence on them.
@q138ben I see. It's not the same but almost the same. Do you think that your CMA-ES trial converged? If not, you can run the optimization for more generations (e.g., 800 generations instead of 400). Also, I would include INIT_POSE as part of the parameters you optimize. To do that, you would want to carefully constrain the INIT_POSE values so that the human model does not start in a weird pose (e.g., a foot penetrating the ground, etc.). Let me know how it goes!
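One way to optimize INIT_POSE together with the control parameters is to concatenate both into a single search vector and clip the pose part to allowed ranges when decoding it. The sketch below assumes this approach; `pack`, `unpack`, and the bounds are illustrative, and the two-element pose is a toy stand-in for the real INIT_POSE array.

```python
def pack(ctrl_params, init_pose):
    """Concatenate control parameters and initial pose into one search vector."""
    return list(ctrl_params) + list(init_pose)


def unpack(x, n_ctrl, pose_lo, pose_hi):
    """Split a search vector back up, clipping each pose entry to [lo, hi]."""
    ctrl = list(x[:n_ctrl])
    pose = [min(max(p, lo), hi) for p, lo, hi in zip(x[n_ctrl:], pose_lo, pose_hi)]
    return ctrl, pose


# Toy example: three control parameters and a two-element pose.
# The bounds are made up for illustration.
x = pack([1.0, 1.0, 1.0], [2.5, 0.2])
ctrl, pose = unpack(x, 3, pose_lo=[0.5, 0.8], pose_hi=[2.0, 1.0])
print(ctrl, pose)  # [1.0, 1.0, 1.0] [2.0, 0.8]
```

The cost function would call `unpack` on each candidate, pass `pose` to the environment reset and `ctrl` to the controller. pycma also offers a 'bounds' option that could take over the clipping; whether simple box bounds are enough to rule out poses such as a foot below the ground is something to check against the model.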
@smsong I included the INIT_POSE as new control parameters, with the same values as in the default controller except for changing the forward speed to 1.2 m/s. Running the CMA-ES optimization again, I ultimately got the model into steady walking. But the joint kinematics and muscle activations are still almost the same as with the default controller.
The solution shows that the model developed a strategy of first stabilizing the body from the 1.2 m/s forward speed and then starting to walk from almost a standstill. Note that I have tried several times, and every time it ended up with this strategy.
I wonder if such gait kinematics are somehow encoded in the reflex-based model, so that it eventually walks in the same way once it reaches steady walking, even from different initial positions.
@q138ben The kinematics are not directly encoded into the controller, but it is not surprising that the optimized gait does not look much different given that human-like gait is dynamically stable/attractive(?) (e.g., exploits the passive dynamics, etc.).
Just to make sure my message got through: I suggested that you optimize both the control parameters and INIT_POSE simultaneously. That way, when optimizing for faster walking, for example, CMA-ES would likely set the initial forward speed to be faster and help the model walk steadily at a faster speed.
@smsong I agree with you that it is amazing to see that the model always finds a similar approach to performing a human-like gait. But if you look closely at the kinematics, you will find that the ankle plantarflexion is quite small and that hip and knee flexion are larger than in experimental data, to be critical. That is why I tried to change the control parameters to make the gait kinematics more human-like.
Just to make myself clear, I have already optimized both the control parameters and INIT_POSE simultaneously. I have tried initial forward speeds of 1.2 and 1.8 m/s, which did not change much after the optimization. But I am quite frustrated to see that the gait kinematics remained the same at both speeds in optimized steady walking.
@q138ben I have not played much with CMA-ES and the reflex-based controller in the OpenSim-RL environment. But I think the model will be able to walk at different speeds, based on my experience of doing so in Matlab and on what @carmichaelong did with a similar environment and controller: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006993.
FYI, in Matlab, I was able to make the model walk in various weird gaits by tweaking the cost function to reward gaits with bent knees, with minimum use of ankle muscles, with maximum asymmetry between the legs, etc. So the controller is capable of producing a wide range of gaits. If you want it to better match your experimental data, you can try to optimize for that (e.g., penalize the deviation from your target kinematics).
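A cost that also penalizes deviation from target kinematics could be assembled as a weighted sum, as in the sketch below. The function name, the weights, and the mean-squared-error form of the tracking term are all assumptions made for illustration, not the cost used in the paper or in osim-rl.

```python
def weighted_cost(v, v_tgt, effort, joint_angles, target_angles,
                  w_v=10.0, w_e=1.0, w_k=1.0):
    """Weighted sum of speed error, effort, and kinematic tracking error.

    joint_angles / target_angles: equally long sequences of simulated and
    reference joint angles (e.g. sampled over a gait cycle), in radians.
    The weights w_v, w_e, w_k are hypothetical and need tuning.
    """
    n = len(joint_angles)
    tracking = sum((q - q_ref) ** 2 for q, q_ref in zip(joint_angles, target_angles)) / n
    return w_v * (v - v_tgt) ** 2 + w_e * effort + w_k * tracking


cost = weighted_cost(v=1.5, v_tgt=1.5, effort=0.2,
                     joint_angles=[0.1, 0.3], target_angles=[0.1, 0.5])
print(cost)
```

Raising `w_v` relative to the other weights makes the speed term dominate, while raising `w_k` pulls the gait towards the reference kinematics at the expense of the other two terms.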
@smsong I have now managed to get the model walking at slower and faster speeds by putting more weight on the speed term. Thanks. But the computational cost was so high that the optimization took days to solve. Then I realized that the optimization used here is actually a shooting method. There is also a nonlinear optimization technique called direct collocation, which should significantly reduce the computation time. What do you think about this method? Is it possible to put the reflex-based model into a direct collocation framework?
@q138ben Glad to hear that you got the model to walk at different speeds! Yes, shooting+CMA is not the most time-efficient optimization approach, and direct collocation could tremendously reduce the computation time. Direct collocation usually optimizes for one footstep, so you would need to be creative to encode "robustness" in the solution, which single-shooting+CMA optimized over multiple steps has naturally. Also, to use direct collocation, you would probably need to apply some computational tricks to make the reflex-based controller differentiable, etc. We have been exploring using direct collocation with the reflex-based controller, though it is not our current focus. I would be happy to discuss more through email if you are interested (and you can close this thread if your original issue is solved).
Thanks. I will get in touch via email if I have further questions.
In the examples, sim_L2M2019_controller1.py gives a really good simulation of the model. I tried to change the forward speed in the init pose, but the simulation failed within a few steps. I wonder if it is possible to change the forward speed. Thanks.