mrrezaie / testGitHubActions

WIP test GitHub Actions

Exchanges v2 #4

Open mrrezaie opened 4 months ago

mrrezaie commented 4 months ago

Hi @aaronsfox, I just fixed the path-related issue, and it should now work without issues on Linux.

Also, I reverted some of the changes, so the current code is only tracking markers and GRF with the following parameters:

Mesh frequency           100 Hz
Constraint tolerance     1e-3
Convergence tolerance    1e-5
Marker tracking weight   1
Contact tracking weight  1
Control tracking weight  0.001

It previously converged in 2.5 hours on free GitHub Actions with 4 cores.

As my best effort so far, the same problem converged in 19 minutes on free Google Colab with 96 cores: https://colab.research.google.com/drive/1hYD5I-Lj1xUsW4Uc96E-Xp3upqfL5XVs?usp=sharing

Number of Iterations....: 751
Objective...............: 5.4405641775543655e-03
Total seconds in IPOPT..: 1031.273

I'm curious to see how it works on Deakin HPC. Thank you in advance.

mrrezaie commented 4 months ago

> I tested this on Deakin's HPC this morning. It's a little hard to evaluate as the problem seemed to converge in a different manner/solution. My solution ended up converging at a much higher number of iterations (1714) and a slightly lower objective value (4.7750623e-03). It solved with the 1714 iterations in 5810 total seconds in IPOPT, so on a per-iteration basis this seems slower than Google Colab (3.39 s per iteration for Deakin HPC vs. 1.37 s per iteration for Google Colab), but it may not be a like-for-like comparison given the different convergence.
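
The per-iteration figures quoted above follow directly from the reported totals; a quick sanity check in Python (numbers taken from the logs in this thread):

```python
# Per-iteration solve speed, computed from the reported IPOPT totals.
runs = {
    "Deakin HPC":   {"iterations": 1714, "ipopt_seconds": 5810.0},
    "Google Colab": {"iterations": 751,  "ipopt_seconds": 1031.273},
}

for name, run in runs.items():
    per_iter = run["ipopt_seconds"] / run["iterations"]
    print(f"{name}: {per_iter:.2f} s/iteration")
# Deakin HPC works out to ~3.39 s/iteration, Colab to ~1.37 s/iteration.
```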

Hi @aaronsfox, I had seen similar behavior before: different convergence across machines. But your convergence is similar to my GitHub Actions run:

Number of threads.......: 4
Number of Iterations....: 1993
Objective...............: 4.7558813e-03
Total seconds in IPOPT..: 10597.528

How many cores did you use for that? 8 cores? I'm going to use an identical number of cores in Google Colab to have a better comparison. Also, free GitHub Actions allows 20 runs at the same time, each with 4 cores, so it's also a good free alternative for batch processing; I just wish there was no 6-hour time limit. Does communicating through GitHub Issues work for you? I think this would be much more convenient than email when dealing with code. We can close the issue and keep using it for privacy. Please let me know your thoughts.

aaronsfox commented 4 months ago

Communicating through here is fine @mrrezaie

I'm not entirely sure how the cores etc. work yet on the Deakin HPC. The log shows a message of 96 threads being used at the beginning of the optimisation. I get a message of 8 threads on my laptop, which has 4 cores and 8 logical processors. There is an option on the HPC to set the number of tasks, which I think maps to the cores. I set this at 12, yet I still get 96 threads even when I set it lower, at 8. The HPC may just give the maximum it has available - so again, not entirely sure.

I also noted that in your code you are ignoring activation dynamics. This will speed up the computation but I'm not sure I'd use this if going for accuracy around muscle function - so it may be worth considering this with any speed testing you are doing.

mrrezaie commented 4 months ago

@aaronsfox, thanks for your response. You may already be aware of solver.set_parallel() in Moco, which can control the number of threads used (0: none, 1: all available, >1: that specific number of threads).
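
As a reminder of those semantics, here is a hypothetical helper that just mirrors the 0 / 1 / >1 behaviour described above (the real control is the `solver.set_parallel()` call in Moco; this function is for illustration only):

```python
def threads_from_parallel_setting(setting: int, available_cores: int) -> int:
    """Mirror the set_parallel() semantics described above:
    0 -> no parallelism (single thread), 1 -> all available cores,
    n > 1 -> exactly n threads. Hypothetical helper, not the Moco API."""
    if setting == 0:
        return 1
    if setting == 1:
        return available_cores
    return setting

print(threads_from_parallel_setting(0, 96))   # 1
print(threads_from_parallel_setting(1, 96))   # 96
print(threads_from_parallel_setting(12, 96))  # 12
```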

I agree that the muscles in this simulation are not physiological at all. In my experience, activation dynamics only makes the muscle activation patterns smoother and doesn't have a significant impact on the muscle forces, but tendon compliance does. I'm testing both: https://github.com/mrrezaie/testGitHubActions/actions

Update: All failed ... The number of iterations during the 6-hour runs suggests that tendon compliance makes the simulation slower, but activation dynamics makes it faster, which actually doesn't make sense. I had asked a similar question on the forum: https://simtk.org/plugins/phpBB/viewtopicPhpbb.php?f=91&t=18498&p=0&start=0&view=&sid=0baecb6c8205afbfc975b527dafb9bd7 Not sure why this is happening.

aaronsfox commented 4 months ago

I was aware of it but have never used the parallel settings, as I always just wanted to maximise solver speed, and using all cores is the default. I think I figured out that the HPC nodes have 24 cores: when I set 3 problems to solve with 12 cores each, it shifted me across to the next compute node after the 2nd. So it seems you can allocate the cores, but interestingly the number of threads remains the same when I change from 12 to 24 cores. Perhaps the threads are something different; I'll just have to see whether this attempt using 24 cores solves faster to discover whether there's a benefit.

I think I agree with Nick in the forum link that adding activation dynamics may help with convergence properties. It does make the problem more complex, but the activation signals required when using activation dynamics might simply help the problem solve better. Tendon compliance is a bit painful to get right; there has been a lot of discussion on the forum about which dynamics mode to use for it to work. Ross Miller's UMocoD code on SimTK is probably the most detailed use of tendon dynamics I've seen, as it changes the mode and puts appropriate bounds on the normalised tendon forces. If using compliant tendons, I'd only see it as relevant for the long-tendon muscles in the model (i.e. the plantarflexors).

mrrezaie commented 4 months ago

@aaronsfox, Thanks for your response. But, as can be seen in that example, the behavior changes depending on the model:

  • test 1: a model with weak residual actuators:
    - activation dynamics included: 30 iterations (8.9345709e+02) in 57.14 s
    - activation dynamics excluded: 36 iterations (8.9339992e+02) in 59.77 s

  • test 2: a model with strong residual actuators:
    - activation dynamics included: 32 iterations (1.4798545e-01) in 55.9 s
    - activation dynamics excluded: 11 iterations (1.9132318e-01) in 20.2 s

In the Muscle Redundancy Solver as well, excluding activation dynamics reduces the number of iterations and the convergence time, but the output isn't as smooth as when activation dynamics is included.

In our simulation, I used strong residual actuators but ended up with different behavior. Also, although including only activation dynamics increased the speed (more iterations in 6 hours), the simulation never converged and the tolerance was never met. So, there might be other factors affecting convergence here. Actually, such unexpected behaviors are the reason why I'm not a fan of Moco 🙃. Thankfully, Moco is under heavy development and improvement, and I hope they make it more robust and reliable.

aaronsfox commented 4 months ago

Thanks for these numbers @mrrezaie. The inconsistencies observed in these problems are difficult to pinpoint due to how large the problems are and the significant number of inputs to them. I don't necessarily view these things as 'inconsistencies' but rather 'complexities': it isn't always so simple that changing one thing only changes one thing. I've found this to be a useful perspective in my modelling/simulation work, in that I don't always expect changing the same thing across different problems to have the same outcome, because it never does!

mrrezaie commented 4 months ago

@aaronsfox, would it be possible to run the current code with activation dynamics included on your HPC? I just would like to know how it converges. Thank you in advance.

aaronsfox commented 4 months ago

I think so. If you push the latest version to the repository, I can test it.

mrrezaie commented 4 months ago

I already did.

aaronsfox commented 4 months ago

Great, I'm currently testing out some other simulations on the HPC but I'll add this to the batch

mrrezaie commented 4 months ago

I do appreciate that.

aaronsfox commented 4 months ago

@mrrezaie I didn't realise you had set the max iterations on this to 10,000 - so I thought I'd better stop it before it went too far!

I've uploaded the log from the HPC output to this message. The objective function seems quite low, but something about this problem is struggling to converge. Perhaps I was hasty in stopping this as it seemed close, but I think something that isn't ready to converge in around 1000 iterations could be better conditioned (i.e. scale the objective function a little better). I've been told before that something where the objective function is converging around 1 is a good scale, whereas this objective function is quite low. Nonetheless, I think you could get this problem to work/solve, and it is quite speedy on the HPC (~600 iterations per hour).

PFJCF_HPC.log

mrrezaie commented 4 months ago

@aaronsfox thanks for your time.

> but I think something that isn't ready to converge in around 1000 iterations could be better conditioned

I'm not entirely sure this is a fair assessment, but I appreciate your consideration. It depends strongly on the number of available cores. As you saw, a similar problem without activation dynamics converged after 751 iterations in Google Colab with 96 cores, but after 1993 iterations on Windows with 4 cores. So the number of iterations isn't a good measure on its own.

> I've been told before that something where the objective function is converging around 1 is a good scale, whereas this objective function is quite low.

Based on my prior experience with CasADi, very high or very low optimal objective values perform similarly. I just tested this by scaling the terms in the cost function:

| test | tol  | marker | GRF | control | converged | iter | obj      | NLP error | time (h) |
|------|------|--------|-----|---------|-----------|------|----------|-----------|----------|
| 1    | 1e-5 | 1      | 1   | 0.001   | yes       | 1993 | 4.75e-03 | 9.82e-6   | 2:56     |
| 2    | 1e-5 | 100    | 100 | 0.1     | no        | 3933 | 4.68e-01 | -         | 5:56     |
| 3    | 1e-3 | 100    | 100 | 0.1     | yes       | 2178 | 4.86e-01 | 2.49e-4   | 3:13     |
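
To see this numerically: between test 1 and test 3 the tracking weights were scaled by 100 (and the control weight by 100 as well), and the optimal objective scaled by roughly the same factor, which supports the point that rescaling the cost alone doesn't change the underlying solution or its conditioning (my arithmetic, using the table values):

```python
# Objective values from tests 1 and 3 above.
obj_unscaled = 4.75e-03  # test 1: weights 1 / 1 / 0.001
obj_scaled   = 4.86e-01  # test 3: weights 100 / 100 / 0.1

ratio = obj_scaled / obj_unscaled
print(f"objective ratio: {ratio:.1f}")  # close to the 100x weight scaling
```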

So, objectives between 0.1 and 10 don't necessarily help, and neither does including activation dynamics. Please also note that this is a marker + GRF tracking simulation, which is more complicated than coordinate + GRF tracking: link.

Also, I think you may get better performance by allocating specific cores to each of the problems in your batch-processing task. This would lead to more efficient use of your 96 cores, as otherwise one problem may monopolize resources.

aaronsfox commented 3 months ago

@mrrezaie thanks for running these tests. I'm somewhat surprised that the number of iterations changes with different cores/processing. My assumption was that you would end up with the same result, just over a different duration (i.e. the same iterations, just slower iterations per second). It must be a little more complicated than that, in how the parallel processing works and identifies improvements in the objective function to produce the next iteration.

My thought process around the final value of the objective function was related to the default or typical convergence tolerances used - i.e. if you have a large objective function but use a very small convergence tolerance, I think this would alter how the problem converges. I don't think there is an exact answer here with respect to how you weight your objective function and what the final value comes out at - so experimentation seems to be the only way to figure it out.

mrrezaie commented 3 months ago

Hi @aaronsfox, great to hear from you.

If you intend to use the default tolerances, i.e. 1e-3 or 1e-2, then the weights of the goals must be increased in order to have tight constraints. Personally, I prefer to keep the weights around 1, i.e. either unscaled or relative to 1, and reduce the tolerances instead. This is more efficient computationally (see the 1st test vs. the 3rd one in the table above). Also, having a very small objective value is not uncommon with IPOPT and CasADi.

I tested more things: a torque-driven marker tracking simulation with and without the contact tracking goal converged in 27 min (1254 iterations) and 10.5 min (535 iterations), respectively. So the contribution of the marker and contact tracking goals to the muscle-driven simulation is not that large; most of the 3.5-hour simulation is spent solving the muscle redundancy problem. These tests were done with constraint and convergence tolerances of 1e-3 and 1e-5, respectively.

I plotted the predicted GRF and joint moments of the above-mentioned torque-driven marker + GRF tracking simulation, and the output was not good enough. Then I reduced the constraint and convergence tolerances to 1e-5 and 1e-6, respectively (image attached). As can be seen, the new tighter tolerances really improved the simulation, but also increased the convergence time from 0:27 to 2:07, or 3:00 with 1e-6 for both tolerances. Let's call these the required tolerances for future simulations, meaning that looser tolerances wouldn't produce accurate results at all. Now, imagine we add muscles to this very simulation with 1e-6 tolerances and enable activation and contraction dynamics; I cannot imagine how much time it would take to converge 🙃.

mrrezaie commented 3 months ago

Hi @aaronsfox, my recent tests ended up with a significant improvement in convergence time, from 10 min (535 iterations) to 6.5 min (347 iterations) on 4 threads, by adjusting the patellofemoral coordinate bounds from the default [-99999, 99999] to [0, 2.0944].
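
For reference, that upper bound of 2.0944 rad works out to 120 degrees, which suggests the bounds are expressed in radians on the flexion coordinate (my inference; the exact call in the repo is not shown here):

```python
import math

# The patellofemoral upper bound quoted above, assumed to be in radians.
upper_bound_rad = 2.0944
print(f"{math.degrees(upper_bound_rad):.1f} deg")  # ~120.0 deg
```

In Moco, coordinate bounds like this are typically applied to the coordinate's `/value` state via `MocoProblem.setStateInfo`, though I am assuming that is how it was done here.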

I'm still dealing with the location of the contact geometries. For example, the convergence time of a torque-driven marker + contact tracking simulation was 6, 18, and 24 min for a 0 mm, 5 mm, and 10 mm vertical offset, respectively, and this varied across conditions/data. Also, I found that unlike the location of the contact spheres, their radius did not have much impact (I tested 10 mm to 25 mm).

By any chance, do you have any dataset with a full-body (torso + lower limb) marker set that includes toe markers? The GRF is not tracked well at the end of stance, and I assume this is due to a lack of accurate MTP joint kinematics (graph_grf attached).

Thank you in advance.

aaronsfox commented 3 months ago

@mrrezaie Interesting. I've never messed much with the PF coordinate bounds given it's a constrained joint, but I suppose the optimisation doesn't always have to respect the constraints, so bounding it to reasonable values makes sense.

I think I've developed some useful strategies for dealing with the vertical position of contact spheres. There's no exact way to place these vertically, so I use a consistent generic approach to place them and then implement a technique that seems to help. I often increase the pelvis_ty coordinate value in the initial guess by a small amount (e.g. 5-10 cm) so that the model is effectively floating in the guess. I find this avoids issues with the spheres penetrating the ground too much to begin with (and sometimes going through the ground and causing issues), and it seems the problem can 'ease into' getting the contact tracking right.

I don't think I have another dataset that will help with the toes problem. I have seen a suggestion by Ross Miller that if you want to include coordinates in the simulation that your experimental data might not have the resolution to track (e.g. MTP, subtalar), then you can set the tracking targets to be 0 the whole time (i.e. neutral) but with a low tracking weight. The optimisation will then hopefully only deviate these joint angles slightly, but these deviations might help with more accurate contact tracking.

mrrezaie commented 3 months ago

Thanks @aaronsfox, I will definitely take a multi-segment foot marker set (rearfoot, forefoot, and toes) into account, if I get admitted to Deakin.

Have you ever tried the Ground Contact Personalization toolbox? It doesn't seem to adjust the size and location of the contact spheres, does it? I think we will need to develop a similar calibration framework for this purpose. It could be very simple and kinematics-based only: use markers and run an optimization to minimize the vertical distance between the markers and the ground during midstance, where the foot is fully in contact with the ground; this would be less subjective. Or a more dynamics-based approach; OpenSim Moco could likely handle it.

Regarding what I said on the forum about the GitHub Actions workflow, I really think it's worth a try. Whenever the OpenSim developers adjust the core, the workflow compiles/builds it for all three OSs (Windows, Mac, and Linux) and uploads it within an hour. This is the latest one, for example. Once you have downloaded it, you need to install the Python package shipped with it locally. I always use these nightly builds rather than the official release.

aaronsfox commented 3 months ago

@mrrezaie I think I saw some presentations on the ground contact toolbox you've linked at the TGCS conference last year. It seems like an interesting approach. I haven't invested a lot of time in understanding the NMSM pipeline yet, as it requires the GPOPS solver, which I can't access. They are working to integrate their platform more with OpenSim Moco and those solvers, though, so it might be a useful tool in the future. I'm always a little sceptical about putting in the effort to optimise contact parameters and locations, as while it may yield some improvement, it doesn't always work that much better than a generic approach plus adjusting the kinematics - it's more a problem of figuring out how to set up the problem to adjust the kinematics in a way that gets a good result. Moco has the option to include parameter optimisation of contact spheres in the problem, but I've never had much success when I try it.

mrrezaie commented 3 months ago

Hi @aaronsfox, I was curious about something in your Dynamic Consistency Quest. The coordinates are smooth and tracked well, and the residual actuations are quite low. These metrics suggest the simulation is excellent. Why, then, do you think the reserve actuations (joint moments) are so noisy?

aaronsfox commented 3 months ago

@mrrezaie I'm not entirely sure, but my suspicion is that the mesh interval in the problem causes the torque controls to be a bit noisy. RRA uses a mesh interval of 0.001, I think, and the tracking simulations in that paper were 0.01 from memory. I could probably confirm this by testing a very small mesh interval. I have re-run some trials from that paper and improved the simulation time by a decent amount, and this may have generated smoother controls.

mrrezaie commented 3 months ago

Thanks for your response. Your Moco solution shows that the time interval is almost 0.005, e.g. this, and I think that would be good enough, but the noisy kinetics are weird to me. The output would be much worse if you included the contact tracking goal.

I have a similar problem with my torque-driven marker + contact tracking simulation. I have tried many options, including finer mesh intervals and different goal weights and solver tolerances, but to no avail so far. This is a serious issue and must be resolved before including muscles, since the muscles will have to produce exactly these bumpy joint moments. This is something that has been overlooked by other researchers as well.

In my case, I found that the COPs were not tracked well, and this may propagate error to the proximal joints. But I have no idea how to improve it, even though the GRFs look excellent. One explanation is that a one-segment foot model cannot reproduce the COP measured from a natural foot.

Also, I think there should be an option to minimize the derivatives of the actuator/muscle controls. This would make the output less noisy/bumpy and help convergence. I will suggest it to the OpenSim dev team.

Our final simulation will have several criteria (coordinate tracking, contact tracking, control effort, output, joint reaction, and other goals). Finding the best weight for each goal will be really tricky: improving one output might deteriorate others, and there is no rule in this regard. I see no point in a well-converged simulation with unphysiological outputs. For instance, to achieve dynamic consistency, the coordinates are allowed to deviate, but the new kinematics may not be observable in the individual (comparing the model's markers with the experimental data could confirm this). Or smooth, dynamically consistent kinematics end up producing bumpy joint kinetics.

It's likely not possible to keep all outputs in perfect condition unless I acknowledge the errors and limitations, and that is what I cannot cope with. I believe we are in the early days of such predictive simulations, and I would prefer not to follow others' simulations blindly. The sensitivity analyses will likely take a lot of time.

Let me know your thoughts. Thanks!

aaronsfox commented 3 months ago

@mrrezaie I had some theories about this when working on the paper that this discussion has now reminded me of, and I think they somewhat align with Nick's response to you on the Moco forum. I noticed in that paper's simulations that the residuals remaining in the RRA solution were quite noisy themselves - so my theory is that when that noise disappears and everything else stays similar (i.e. kinematics, external loads), the noise has to go somewhere (i.e. into the joint torques). This seems to align with what Nick is saying on the forum. My other thought was whether or not we should expect joint torques to be incredibly smooth. When we make these calculations from motion capture and GRF data, we do some heavy filtering to get nice and smooth joint torques in the end - so should we actually expect data to be realistically that smooth when we don't do that?

mrrezaie commented 3 months ago

Hi @aaronsfox, what you and Nick mentioned is fair and interesting. My point was that these bumpy joint moments could negatively affect the estimated muscle function. This is hidden in muscle-driven simulations, which is why I'm conducting torque-driven simulations first. If I end up with reasonable outputs, then it's time to include muscles.

OpenSim Moco is not perfect yet. As I said, it lacks an option for minimizing control derivatives (ref), and I believe this could really help with this issue. I just requested this feature on the opensim-core repo, and I really hope they add it.

Similarly, to implement the minimization of state derivatives mentioned in the above ref, I added MocoOutputGoals with a relatively small weight of 1e-7 (still exaggerated) to minimize the squared coordinate accelerations (you may want to see the output here). It really helped smooth the coordinate values and speeds in my marker tracking simulation. Please note that I didn't apply any low-pass filter to the markers, unlike the Moco examples; only the states for the initial guess were filtered at 15 Hz.
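
The effect of that acceleration-minimization term can be sketched with a plain finite-difference penalty. This is only a stand-in for the MocoOutputGoal described above; the 1e-7 weight is the one quoted, everything else is illustrative:

```python
def acceleration_penalty(coord_values, dt, weight=1e-7):
    """Sum of squared second finite differences of a coordinate trajectory:
    a discrete stand-in for integrating squared coordinate accelerations."""
    accels = [
        (coord_values[i + 1] - 2 * coord_values[i] + coord_values[i - 1]) / dt**2
        for i in range(1, len(coord_values) - 1)
    ]
    return weight * sum(a * a for a in accels)

# A noisy trajectory is penalised far more than a smooth one of the same range,
# which is why this goal nudges the solver toward smooth coordinates and speeds.
smooth = [0.01 * t * t for t in range(50)]                     # constant acceleration
noisy = [v + 0.05 * (-1) ** t for t, v in enumerate(smooth)]   # added jitter
print(acceleration_penalty(smooth, dt=0.01) < acceleration_penalty(noisy, dt=0.01))  # True
```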

So, we can improve our results instead of accepting and justifying the errors/limitations.

> My other thought was whether or not we should expect joint torques to be incredibly smooth. When we make these calculations from motion capture and GRF data, we do some heavy filtering to get nice and smooth joint torques in the end - so should we actually expect data to be realistically that smooth when we don't do that?

Well, the input data in the DCQ are already well smoothed, as you said; otherwise, I could make sense of the noisy joint moments. I'm not treating the ID output as a gold standard, and I'm not comparing the results. I'm just looking for a way to fix it.

As you know, the movements of typically developed humans are dynamically consistent and near-optimal in nature. Due to modelling and experimental errors, residual actuations appear in our results. The commonly used approaches, i.e. RRA and the DCQ, minimize these non-physiological actuations at the expense of altering kinematics and kinetics. So the kinematic/marker deviations and the noisy/bumpy joint angles and moments that appear after minimizing the residual actuations do not actually exist and are not physiological. I'm just trying to reduce these errors as much as possible. Along with the modelling issues, there are also a few experimental errors, as I said previously, and I aim to address them all in my PhD, if I get admitted.

Please let me know your thoughts. Thank you!!!

aaronsfox commented 3 months ago

@mrrezaie I had a few thoughts while thinking about this discussion:

1) I think it would be nice to have the minimising-control-derivatives goal. You could even have a go at it yourself by altering the OpenSim core code. I have no formal training in C++ but am learning on the fly, mainly by using existing examples in the source code to build my own tools. Your feature request might be an OK one to work on, as you could model it on the existing accelerations goal. Coding in C++ is not a pleasant experience, but I've realised that by learning it I can extend the capacity of OpenSim with plugins etc. to help my own work.

2) I once got some advice from a seasoned simulation veteran about how trying to get a torque-driven simulation to work and then shifting to a muscle-driven simulation isn't always a great idea. You'll possibly end up with a great torque-driven solution that isn't feasible in a muscle-driven context, as the way these controls/actuators behave is inherently different. Their advice was to work within the framework you plan to finish in, which can be more time-consuming in practice; but if you spend all this time getting a torque-driven solution that doesn't work with muscles, it's probably more time wasted.

3) This point is a discussion I end up having with all research students who do modelling and simulation work: you eventually reach a point where you realise that models aren't perfect, and quite often when you read modelling papers there are likely errors/issues under the surface (e.g. noisy joint torques) that get brushed aside or ignored. You can take two paths in tackling this problem. The first is where you're at now, spending a lot of time battling to get your model outputs perfect; you could get there, but it will be a long slog. The other path is to decide how much error you're willing to accept, and in which aspects of the outputs, and then interpret your results accordingly. It would be nice if everything worked perfectly all the time, but that's not really going to happen 😄