jstmn / ikflow

Open source implementation of the paper "IKFlow: Generating Diverse Inverse Kinematics Solutions"
https://sites.google.com/view/ikflow/home

Encountered NaN problem while training a new robot #13

Closed. ZhengmaoHe closed this issue 3 months ago

ZhengmaoHe commented 3 months ago

Hi, thank you for your excellent work!

I am trying to use ikflow in my project, and my robot is different from a regular robotic arm. It has three additional floating joints, so it has a total of 9 degrees of freedom.

The loss often becomes NaN between the 2nd and 5th epochs. I saw that the code already warns about a NaN loss in two places. Do you have any suggestions?

ZhengmaoHe commented 3 months ago

[image: the loss curve]

jstmn commented 3 months ago

Hi!

I'd first change nb_nodes to 12. This gives the model greater capacity, which improves performance and can also make training more stable (though training will take longer, fyi).

Next, try reducing the learning rate. You can try 3.75*1e-4, 2.5*1e-4, 1.25*1e-4, or 1e-5.

Let me know how that goes!
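The learning-rate advice above can be tried systematically. The following is a minimal sketch, not part of ikflow: `train_fn` is a hypothetical callable that trains for a few epochs at a given learning rate and returns the per-epoch losses, and `fake_train` is a toy stand-in used only to make the example runnable. It tries the suggested learning rates from largest to smallest and returns the first one whose losses stay finite.

```python
import math
from typing import Callable, Optional, Sequence

# Candidate learning rates suggested in the thread.
CANDIDATE_LRS = [3.75e-4, 2.5e-4, 1.25e-4, 1e-5]


def first_nan_epoch(losses: Sequence[float]) -> Optional[int]:
    """Return the index of the first non-finite loss (NaN or inf), or None."""
    for epoch, loss in enumerate(losses):
        if not math.isfinite(loss):
            return epoch
    return None


def largest_stable_lr(
    train_fn: Callable[[float], Sequence[float]],
    lrs: Sequence[float] = CANDIDATE_LRS,
) -> Optional[float]:
    """Try learning rates from largest to smallest.

    Returns the first learning rate whose run produced only finite losses,
    or None if every candidate diverged.
    """
    for lr in sorted(lrs, reverse=True):
        if first_nan_epoch(train_fn(lr)) is None:
            return lr
    return None


def fake_train(lr: float) -> Sequence[float]:
    """Toy stand-in for a real training run (hypothetical, for illustration).

    Pretends that training diverges to NaN whenever lr exceeds 2e-4.
    """
    if lr > 2e-4:
        return [1.0, 0.5, float("nan")]
    return [1.0, 0.5, 0.25]


if __name__ == "__main__":
    print(largest_stable_lr(fake_train))
```

In practice `train_fn` would wrap a short real training run, so that a diverging setting is detected after a handful of epochs rather than after a full run.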

ZhengmaoHe commented 3 months ago

Thank you, Jeremy! My training results are very good now; you have been a big help to me! By the way, your code comments are very cute and make my coding less tedious :)

jstmn commented 3 months ago

Glad to hear it! What are you using IKFlow for, if you don't mind me asking?

ZhengmaoHe commented 3 months ago

I use it for a loco-manipulation project. Once the project is further along, I will share more details with you!

jstmn commented 3 months ago

Nice! Sounds exciting.

If I could ask one more follow-up: to do the training, did you create a new Robot subclass in the jrl package? Or did you change the code so that it just uses a presaved dataset of configuration/EE pose pairs?

If it's the former (a new Robot subclass), would you consider adding it to the jrl package?

ZhengmaoHe commented 3 months ago

I am very sorry for my delayed reply. I previously implemented a class based on pytorch_kinematics for my project, which can calculate the forward and inverse kinematics of floating-base robots. It is simply a transformation applied to the solution of a fixed-base robot. Then, to adapt it to ikflow, I also implemented methods such as save_dataset_to_disk and solution_pose_errors, and used my custom class to replace the Robot in the code.
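To illustrate what "a transformation applied to the solution of a fixed-base robot" can look like, here is a minimal sketch in plain Python. It is not the poster's code and does not use the pytorch_kinematics or jrl APIs. The thread does not say what the three floating joints are; the sketch assumes a planar base (x, y, yaw), and all function names are hypothetical. Forward kinematics composes the base pose with the fixed-base end-effector pose, and for inverse kinematics a world-frame target is first mapped into the base frame.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]  # 4x4 homogeneous transform, row-major


def planar_base_transform(x: float, y: float, yaw: float) -> Matrix:
    """World-from-base transform for an assumed planar floating base."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, -s, 0.0, x],
        [s, c, 0.0, y],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t: Matrix) -> Matrix:
    """Invert a rigid transform: rotation R^T, translation -R^T p."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    p_inv = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [p_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def floating_base_fk(base: Sequence[float], ee_in_base: Matrix) -> Matrix:
    """World-frame EE pose = base pose composed with the fixed-base FK result."""
    return matmul4(planar_base_transform(*base), ee_in_base)


def target_in_base_frame(base: Sequence[float], ee_in_world: Matrix) -> Matrix:
    """Express a world-frame target in the base frame, ready for a fixed-base IK solver."""
    return matmul4(invert_rigid(planar_base_transform(*base)), ee_in_world)
```

Under these assumptions, a fixed-base solver only ever sees base-frame targets, and the floating joints enter through the base transform alone.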

So I'm sorry, I didn't make any valuable contribution to the jrl package. It is a great tool that implements more features than pytorch_kinematics. Thank you for your efforts!

jstmn commented 3 months ago

Got it, makes sense. Thanks, this helps me understand how others are using the code.