haosulab / ManiSkill

SAPIEN Manipulation Skill Framework, a GPU parallelized robotics simulator and benchmark
https://maniskill.ai/
Apache License 2.0

[Question] Solving Pick-Cube from Pixels Only #667

Open SumeetBatra opened 18 hours ago

SumeetBatra commented 18 hours ago

Hey! I wanted to see if you guys had any reference code / hyperparameters for SAC solving any of the tabletop tasks using RGB(D) data only and no proprioceptive state information. Thanks!

StoneT2000 commented 18 hours ago

Sorry, we have not tuned SAC at the moment, only PPO with some proprioception data plus one RGB camera. There is some example code for state-based SAC; a simple vision-based one will come eventually. TD-MPC2 is already integrated and supports learning from pixels, and it doesn't need much tuning.
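For anyone wanting to try pixels-only training before an official baseline lands, a minimal sketch is to filter the proprioceptive branches out of the dict observation before feeding it to the policy. The key names below ("agent", "extra", "sensor_data") follow ManiSkill's typical dict-obs layout but are assumptions here, not a guaranteed API; adjust them to whatever keys your `obs_mode` actually returns.

```python
# Hypothetical sketch: drop proprioceptive/state entries from a dict
# observation so the policy only ever sees camera outputs.
# Key names are assumptions about the obs layout, not a fixed API.

STATE_KEYS = {"agent", "extra"}  # assumed proprio/state branches


def pixels_only(obs: dict) -> dict:
    """Return a copy of the observation containing only non-state keys."""
    return {k: v for k, v in obs.items() if k not in STATE_KEYS}


# Toy usage with a fake observation shaped like a dict obs_mode output:
obs = {
    "agent": {"qpos": [0.0] * 9},            # proprioception (dropped)
    "extra": {"tcp_pose": [0.0] * 7},        # task state (dropped)
    "sensor_data": {"base_camera": {"rgb": "<HxWx3 image>"}},  # kept
}
filtered = pixels_only(obs)
print(sorted(filtered.keys()))  # → ['sensor_data']
```

In practice you would wrap this in a `gymnasium.ObservationWrapper` (updating `observation_space` to match) so it composes with the rest of the training stack.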

If there's a lot of value in testing algorithms with visual-only inputs, we can try to help set that up in the future. We have some DM Control environments benchmarked with PPO, with an option to use visual-only inputs.

SumeetBatra commented 17 hours ago

I see, thanks for letting me know! I think having some baselines of end-to-end pixel to action policies would be useful. I am currently using SAC for my project but may also try out other algos in the future.

StoneT2000 commented 15 hours ago

Is GPU parallelization important in your case, or are you working more on e.g. sample efficiency? I can have some members on the team look into tuning an RGB/RGBD SAC version.

SumeetBatra commented 15 hours ago

It's not critical, but if it makes policy convergence faster I'm all for GPU parallelization. Sample efficiency is not an issue atm. I appreciate you all looking into this!