AnandSingh-0619 / home-robot

Mobile manipulation research tools for roboticists
MIT License

YOLO sensor initialization and value update #3

Closed AnandSingh-0619 closed 7 months ago

AnandSingh-0619 commented 7 months ago

https://github.com/AnandSingh-0619/home-robot/blob/79a7742ed4855482bf5cdd6a06429b3d5bea973a/projects/habitat_uncertainity/task/sensors.py#L105C1-L105C57

  1. The YoloPerception class object is created for each environment individually. Earlier I had an issue initializing this class due to an internal error in the Yoloworld class; that error is now resolved.

  2. Another issue I am facing now: the observation space shared between the sensors in an env only had readings from the head_depth and head_panoptic sensors; I have now added head_rgb too. If the YOLO_sensor produces masks, they cannot be shared with the other sensors, because the observation space consists only of simulator -> agents -> main_agent -> sim_sensors. I have made some modifications to the mask-generation code since our last discussion: I am now replicating the behavior of the panoptic sensor and labeling each pixel with its corresponding class. To share information between sensors, I am currently reusing the space of the head_depth sensor and updating it with the mask values.

  3. There is now no error in the code while debugging it for the train run type. However, I get a CUDA out-of-memory error after running the code for some time. I want to run a batch job, but there is a difference between the command given in the OVMM readme and the format in which run.py expects input, especially regarding the skill to be trained:

     ```shell
     python -u -m habitat_baselines.run \
         --exp-config habitat-baselines/habitat_baselines/config/ovmm/rl_skill.yaml \
         --run-type train benchmark/ovmm=<skill_name> \
         habitat_baselines.checkpoint_folder=data/new_checkpoints/ovmm/<skill_name>
     ```

  4. Can you please check my code and help me set up a batch job?
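
The per-pixel labeling described in point 2 can be sketched roughly as follows. This is a minimal illustration, not the actual sensors.py code: the function name `masks_to_class_map` and the argument layout are hypothetical, assuming YOLO returns one boolean mask plus a class ID per detection, and that the output should mimic a panoptic-style (H, W, 1) integer map.

```python
import numpy as np

def masks_to_class_map(masks, class_ids, height, width):
    """Collapse per-instance boolean masks into one per-pixel class-ID map,
    similar in spirit to the panoptic sensor's output.

    masks:     list of (H, W) boolean arrays, one per detection
    class_ids: list of ints aligned with `masks`
    Pixels covered by no detection stay 0 (background).
    """
    class_map = np.zeros((height, width, 1), dtype=np.int32)
    for mask, cls in zip(masks, class_ids):
        # Later detections overwrite earlier ones where masks overlap.
        class_map[mask, 0] = cls
    return class_map
```

Writing the result into an existing sensor's slot (as with the head_depth space mentioned above) then only requires that this array matches that space's shape and dtype.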

yusufali98 commented 7 months ago

As discussed offline, using YOLO_sensor replicated the model for each environment, which was blowing up GPU VRAM usage, so we decided to shift the detector+segmentor code to the policy level, which is shared by all environments. This ensures the detector/segmentor is instantiated only once, which is a more practical setting.
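
The single-instantiation idea can be sketched like this. The class and method names (`SharedDetectorPolicy`, `_load_detector`) are hypothetical stand-ins, not the actual policy code: the point is only that the detector is created once at the policy level and reused by every environment, instead of once per environment.

```python
class SharedDetectorPolicy:
    # Class-level handle: one detector shared across all policy instances,
    # so only a single copy of the model weights lives on the GPU.
    _detector = None

    def __init__(self):
        if SharedDetectorPolicy._detector is None:
            SharedDetectorPolicy._detector = self._load_detector()
        self.detector = SharedDetectorPolicy._detector

    @staticmethod
    def _load_detector():
        # Stand-in for loading the real detector/segmentor weights.
        return {"loaded": True}
```

Constructing the policy any number of times still yields exactly one detector object, which is what avoids the per-environment VRAM blow-up.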

Also, I shared the batch job script offline, which I think works?

You can add in comments if you made any other changes as well. If not, please close the issue