ksengin / active-target-localization

Active Localization of Multiple Targets Using Noisy Relative Measurements

Using FIM Reward #1

Open gauravkuppa opened 2 years ago

gauravkuppa commented 2 years ago

How can I use the FIM reward? I get the errors below.

When I run python -m target_localization.train --sess dynamic_target --num_targets 2, I get:

Traceback (most recent call last):
  File "/Users/gauravkuppa/anaconda3/envs/atl_venv/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/gauravkuppa/anaconda3/envs/atl_venv/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/gauravkuppa/LitReview_MAS/active-target-localization/target_localization/train.py", line 145, in <module>
    main(get_args())
  File "/Users/gauravkuppa/LitReview_MAS/active-target-localization/target_localization/train.py", line 78, in main
    next_state, reward, done, reward_info = env.step(action)
  File "/Users/gauravkuppa/LitReview_MAS/active-target-localization/target_localization/envs/tracking_waypoints_env.py", line 116, in step
    self.predictions_flat = self.predictions.reshape(-1)
AttributeError: 'TrackingWaypointsEnvInterface' object has no attribute 'predictions'
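
For what it's worth, this crash looks like self.predictions is only assigned on some code paths before step() flattens it. Here is a minimal sketch of the failure mode and a defensive fix; the class, attribute shapes, and initialization point are hypothetical stand-ins, not the repo's actual code:

import numpy as np

class EnvSketch:
    """Minimal stand-in for TrackingWaypointsEnvInterface (hypothetical)."""

    def __init__(self, num_targets=2):
        # Initializing the attribute up front avoids the AttributeError:
        # step() below reshapes self.predictions unconditionally, so it must
        # exist even on code paths that never produce target estimates.
        self.predictions = np.zeros((num_targets, 2))  # assumed (targets, xy)

    def step(self):
        # The same unconditional reshape that crashes in
        # tracking_waypoints_env.py when the attribute was never assigned.
        self.predictions_flat = self.predictions.reshape(-1)
        return self.predictions_flat

print(EnvSketch().step())  # [0. 0. 0. 0.]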

When I run python -m target_localization.train --sess dynamic_target --num_targets 2 --no_augmented_state, I get:

Traceback (most recent call last):
  File "/Users/gauravkuppa/anaconda3/envs/atl_venv/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/gauravkuppa/anaconda3/envs/atl_venv/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/gauravkuppa/LitReview_MAS/active-target-localization/target_localization/train.py", line 145, in <module>
    main(get_args())
  File "/Users/gauravkuppa/LitReview_MAS/active-target-localization/target_localization/train.py", line 78, in main
    next_state, reward, done, reward_info = env.step(action)
  File "/Users/gauravkuppa/LitReview_MAS/active-target-localization/target_localization/envs/tracking_waypoints_env.py", line 112, in step
    reward, done = self.compute_reward()
  File "/Users/gauravkuppa/LitReview_MAS/active-target-localization/target_localization/envs/tracking_waypoints_env.py", line 179, in compute_reward
    reward = self.info_acc.fisher_determinant(self.sensor_pos)
  File "/Users/gauravkuppa/LitReview_MAS/active-target-localization/target_localization/util/info_accumulator.py", line 24, in fisher_determinant
    fisher_contrib = torch.sin(self.angles - angles)**2 / (self.distances**2 * dists**2)
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 1
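
In case it helps debugging, the shape mismatch is reproducible in isolation. The shapes below are guesses reconstructed from the error message, not the actual tensors in util/info_accumulator.py:

import torch

# Guess: the accumulated history tensors carry 3 entries per row while the
# current-step tensors carry 2 (num_targets), so elementwise broadcasting
# fails at non-singleton dimension 1, exactly as in the traceback above.
acc_angles = torch.rand(5, 3)   # stand-in for self.angles (accumulated)
acc_dists = torch.rand(5, 3)    # stand-in for self.distances
angles = torch.rand(1, 2)       # current-step bearings, one per target
dists = torch.rand(1, 2)        # current-step ranges

try:
    fisher_contrib = torch.sin(acc_angles - angles) ** 2 / (acc_dists ** 2 * dists ** 2)
except RuntimeError as e:
    print(e)  # The size of tensor a (3) must match the size of tensor b (2) ...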

How can I use the FIM reward to train a tracking policy?

ksengin commented 2 years ago

Hi Gaurav, thank you for bringing this to my attention! Please note that the FIM determinant as the reward function was experimental, and I haven't tested it fully. You can use it either by assigning self.info_acc = None in L132 of envs/tracking_waypoints_env.py, or by pulling the latest version and running: python -m target_localization.train --sess test_session --num_targets 2 --reward_type fim --no_augmented_state.
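
For context, here is a rough sketch of what a bearing-only FIM determinant reward computes; the function name, tensor layout, and noise model are illustrative, not the exact code in util/info_accumulator.py:

import torch

def fim_det_reward(sensor_traj, target, noise_std=0.1):
    """Sketch of a bearing-only FIM determinant reward (illustrative only).

    Each past sensor position contributes a rank-1 term g g^T / (sigma^2 d^2)
    with g = [-sin(theta), cos(theta)] the bearing gradient. The reward is the
    determinant of the accumulated 2x2 FIM, which expands to the pairwise sum
    of sin^2(theta_i - theta_j) / (d_i^2 d_j^2) terms seen in the traceback.
    """
    diff = target - sensor_traj                        # (T, 2) offsets
    d2 = (diff ** 2).sum(dim=1)                        # squared ranges, (T,)
    theta = torch.atan2(diff[:, 1], diff[:, 0])        # bearings, (T,)
    g = torch.stack([-torch.sin(theta), torch.cos(theta)], dim=1)  # (T, 2)
    # Sum of outer products, weighted by per-measurement information.
    weights = 1.0 / (noise_std ** 2 * d2)
    fim = (weights[:, None, None] * g.unsqueeze(2) * g.unsqueeze(1)).sum(dim=0)
    return torch.linalg.det(fim)

traj = torch.tensor([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(fim_det_reward(traj, torch.tensor([2.0, 2.0])))

The determinant grows when bearings to the target are well spread and ranges are short, which is why maximizing it encourages informative sensor trajectories.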

I will update the codebase later to handle these errors when using the FIM determinant reward with the information accumulator.