For ICM, the trajectory buffer will need to store the next observation in addition to the current one. This complicates things a bit, since we will now have four different versions of an observation:
The current full observation
The region of interest of the current observation
The next full observation
The region of interest of the next observation
This makes me revisit the idea that the region-of-interest computation needs to live outside the agent. Maybe we can keep it inside somehow?
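One way this could work, as a rough sketch rather than a settled design: the buffer stores only the two full observations per transition, and the agent hands it a region-of-interest function so the two cropped versions are derived on demand. All names here (`Transition`, `TrajectoryBuffer`, `crop_roi`, `roi_fn`) are hypothetical, and the fixed rectangular crop is an assumption, since nothing above says how the region of interest is actually computed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: an observation is a 2D grid of values (e.g. a grayscale frame).
Observation = List[List[float]]


def crop_roi(obs: Observation, top: int, left: int, height: int, width: int) -> Observation:
    """Hypothetical ROI function: cut a fixed rectangle out of the full observation."""
    return [row[left:left + width] for row in obs[top:top + height]]


@dataclass
class Transition:
    obs: Observation       # current full observation
    action: int
    next_obs: Observation  # next full observation, needed by ICM


class TrajectoryBuffer:
    """Stores only full observations; ROI versions are computed when read."""

    def __init__(self) -> None:
        self._transitions: List[Transition] = []

    def add(self, obs: Observation, action: int, next_obs: Observation) -> None:
        self._transitions.append(Transition(obs, action, next_obs))

    def __len__(self) -> int:
        return len(self._transitions)

    def views(self, index: int, roi_fn: Callable[[Observation], Observation]) -> Dict[str, Observation]:
        """Return all four versions of the observation for one transition."""
        t = self._transitions[index]
        return {
            "obs": t.obs,
            "roi": roi_fn(t.obs),
            "next_obs": t.next_obs,
            "next_roi": roi_fn(t.next_obs),
        }


# Usage: the agent owns the ROI function and passes it in when sampling.
buffer = TrajectoryBuffer()
obs = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
next_obs = [[10, 11, 12], [13, 14, 15], [16, 17, 18]]
buffer.add(obs, action=1, next_obs=next_obs)
views = buffer.views(0, lambda o: crop_roi(o, top=1, left=1, height=2, width=2))
```

The tradeoff is that the crop is recomputed on every read instead of being stored once, but the buffer only has to know about two observations per step, and the region-of-interest logic stays with the agent.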