Completed the `OpenXModule` class for zero-shot evaluation.
Each batch contains multiple episodes; the `_process_batch` function is a generator that produces batch inputs consisting of one timestep from each episode.
Since the number of timesteps per episode can vary, successive batches from `_process_batch` can have different sizes once some episodes have finished.
It keeps generating batches until every episode reaches its end.
The processed inputs are passed to `VLMModule`.
The reported metrics are success rate and MSE.
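The batching behavior described above can be sketched as a standalone generator. This is a minimal illustration, not the actual `_process_batch` implementation; the episode/timestep representation is assumed:

```python
from typing import Dict, Iterator, List


def process_batch(episodes: List[List[Dict]]) -> Iterator[List[Dict]]:
    """Yield one batch per timestep index: the i-th step of every
    episode that still has steps remaining. Batches shrink as shorter
    episodes finish, so their sizes can differ."""
    step = 0
    while True:
        batch = [ep[step] for ep in episodes if step < len(ep)]
        if not batch:  # every episode has reached its end
            return
        yield batch
        step += 1
```

For example, with episodes of length 3 and 1, the generator yields batches of size 2, 1, 1.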
Refactored the OpenX dataloader to remove the JAT-specific parts.
TODO: This should be integrated into a single utility class in the future.
_TODO: `text_observation` and multiple vectors should be handled later._
GPT-4 sometimes generates malformed output due to the limitations of VLMs. When this happens, the action is set to a random vector.
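The random-vector fallback could look like the following sketch. The function name, the expected text format of the action, and the use of a uniform range are all assumptions for illustration:

```python
import numpy as np


def parse_action(vlm_output: str, action_dim: int,
                 rng: np.random.Generator) -> np.ndarray:
    """Parse a comma-separated action vector from the VLM's text output.
    Falls back to a random vector when the output is malformed."""
    try:
        values = [float(x) for x in vlm_output.strip().strip("[]").split(",")]
        if len(values) != action_dim:
            raise ValueError("wrong action dimensionality")
        return np.asarray(values, dtype=np.float32)
    except ValueError:
        # Malformed generation: substitute a random action so the
        # evaluation loop can continue instead of crashing.
        return rng.uniform(-1.0, 1.0, size=action_dim).astype(np.float32)
```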
@pranavguru to fix some parts of `src/data_utils/openx_dataloader.py` to reflect the latest version of the OpenX dataloader.
The dataloader currently concatenates all float-tensor continuous observations in the observation dict of each timestep of a given OpenX dataset. Either this concatenation should be removed, or the prompt-engineering framework needs to be modified to include information about it.
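The concatenation in question behaves roughly like this sketch (the helper name and the sorted-key ordering are assumptions; the real dataloader may differ in detail):

```python
import numpy as np


def concat_continuous_obs(obs: dict) -> np.ndarray:
    """Flatten and concatenate every floating-point array in a
    timestep's observation dict, skipping non-float entries such as
    images stored as uint8. Keys are sorted so the layout is
    deterministic, but that layout is lost downstream: the prompt
    currently has no way to describe which slice came from which key."""
    parts = []
    for key in sorted(obs):
        value = np.asarray(obs[key])
        if np.issubdtype(value.dtype, np.floating):
            parts.append(value.ravel())
    return np.concatenate(parts) if parts else np.empty(0, dtype=np.float32)
```

The loss of per-key structure is exactly why the concatenation either needs to go or needs to be described to the model in the prompt.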
Text observation data needs to be integrated into the prompt passed to the model.
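One possible shape for that integration is sketched below. This is hypothetical (the TODO is not yet implemented); the function name, prompt wording, and fields are illustrative only:

```python
from typing import Optional


def build_prompt(task_instruction: str,
                 text_observation: Optional[str] = None) -> str:
    """Assemble the VLM prompt, prepending the episode's text
    observation when one is present."""
    lines = [f"Task: {task_instruction}"]
    if text_observation:
        lines.append(f"Scene description: {text_observation}")
    lines.append("Predict the next action as a vector of floats.")
    return "\n".join(lines)
```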