Closed by leoozy 3 months ago
Thanks @leoozy. The previous actions can be recovered from action_reprs, which holds the full set of actions for a given task. MMind2web was originally designed to make step-wise action prediction easier, so we did not include the order of the actions.
We have updated Multimodal-Mind2Web on Hugging Face: https://huggingface.co/datasets/osunlp/Multimodal-Mind2Web. Each row now comes with target_action_index, the index of the target action, and target_action_reprs, the corresponding action representation string.
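For anyone reading later, here is a minimal sketch of how the previous actions might be recovered from these fields. It assumes that target_action_index is a position within action_reprs and that every entry before that position is a previous action. The helper name previous_actions and the example row are made up for illustration and are not taken from the dataset.

```python
def previous_actions(row):
    """Return the action strings that precede the target action in one row.

    Assumption: target_action_index points into action_reprs, and the
    entries before it are the actions already taken in the task.
    """
    # Cast defensively in case the index is stored as a string.
    idx = int(row["target_action_index"])
    return list(row["action_reprs"][:idx])


# Hypothetical row, invented for illustration only.
example_row = {
    "action_reprs": [
        "[textbox] Search -> TYPE: shoes",
        "[button] Search -> CLICK",
        "[link] Sneakers -> CLICK",
    ],
    "target_action_index": "1",
    "target_action_reprs": "[button] Search -> CLICK",
}

print(previous_actions(example_row))
```

If the assumption holds, the first step of a task (index 0) simply yields an empty history.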
Thank you very much for your work. I have found a potential bug in your MM-Mind2web model. It seems that each data point only contains a list of selectable actions without any previous actions. This could lead to issues during evaluation.