Bug in mm_dst baseline

facebookresearch / simmc

With the aim of building next-generation virtual assistants that can handle multimodal inputs and perform multimodal actions, we introduce two new datasets (both in the virtual shopping domain), the annotation schema, the core technical tasks, and the baseline models. The code for the baselines and the datasets will be open-sourced.

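https://github.com/facebookresearch/simmc/blob/36e53ddd7256304c77e073c7e2c3503ec2d3e86e/mm_dst/gpt2_dst/utils/convert.py#L225 In the character class on this line, .-: is parsed as a range of characters (from . to :), so it does not include a literal -. As a result, a slot name containing a hyphen is truncated to the part after the last hyphen.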

>>> import re
>>> slot_regex = re.compile(r'([A-Za-z0-9_.-:]*)  *= ([^,]*)')
>>> slot_regex.findall('furniture-O = OBJECT_0')
[('O', 'OBJECT_0')]
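
To see why the hyphen is dropped, it can help to list what the class [.-:] actually matches on its own. This is a minimal sketch; the names bad_class and covered are illustrative and do not come from the repository.

```python
import re

# Only the ambiguous part of the original character class.
bad_class = re.compile(r'[.-:]')

# Collect every ASCII character that the class accepts.
covered = ''.join(chr(c) for c in range(128) if bad_class.fullmatch(chr(c)))

print(covered)         # ./0123456789:
print('-' in covered)  # False
```

The range runs from . (0x2E) to : (0x3A), which covers the slash and the digits but stops just short of including - (0x2D).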

Here is a fix: move the hyphen to the end of the character class, where it is treated as a literal character.

>>> slot_regex = re.compile(r'([A-Za-z0-9_.:-]*)  *= ([^,]*)')
>>> slot_regex.findall('furniture-O = OBJECT_0')
[('furniture-O', 'OBJECT_0')]

This went unnoticed while both the target and the prediction were parsed with this function, since both were truncated in the same way. It will cause problems, however, if someone uses this function to format model output as JSON and then evaluates it with the new evaluation script.
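
The mismatch can be sketched as follows. The two patterns are copied from this issue; the parse_slots helper, the sample prediction string, and the shape of gold_slots are assumptions made for illustration and are not the repository's actual code or data format.

```python
import re

# Patterns copied from the issue text.
BUGGY = re.compile(r'([A-Za-z0-9_.-:]*)  *= ([^,]*)')
FIXED = re.compile(r'([A-Za-z0-9_.:-]*)  *= ([^,]*)')


def parse_slots(pattern, text):
    """Hypothetical helper: return {slot_name: value} for each 'name = value' pair."""
    return {name: value.strip() for name, value in pattern.findall(text)}


predicted_text = 'furniture-O = OBJECT_0'
# Assumed shape of a target that was read from JSON rather than parsed by the regex.
gold_slots = {'furniture-O': 'OBJECT_0'}

buggy_match = parse_slots(BUGGY, predicted_text) == gold_slots
fixed_match = parse_slots(FIXED, predicted_text) == gold_slots

print(buggy_match, fixed_match)  # False True
```

With the original pattern the predicted slot name becomes O, so it can never equal a target slot name that was not passed through the same parser.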

Bug in mm_dst baseline #39