thu-coai / ConvLab-2

ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems
Apache License 2.0

[Feature] Update MultiWOZ dataset from 2.1 to 2.2 #55

Open derekchen14 opened 4 years ago

derekchen14 commented 4 years ago

Given the release of MultiWOZ 2.2, it seems like the baselines should all be retrained using the cleanest version of the dataset. Paper: https://www.aclweb.org/anthology/2020.nlp4convai-1.13/

zqwerty commented 4 years ago

Thanks! We've noticed MultiWOZ 2.2. We will add it if it is of high quality.

chris-boson commented 3 years ago

It would also be great to support the new format (which would also make it easy to add SGD).

zqwerty commented 3 years ago

We are planning to add many datasets (SchemaGuided, Taskmaster, etc.) using a unified format.

tomolopolis commented 3 years ago

Great that you're planning to add SGD and Taskmaster. Any updates on when that will be available?

zqwerty commented 3 years ago

Actually, we have processed SGD, Taskmaster, and other datasets. We will update them along with MultiWOZ 2.2 & 2.3 in a few days. Thanks!

tomolopolis commented 3 years ago

great stuff - looking forward to it!

zqwerty commented 3 years ago

@tomolopolis SGD and Taskmaster are now available in the unified format; see #180.

tomolopolis commented 3 years ago

@zqwerty thanks for that. Are there plans to replicate some of the existing supported model implementations so that they use the unified format? Given the consistent format, the various datasets could then be made configurable in each model.

For example, some new modules might be:

convlab2/nlu/jointBERT/unified/nlu.py
convlab2/dst/comer/unified/dst.py
convlab2/policy/gdpl/unified/policy.py
convlab2/nlg/sclstm/unified/nlg.py
...
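As a rough illustration of the idea of making datasets configurable per model, here is a minimal sketch. Everything in it is an assumption made for illustration: the `LOADERS` registry, the `register` decorator, the dataset keys, and the turn fields (`speaker`, `utterance`) are hypothetical and are not ConvLab-2's actual API or its unified format. The point is only that a model-side function can stay dataset-agnostic if every loader returns the same structure.

```python
from typing import Callable, Dict, List

# Hypothetical unified record: every dataset is converted to a list of
# dialogues, and each dialogue is a list of turns sharing the same keys.
Turn = Dict[str, str]
Dialogue = List[Turn]

# Hypothetical registry mapping a dataset name to its loader function.
LOADERS: Dict[str, Callable[[], List[Dialogue]]] = {}


def register(name: str):
    """Decorator that records a loader under a dataset name."""
    def wrap(fn: Callable[[], List[Dialogue]]):
        LOADERS[name] = fn
        return fn
    return wrap


@register("multiwoz22")
def load_multiwoz22() -> List[Dialogue]:
    # Toy in-memory data standing in for real preprocessing.
    return [[
        {"speaker": "user", "utterance": "I need a hotel in the north."},
        {"speaker": "system", "utterance": "Which price range?"},
    ]]


@register("sgd")
def load_sgd() -> List[Dialogue]:
    # Toy in-memory data standing in for real preprocessing.
    return [[{"speaker": "user", "utterance": "Find me a restaurant."}]]


def load_dataset(name: str) -> List[Dialogue]:
    """Look up a loader by name; raise a clear error for unknown names."""
    if name not in LOADERS:
        raise ValueError(f"unknown dataset: {name!r}")
    return LOADERS[name]()


def user_utterances(name: str) -> List[str]:
    """Dataset-agnostic consumer that relies only on the shared turn keys."""
    return [
        turn["utterance"]
        for dialogue in load_dataset(name)
        for turn in dialogue
        if turn["speaker"] == "user"
    ]
```

With something like this, switching a model from one dataset to another would be a one-line config change, e.g. `user_utterances("sgd")` instead of `user_utterances("multiwoz22")`. Any dataset-specific processing would still have to live inside the individual loaders.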

zqwerty commented 3 years ago

@tomolopolis we will modify the unified data processing and support some of the useful models. However, some models involve a lot of dataset-specific processing that cannot be unified well.

zqwerty commented 3 years ago

@tomolopolis we have added MultiWOZ 2.2 and multiwoz-coref. Check 34960ff in master. However, I deleted the previous commit in order to remove git lfs, because of the limited download bandwidth. I've noticed that you had already merged the previous pull request. I hope that will not bother you too much.

tomolopolis commented 3 years ago

@zqwerty Thanks for adding those. No worries about deleting the previous commit; I can pull in the latest.