This repository is the official implementation of Novel Slot Detection: A Benchmark for Discovering Unknown Slot Types in the Task-Oriented Dialogue System (ACL 2021) by Yanan Wu, Zhiyuan Zeng, Keqing He, Hong Xu, Yuanmeng Yan, Huixing Jiang, Weiran Xu.
The Benchmark for Discovering Unknown Slot Types in the Task-Oriented Dialogue System.
An example of Novel Slot Detection in the task-oriented dialogue system:
The architecture of the proposed model:
We use Anaconda to create the Python environment:
conda create --name <env_name> python=3.6
Install all required libraries:
pip install -r requirements.txt
python --mode train --dataset SnipsNSD5% --threshold 8.0 --output_dir ./output --batch_size 256 --cuda 1
python --mode test --dataset SnipsNSD5% --threshold 8.0 --output_dir ./output --batch_size 256 --cuda 1
python --mode both --dataset SnipsNSD5% --threshold 8.0 --output_dir ./output --batch_size 256 --cuda 1
--mode: optional. Specifies the running mode: train only, test only, or both.
--dataset: required. The dataset to use: SnipsNSD5%, SnipsNSD15%, or SnipsNSD30%.
--threshold: required. The specified threshold value.
--output_dir: default="./output"
--batch_size: default=256
--cuda: default=1
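The parameters above could be wired up with Python's standard `argparse` module roughly as follows. This is an illustrative sketch, not the repository's actual entry point: the function name `build_parser`, the help strings, and the `both` default for `--mode` are assumptions. Only the flag names, the dataset choices, and the stated defaults come from the parameter list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Novel Slot Detection (illustrative CLI sketch)"
    )
    # The README says --mode is optional; the default of "both" is an assumption.
    parser.add_argument("--mode", choices=["train", "test", "both"],
                        default="both", help="running mode")
    parser.add_argument("--dataset", required=True,
                        choices=["SnipsNSD5%", "SnipsNSD15%", "SnipsNSD30%"],
                        help="dataset to use")
    parser.add_argument("--threshold", type=float, required=True,
                        help="threshold value for novel slot detection")
    parser.add_argument("--output_dir", default="./output",
                        help="directory for outputs")
    parser.add_argument("--batch_size", type=int, default=256,
                        help="batch size")
    parser.add_argument("--cuda", type=int, default=1,
                        help="value of the cuda flag")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration.
    args = build_parser().parse_args(
        ["--mode", "train", "--dataset", "SnipsNSD5%", "--threshold", "8.0"]
    )
    print(args)
```

Declaring `--dataset` and `--threshold` with `required=True` makes the parser reject a command line that omits them, which matches their "required" status in the list.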
detection method | objective | distance strategy | 5% IND Span F1 | 5% NSD Span F1 | 5% NSD Token F1 | 15% IND Span F1 | 15% NSD Span F1 | 15% NSD Token F1 | 30% IND Span F1 | 30% NSD Span F1 | 30% NSD Token F1 |
MSP | binary | - | 87.21 | 12.34 | 25.16 | 71.44 | 12.31 | 39.50 | 58.88 | 8.73 | 40.38 |
MSP | multiple | - | 88.05 | 14.04 | 30.50 | 79.71 | 20.97 | 40.02 | 78.52 | 25.26 | 46.91 |
MSP | binary+multiple | - | 89.59 | 23.58 | 37.55 | 83.72 | 24.70 | 45.32 | 79.08 | 30.66 | 52.10 |
GDA | binary | difference | 87.95 | 23.83 | 35.83 | 83.65 | 22.06 | 43.99 | 78.72 | 32.50 | 44.13 |
GDA | binary | minimum | 61.29 | 10.36 | 17.08 | 49.11 | 16.91 | 31.10 | 48.07 | 15.56 | 33.78 |
GDA | multiple | difference | 93.14 | 29.73 | 45.99 | 90.07 | 31.96 | 53.02 | 85.56 | 36.16 | 54.55 |
GDA | multiple | minimum | 93.10 | 31.67* | 46.97* | 90.18 | 32.19 | 53.75* | 86.26* | 38.64* | 55.24* |
detection method | objective | distance strategy | 5% IND Span F1 | 5% NSD Span F1 | 5% NSD Token F1 | 15% IND Span F1 | 15% NSD Span F1 | 15% NSD Token F1 | 30% IND Span F1 | 30% NSD Span F1 | 30% NSD Token F1 |
MSP | binary | - | 92.04 | 19.73 | 29.63 | 91.74 | 23.40 | 33.89 | 80.49 | 21.88 | 39.17 |
MSP | multiple | - | 94.33 | 27.15 | 31.16 | 92.54 | 39.88 | 42.29 | 87.63 | 40.42 | 47.64 |
MSP | binary+multiple | - | 94.41 | 32.49 | 43.48 | 93.29 | 41.23 | 43.13 | 90.14 | 41.76 | 51.87 |
GDA | binary | difference | 93.69 | 27.02 | 34.21 | 92.13 | 30.51 | 36.30 | 88.73 | 30.91 | 45.64 |
GDA | binary | minimum | 93.57 | 15.90 | 20.96 | 90.98 | 24.53 | 27.26 | 88.21 | 26.40 | 39.83 |
GDA | multiple | difference | 95.20 | 47.78* | 51.54* | 93.92 | 50.92* | 52.24* | 92.02 | 51.26* | 56.59* |
GDA | multiple | minimum | 95.31* | 41.74 | 45.91 | 93.88 | 43.78 | 46.18 | 91.67 | 45.44 | 52.37 |
@article{Wu2021NovelSD,
title={Novel Slot Detection: A Benchmark for Discovering Unknown Slot Types in the Task-Oriented Dialogue System},
author={Yanan Wu and Zhiyuan Zeng and Keqing He and Hong Xu and Yuanmeng Yan and Huixing Jiang and Weiran Xu},
journal={ArXiv},
year={2021},
volume={abs/2105.14313}
}
Q: Section 4.1 mentions two training objectives: the multiple classifier and the binary classifier. If we use the binary classifier, how do we get the in-domain (IND) category? And how are the results for MSP + binary and GDA + binary obtained?
A: As we mention in Section 4.1: "In the test stage, for in-domain prediction, we both use the multiple classifier. While, for novel slot detection, we use the multiple classifier or the binary classifier, or both of them". This means the binary classifier is not used to obtain the fine-grained in-domain labels. It is used only to detect whether a token belongs to a novel slot; if it does, we override the fine-grained in-domain label predicted by the multiple classifier.
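The override step described in this answer can be sketched as below. This is a minimal illustration under stated assumptions, not the repository's code: the function name `merge_predictions`, the novel-slot tag `"NS"`, and the example slot labels are hypothetical, and the per-token novelty flags are assumed to come from thresholding the binary classifier's output elsewhere.

```python
from typing import List

NOVEL_LABEL = "NS"  # assumed tag for tokens detected as a novel slot


def merge_predictions(ind_labels: List[str],
                      is_novel: List[bool],
                      novel_label: str = NOVEL_LABEL) -> List[str]:
    """Override in-domain labels with the novel-slot tag where flagged.

    ind_labels: per-token labels predicted by the multiple classifier.
    is_novel:   per-token flags from the binary classifier (True = novel).
    """
    if len(ind_labels) != len(is_novel):
        raise ValueError("ind_labels and is_novel must have the same length")
    # Keep the multiple classifier's label unless the token is flagged novel.
    return [novel_label if flag else label
            for label, flag in zip(ind_labels, is_novel)]


if __name__ == "__main__":
    labels = ["O", "B-playlist", "I-playlist", "O", "B-artist"]
    flags = [False, True, True, False, False]
    print(merge_predictions(labels, flags))
```

Tokens that are not flagged keep the label from the multiple classifier, so in-domain predictions always come from that classifier regardless of which detector produced the flags.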