This repository contains a curated list of papers, code, datasets, and leaderboards in the SLU field. If you find any errors, please don't hesitate to open an issue or a pull request.
If you find this repository helpful for your work, please kindly cite the following paper. The BibTeX entry is listed below:
@misc{qin2021survey,
  title={A Survey on Spoken Language Understanding: Recent Advances and New Frontiers},
  author={Libo Qin and Tianbao Xie and Wanxiang Che and Ting Liu},
  year={2021},
  eprint={2103.03095},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
Tutorial presented by Libo Qin, Wanxiang Che, Zhou Yu.
Resource contributed by Libo Qin, Tianbao Xie, Yudi Zhang, Lehan Wang, Wanxiang Che, Zhou Yu.
Spoken language understanding (SLU) is a critical component in task-oriented dialogue systems. It typically consists of two tasks, intent detection and slot filling, which extract semantic constituents from natural language utterances.
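As a concrete toy illustration of the two sub-tasks (the utterance, intent name, and slot names below are made up for the example, loosely following ATIS conventions):

```python
# Toy illustration (not tied to any particular dataset): intent detection
# assigns one label per utterance, slot filling one BIO tag per token.
tokens = ["show", "flights", "from", "boston", "to", "new", "york"]
intent = "atis_flight"                                          # utterance-level label
tags = ["O", "O", "O", "B-fromloc", "O", "B-toloc", "I-toloc"]  # token-level tags

def extract_slots(tokens, tags):
    """Collect (slot_name, value) pairs from a BIO tag sequence."""
    spans, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):          # a new slot chunk begins
            current = (tag[2:], [tok])
            spans.append(current)
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(tok)        # continue the current chunk
        else:
            current = None                # outside any slot
    return [(name, " ".join(words)) for name, words in spans]

# extract_slots(tokens, tags) → [('fromloc', 'boston'), ('toloc', 'new york')]
```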
To ease the burden of collecting articles and datasets, we have organized the relevant SLU datasets, papers, code, and leaderboards in this project.
The project is fully open source and includes:
The taxonomy and frontiers of our survey are summarized in the picture below.
arXiv
[pdf] book
[pdf] COLING 2020
[pdf] arxiv 2021
[pdf] ACL 2020
[pdf] [code] COLING 2018
[pdf] [code] ICASSP 2021
[pdf] [code] EMNLP 2020
[pdf] [code] ACL 2019
[pdf] [code] arXiv 2019
[pdf] [code] ACL 2019
[pdf] [code] EMNLP 2019
[pdf] [code] NAACL 2018
[pdf] [code] SIGDIAL 2016
[pdf] [code] NAACL 2018
[pdf] [code] IJCNLP 2017
[pdf] [code] IEEE 2017
[pdf] [code] arXiv 2020
[pdf] [code] EMNLP 2019
[pdf] [code] ACL 2020
[pdf] [code] IJCAI 2020
[pdf] [code] EMNLP 2020
[pdf] [code] AAAI 2020
[pdf] [code] arXiv 2020
[pdf] [code] ACL 2017
[pdf] [code] AAAI 2021
[pdf] [code] ACL 2020
[pdf] [code] AAAI 2022
[pdf] [code]

Name | Intro | Links | Multi/Single Turn (M/S) | Detail | Size & Stats | Label |
---|---|---|---|---|---|---|
ATIS | 1. The ATIS (Airline Travel Information Systems) dataset (Tur et al., 2010) is widely used in SLU research 2. For natural language understanding | Download: 1. https://github.com/yizhen20133868/StackPropagation-SLU/tree/master/data/atis 2. https://github.com/yvchen/JointSLU/tree/master/data Paper: https://www.aclweb.org/anthology/H90-1021.pdf | S | Airline travel information. Note that this dataset has been shown to have a serious skew problem on intents | Train: 4,478 Test: 893; 120 slot labels and 21 intents | Intent, Slots |
SNIPS | 1. Collected by Snips for model evaluation 2. For natural language understanding 3. Homepage: https://medium.com/snips-ai/benchmarking-natural-language-understanding-systems-google-facebook-microsoft-and-snips-2b8ddcf9fb19 | Download: https://github.com/snipsco/nlu-benchmark/tree/master/2017-06-custom-intent-engines Paper: https://arxiv.org/pdf/1805.10190.pdf | S | 7 tasks: weather, play music, search, add to list, book, movie | Train: 13,084 Test: 700; 7 intents, 72 slot labels | Intent, Slots |
Facebook Multilingual SLU Dataset | 1. Contains English, Spanish, and Thai across the weather, reminder, and alarm domains 2. For cross-lingual SLU | Download: https://fb.me/multilingual_task_oriented_data Paper: https://www.aclweb.org/anthology/N19-1380.pdf | S | Utterances are manually translated and annotated | Train: English 30,521; Spanish 3,617; Thai 2,156 Dev: English 4,181; Spanish 1,983; Thai 1,235 Test: English 8,621; Spanish 3,043; Thai 1,692; 11 slots and 12 intents | Intent, Slots |
MIT Restaurant Corpus | The MIT corpus contains train and test sets in BIO format for NLU | Download: https://groups.csail.mit.edu/sls/downloads/restaurant/ | S | A single-domain dataset associated with restaurant reservations. MR contains "open-vocabulary" slots, such as restaurant names | Train: 7,760 Test: 1,521 | Slots |
MIT Movie Corpus | The MIT Movie Corpus is a semantically tagged training and test corpus in BIO format. The eng corpus contains simple queries, and the trivia10k13 corpus more complex ones | Download: https://groups.csail.mit.edu/sls/downloads/movie/ | S | Two single-domain datasets: the movie eng (ME) and movie trivia (MT) datasets. Both contain queries about film information, but the trivia queries are more complex and specific | eng Corpus: Train: 9,775 Test: 2,443 Trivia Corpus: Train: 7,816 Test: 1,953 | Slots |
Multilingual ATIS | ATIS manually translated into Hindi and Turkish | Download: available from the LDC; you can download it if you own a membership or pay for it Paper: http://shyamupa.com/papers/UFTHH18.pdf | S | 3 languages | On top of the ATIS dataset, 893 and 715 utterances from the ATIS test split were translated and annotated for Hindi and Turkish evaluation, respectively; 600 utterances per language from the ATIS train split were also translated and annotated for supervision. In total: 37,084 training and 7,859 test examples | Intent, Slots |
Multilingual ATIS++ | Extends the Multilingual ATIS corpus to nine languages across four language families | Download: contact multiatis@amazon.com Paper: https://arxiv.org/abs/2004.14353 | S | 10 languages | See the paper for the full description table | Intent, Slots |
Almawave-SLU | 1. A dataset for Italian SLU 2. Generated through a semi-automatic procedure from SNIPS | Download: contact [first name initial\].[last name]@almawave.it (any author of the paper) Paper: https://arxiv.org/pdf/1907.07526.pdf | S | 6 domains: Music, Restaurants, TV, Movies, Books, Weather | Train: 7,142 Validation: 700 Test: 700; 7 intents and 39 slots | Intent, Slots |
Chatbot Corpus | 1. Based on questions gathered by a Telegram chatbot that answers questions about public transport connections, consisting of 206 questions 2. For intent classification evaluation | Download: https://github.com/sebischair/NLU-Evaluation-Corpora Paper: https://www.aclweb.org/anthology/W17-5522.pdf | S | 2 intents: Departure Time, Find Connection; 5 entity types: StationStart, StationDest, Criterion, Vehicle, Line | Train: 100 Test: 106 | Intent, Entity |
StackExchange Corpus | 1. Based on data from two StackExchange platforms: Ask Ubuntu and Web Applications 2. Gathers 290 questions and answers in total: 100 from Web Applications and 190 from Ask Ubuntu 3. For intent classification evaluation | Download: https://github.com/sebischair/NLU-Evaluation-Corpora Paper: https://www.aclweb.org/anthology/W17-5522.pdf | S | Ask Ubuntu intents: "Make Update", "Setup Printer", "Shutdown Computer", and "Software Recommendation"; Web Applications intents: "Change Password", "Delete Account", "Download Video", "Export Data", "Filter Spam", "Find Alternative", and "Sync Accounts" | Total: 290 Ask Ubuntu: 190 Web Applications: 100 | Intent, Entity |
MixSNIPS/MixATIS | Multi-intent datasets based on SNIPS and ATIS | Download: https://github.com/LooperXX/AGIF/tree/master/data Paper: https://www.aclweb.org/anthology/2020.findings-emnlp.163.pdf | S | Built by connecting sentences with different intents using conjunctions, with ratios of 0.3, 0.5, and 0.2 for sentences with 1, 2, and 3 intents, respectively | Train: 12,759 utterances Dev: 4,812 utterances Test: 7,848 utterances | Intent (multi), Slots |
TOP semantic parsing | 1. Hierarchical annotation scheme for semantic parsing 2. Allows the representation of compositional queries 3. Can be efficiently and accurately parsed by standard constituency parsing models | Download: http://fb.me/semanticparsingdialog Paper: https://www.aclweb.org/anthology/D18-1300.pdf | S | Focused on navigation, events, and navigation to events; the evaluation script can be run from evaluate.py within the dataset | 44,783 annotations Train: 31,279 Dev: 4,462 Test: 9,042 | Intent, Slots in tree format |
MTOP: Multilingual TOP | 1. An almost-parallel multilingual task-oriented semantic parsing dataset covering 6 languages and 11 domains 2. The first multilingual dataset that contains compositional representations allowing complex nested queries 3. Dataset creation: i) generating synthetic utterances and annotating in English; ii) translation, label transfer, post-processing, post-editing, and filtering for other languages | Download: https://fb.me/mtop_dataset Paper: https://arxiv.org/pdf/2008.09335.pdf | S | 6 languages (both high and low resource): English, Spanish, French, German, Hindi, and Thai; a mix of simple and compositional nested queries across 11 domains, 117 intents, and 78 slots | 100k examples in total for 6 languages, roughly divided into 70:10:20 percent splits for train, eval, and test | Two kinds of representations: 1. flat representation: intent and slots 2. compositional decoupled representation: nested intents inside slots; more details in Section 3.2 of the paper |
CAIS | Collected from real-world speaker systems with manual annotations of slot tags and intent labels | [https://github.com/Adaxry/CM-Net](https://github.com/Adaxry/CM-Net/tree/master/CAIS) | S | 1. The utterances were collected from Chinese artificial intelligence speakers 2. Adopts the BIOES tagging scheme for slots instead of the BIO2 used in ATIS 3. Intent labels are skewed toward the PlayMusic option | Train: 7,995 utterances Dev: 994 utterances Test: 1,024 utterances | Slot tags and intent labels |
Simulated Dialogues dataset | Machine-to-machine (M2M) generated dialogues | Download: https://github.com/google-research-datasets/simulated-dialogue Paper: http://www.colips.org/workshop/dstc4/papers/60.pdf | M | Slots: Sim-R (Restaurant): price_range, location, restaurant_name, category, num_people, date, time; Sim-M (Movie): theatre_name, movie, date, time, num_people; Sim-GEN (Movie): theatre_name, movie, date, time, num_people | Train: Sim-R: 1,116 Sim-M: 384 Sim-GEN: 100k Dev: Sim-R: 349 Sim-M: 120 Sim-GEN: 10k Test: Sim-R: 775 Sim-M: 264 Sim-GEN: 10k | Dialogue state; user's act, slot, intent; system's act, slot |
Schema-Guided Dialogue Dataset (SGD) | Dialogue simulation (automatic, based on identified scenarios), with word replacement and human paraphrasing | Download: https://github.com/google-research-datasets/dstc8-schema-guided-dialogue Paper: https://arxiv.org/pdf/1909.05855.pdf | M | Domains: 16; dialogues: 16,142; turns: 329,964; avg turns per dialogue: 20.44; total unique tokens: 30,352; slots: 214; slot values: 14,319 | NA | Schema representation: service_name; description; slot's name, description, is_categorical, possible_values; intent's name, description, is_transactional, required_slots, optional_slots, result_slots. Dialogue representation: dialogue_id, services, turns, speaker, utterance, frame, service, slot's name, start, exclusive_end; action's act, slot, values, canonical_values; service_call's method, parameters; service_results, state's active_intent, requested_slots, slot_values |
CLINC150 | An intent classification (text classification) dataset with 150 in-domain intent classes. Its main purpose is to evaluate classifiers on out-of-domain performance | Download: https://archive.ics.uci.edu/ml/datasets/CLINC150 Paper: https://www.aclweb.org/anthology/D19-1131/ | S | data_full.json: 150 in-domain intent classes with 100 train, 20 val, and 30 test samples each, plus 100 train, 100 val, and 1,000 test out-of-domain samples; data_small.json: 50 train, 20 val, and 30 test in-domain samples, with 100 train, 100 val, and 1,000 test out-of-domain samples; data_imbalanced.json: in-domain classes have 25, 50, 75, or 100 train, 20 val, and 30 test samples, while the out-of-domain class has 100 train, 100 val, and 1,000 test samples; data_oos_plus.json: same as data_full.json except there are 250 out-of-domain training samples | Size: 23,700; 150 intents | Intent (in-domain, out-of-domain) |
HWU64 | - | Download: https://github.com/xliuhw/NLU-Evaluation-Data Paper: https://arxiv.org/pdf/1903.05566.pdf | S | 21 domains, inter alia music, news, calendar | Size: 25,716; 64 intents; 54 slots | Intent detection; Entity extraction |
Banking-77 | BANKING77 provides a very fine-grained set of intents in a banking domain. It comprises 13,083 customer service queries labeled with 77 intents and focuses on fine-grained single-domain intent detection | Download: https://github.com/PolyAI-LDN/polyai-models Paper: https://arxiv.org/pdf/2003.04807.pdf | S | Banking | Size: 13,083; 77 intents | Intent detection |
Restaurants-8K | A new, challenging dataset of 8,198 utterances compiled from actual conversations in the restaurant booking domain | Download: https://github.com/PolyAI-LDN/task-specific-datasets Paper: https://arxiv.org/pdf/2005.08866.pdf | S | Restaurant booking | Size: 11,929; 5 slots | Slot filling |
ATIS in Chinese and Indonesian | ATIS semantic dataset annotated in two new languages | Download: http://statnlp.org/research/sp/ Paper: https://www.aclweb.org/anthology/P17-2007.pdf | S | Airline travel | Size: 5,371; 120 slots (166 with lambda calculus) | Semantic parsing; Slot filling |
Vietnamese ATIS | - | Download: https://github.com/VinAIResearch/JointIDSF Paper: https://arxiv.org/pdf/2104.02021.pdf | S | Airline travel | Size: 5,871; 25 intents; 120 slots | Intent detection, Slot filling |
xSID | Translation of part of the Facebook and SNIPS datasets | Download: https://bitbucket.org/robvanderg/xsid Paper: https://aclanthology.org/2021.naacl-main.197.pdf | S | Languages: Arabic, Danish, South Tyrolean, German, English, Indonesian, Italian, Japanese, Kazakh, Dutch, Serbian, Turkish, Chinese. Intents: AddToPlaylist, BookRestaurant, PlayMusic, RateBook, SearchCreativeWork, SearchScreeningEvent, alarm/cancel_alarm, alarm/modify_alarm, alarm/set_alarm, alarm/show_alarms, alarm/snooze_alarm, reminder/cancel_reminder, reminder/set_reminder, reminder/show_reminders, weather/find | 500 test and 300 dev utterances for each language; 43,605 English train utterances (automatic translation into all languages also provided) | Intent detection, Slot filling |
ProSLU | Profile-based Spoken Language Understanding (ProSLU) requires a model to rely not only on the plain text but also on supporting profile information to predict the correct intents and slots | Download: https://github.com/LooperXX/ProSLU/tree/master/data/ProSLU Paper: https://ojs.aaai.org/index.php/AAAI/article/view/21411 | S | A large-scale human-annotated Chinese dataset with over 5K utterances and their corresponding supporting profile information: Knowledge Graph (KG), User Profile (UP), Context Awareness (CA). Experiments on various vanilla SLU baselines and a general Profile SLU model with a multi-level knowledge adapter are provided | Train: 4,196 Dev: 522 Test: 531; 14 intents, 99 slots | Intent detection, Slot filling |
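Several of the GitHub downloads above (e.g. the SlotGated-SLU and JointBERT repositories) ship ATIS/SNIPS splits as three parallel files: seq.in (utterances), seq.out (BIO tags), and label (intents). A minimal loader sketch under that assumption — check each download for its actual layout, since formats vary:

```python
# Sketch of a loader for the three-file ATIS/SNIPS layout (seq.in / seq.out /
# label) used by several repositories linked above. Formats vary per download,
# so treat the file names here as an assumption, not a standard.
from pathlib import Path

def parse_split(utterances, tag_seqs, intents):
    """Pair each utterance with its BIO tag sequence and intent label."""
    data = []
    for utt, tags, intent in zip(utterances, tag_seqs, intents):
        tokens, bio = utt.split(), tags.split()
        assert len(tokens) == len(bio), "expected one BIO tag per token"
        data.append((tokens, bio, intent))
    return data

def load_split(split_dir):
    """Read one split directory containing seq.in, seq.out, and label."""
    d = Path(split_dir)
    read = lambda name: (d / name).read_text(encoding="utf-8").splitlines()
    return parse_split(read("seq.in"), read("seq.out"), read("label"))
```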
ACL 2020
[pdf] [code] ICASSP 2019
[pdf] IEEE Signal Processing Letters 2019
[pdf] EMNLP 2019
[pdf] COLING 2018
[pdf] ACL 2018
[pdf] COLING 2018
[pdf] [code] ICASSP 2017
[pdf] AAAI 2017
[pdf] IEEE 2016
[pdf] INTERSPEECH 2016
[pdf] IEEE Workshop on Spoken Language Technology 2016
[pdf] ICASSP 2016
[pdf] EMNLP 2016
[pdf] INTERSPEECH 2016
[pdf] IEEE/ACM TASLP 2015
[pdf] IEEE/ACM Transactions on Audio, Speech, and Language Processing 2015
[pdf] - 2015
[pdf] IEEE 2014
[pdf] IEEE 2014
[pdf] INTERSPEECH 2013
[pdf] ISCA 2013
[pdf] INTERSPEECH 2013
[pdf] EMNLP 2018
[pdf] IEEE Access 2020
[pdf] arXiv 2018
[pdf] ACL 2019
[pdf] IJCNN 2019
[pdf] Multimedia Tools and Applications 2018
[pdf] InterSpeech 2018
[pdf] INTERSPEECH 2015
[pdf] ACL 2018
[pdf] ISCA 2013
[pdf] SIGDIAL 2019
[pdf] SIGDIAL 2019
[pdf] SIGDIAL 2018
[pdf] IJCAI 2016
[pdf] SIGDIAL 2016
[pdf] [code] INTERSPEECH 2016
[pdf] INTERSPEECH 2016
[pdf] INTERSPEECH 2016
[pdf] IEEE SLT 2014
[pdf] IEEE Workshop on Automatic Speech Recognition and Understanding 2013
[pdf] ICME 2021
[pdf] ICASSP 2021
[pdf] [code] EMNLP 2020
[pdf] [code] AAAI 2020
[pdf] ACL 2019
[pdf] [code] EMNLP 2019
[pdf] [code] IEEE 2019
[pdf] arXiv 2019
[pdf] [code] ACL 2019
[pdf] [code] NAACL 2019
[pdf] EMNLP 2019
[pdf] [code] NAACL 2018
[pdf] NAACL 2018
[pdf] [code] EMNLP 2018
[pdf] ICASSP 2022
[pdf] ICASSP 2022
[pdf] ICASSP 2022
[pdf] ICASSP 2022
[pdf] IEEE 2021
[pdf] IEEE 2019
[pdf] IJCNN 2019
[pdf] IEEE 2019
[pdf] NAACL 2018
[pdf] [code] InterSpeech 2018
[pdf] IEEE 2017
[pdf] IJCNLP 2017
[pdf] [code] SIGDIAL 2017
[pdf] IEEE 2017
[pdf] [code] IEEE 2017
[pdf] [code] IUI 2016
[pdf] INTERSPEECH 2016
[pdf] ICMI 2015
[pdf] 2015
[pdf] IEEE 2014
[pdf] IEEE 2013
[pdf] AAAI 2021
[pdf] AAAI 2021
[pdf] ICASSP 2022
[pdf] ICASSP 2022
[pdf] ICASSP 2022
A Result Based Portable Framework for Spoken Language Understanding [pdf] ICME 2021
[pdf] EMNLP 2020
[pdf] [code] NAACL 2019
[pdf] Multimedia Tools and Applications 2017
[pdf] Interspeech 2013
[pdf] EMNLP 2021
[pdf] [code] ACL 2021
[pdf] [code] arXiv 2020
[pdf] [code] EMNLP 2019
[pdf] [code] ACL 2022
[pdf] ICASSP 2022
[pdf] ACL 2020
[pdf] [code] AAAI 2020
[pdf] AAAI 2019
[pdf] AAAI 2019
[pdf] ACL 2019
[pdf] SIGDIAL 2018
[pdf] NAACL 2018
[pdf] NAACL-HLT 2018
[pdf] ACL 2017
[pdf] INTERSPEECH 2017
(collected by the author) [pdf] INTERSPEECH 2017
(inhouse data from Amazon) [pdf] COLING 2016
[pdf] INTERSPEECH 2016
[pdf] EMNLP 2016
[pdf] COLING 2016
[pdf] INTERSPEECH 2016
[pdf] EMNLP 2015
[pdf] IEEE 2015
[pdf] INTERSPEECH 2011
[pdf] EMNLP 2021
[pdf] EMNLP 2021
[pdf] IJCAI 2020
[pdf] [code] EMNLP 2020
[pdf] [code] EMNLP 2020
[pdf] IEEE Access 2020
[pdf] AAAI 2020
[pdf] [code] arXiv 2020
[pdf] [code] EMNLP-IJCNLP 2019
[pdf] EMNLP-IJCNLP 2019
[pdf] NAACL 2019
[pdf] CEUR Workshop 2019
[pdf] arXiv 2019
[pdf] IEEE/ICASSP 2018
[pdf] ACL 2017
[pdf] [code] IEEE 2013
[pdf] ICASSP 2013
[pdf] IEEE 2012
[pdf] ACL 2022
[pdf] [code] EMNLP 2021
[pdf] AAAI 2021
[pdf] [code] ACL 2020
[pdf] [code] arXiv 2020
[pdf] NAACL-HLT 2019
[pdf] AAAI 2019
[pdf] ACL 2018
[pdf] SIGDIAL 2018
[pdf] ACL 2020
[pdf] [code] AAAI 2019
[pdf] SIGDIAL 2018
[pdf] EMNLP 2018
[pdf] INTERSPEECH 2017
[pdf] INTERSPEECH 2017
[pdf] EMNLP 2015
[pdf] INTERSPEECH 2015
[pdf] AAAI 2021
[pdf] [code] AAAI 2021
[pdf] [code] AAAI 2020
[pdf] [code] IJCAI 2020
[pdf]

Model | Intent Acc | Slot F1 | Paper / Source | Code link | Conference |
---|---|---|---|---|---|
Co-Interactive (Qin et al., 2021) | 97.7 | 95.9 | A Co-Interactive Transformer for Joint Slot Filling and Intent Detection [[pdf]](https://arxiv.org/pdf/2010.03880.pdf) | https://github.com/kangbrilliant/DCA-Net | ICASSP |
Graph LSTM (Zhang et al., 2021) | 97.20 | 95.91 | Graph LSTM with Context-Gated Mechanism for Spoken Language Understanding [[pdf]](https://ojs.aaai.org/index.php/AAAI/article/view/6499/6355) | - | AAAI |
Stack Propagation (Qin et al., 2019) | 96.9 | 95.9 | A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding [[pdf]](https://arxiv.org/pdf/1909.02188.pdf) | https://github.com/LeePleased/StackPropagation-SLU | EMNLP |
SF-ID+CRF (SF first) (E et al., 2019) | 97.76 | 95.75 | A Novel Bi-directional Interrelated Model for Joint Intent Detection and Slot Filling [[pdf]](https://www.aclweb.org/anthology/P19-1544.pdf) | - | ACL |
SF-ID+CRF (ID first) (E et al., 2019) | 97.09 | 95.8 | A Novel Bi-directional Interrelated Model for Joint Intent Detection and Slot Filling [[pdf]](https://www.aclweb.org/anthology/P19-1544.pdf) | https://github.com/ZephyrChenzf/SF-ID-Network-For-NLU | ACL |
Capsule-NLU (Zhang et al., 2019) | 95 | 95.2 | Joint Slot Filling and Intent Detection via Capsule Neural Networks [[pdf]](https://arxiv.org/pdf/1812.09471.pdf) | https://github.com/czhang99/Capsule-NLU | ACL |
Utterance Generation With Variational Auto-Encoder (Guo et al., 2019) | - | 95.04 | Utterance Generation With Variational Auto-Encoder for Slot Filling in Spoken Language Understanding [[pdf]](https://ieeexplore.ieee.org/document/8625384) | - | IEEE Signal Processing Letters |
JULVA (full) (Yoo et al., 2019) | 97.24 | 95.51 | Data Augmentation for Spoken Language Understanding via Joint Variational Generation [[pdf]](https://arxiv.org/pdf/1809.02305.pdf) | - | AAAI |
CM-Net (Liu et al., 2019) | 99.1 | 96.20 | CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding [[pdf]](https://www.aclweb.org/anthology/D19-1097.pdf) | https://github.com/Adaxry/CM-Net | EMNLP |
Data noising method (Kim et al., 2019) | 98.43 | 96.20 | Data augmentation by data noising for open vocabulary slots in spoken language understanding [[pdf]](https://www.aclweb.org/anthology/N19-3014.pdf) | - | NAACL-HLT |
ACD (Zhu et al., 2018) | - | 96.08 | Concept Transfer Learning for Adaptive Language Understanding [[pdf]](https://www.aclweb.org/anthology/W18-5047.pdf) | - | SIGDIAL |
A Self-Attentive Model with Gate Mechanism (Li et al., 2018) | 98.77 | 96.52 | A Self-Attentive Model with Gate Mechanism for Spoken Language Understanding [[pdf]](https://www.aclweb.org/anthology/D18-1417.pdf) | - | EMNLP |
Slot-Gated (Goo et al., 2018) | 94.1 | 95.2 | Slot-Gated Modeling for Joint Slot Filling and Intent Prediction [[pdf]](https://www.aclweb.org/anthology/N18-2118.pdf) | https://github.com/MiuLab/SlotGated-SLU | NAACL |
DRL-based Augmented Tagging System (Wang et al., 2018) | - | 97.86 | A New Concept of Deep Reinforcement Learning based Augmented General Sequence Tagging System [[pdf]](https://www.aclweb.org/anthology/C18-1143.pdf) | - | COLING |
Bi-model (Wang et al., 2018) | 98.76 | 96.65 | A Bi-model based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling [[pdf]](https://arxiv.org/pdf/1812.10235.pdf) | - | NAACL |
Bi-model+decoder (Wang et al., 2018) | 98.99 | 96.89 | A Bi-model based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling [[pdf]](https://arxiv.org/pdf/1812.10235.pdf) | - | NAACL |
Seq2Seq DA for LU (Hou et al., 2018) | - | 94.82 | Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding [[pdf]](https://arxiv.org/pdf/1807.01554.pdf) | https://github.com/AtmaHou/Seq2SeqDataAugmentationForLU | COLING |
BLSTM-LSTM (Zhu et al., 2017) | - | 95.79 | Encoder-Decoder with Focus-Mechanism for Sequence Labelling Based Spoken Language Understanding [[pdf]](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7953243) | - | ICASSP |
Neural sequence chunking model (Zhai et al., 2017) | - | 95.86 | Neural Models for Sequence Chunking [[pdf]](https://arxiv.org/pdf/1701.04027.pdf) | - | AAAI |
Joint Model of ID and SF (Zhang et al., 2016) | 98.32 | 96.89 | A Joint Model of Intent Determination and Slot Filling for Spoken Language Understanding [[pdf]](https://www.ijcai.org/Proceedings/16/Papers/425.pdf) | - | IJCAI |
Attention Encoder-Decoder NN (with aligned inputs) (Liu et al., 2016) | 98.43 | 95.87 | Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling [[pdf]](https://arxiv.org/pdf/1609.01454.pdf) | - | InterSpeech |
Attention BiRNN (Liu et al., 2016) | 98.21 | 95.98 | Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling [[pdf]](https://arxiv.org/pdf/1609.01454.pdf) | - | InterSpeech |
Joint SLU-LM model (Liu et al., 2016) | 98.43 | 94.64 | Joint Online Spoken Language Understanding and Language Modeling with Recurrent Neural Networks [[pdf]](https://arxiv.org/pdf/1609.01462.pdf) | http://speech.sv.cmu.edu/software.html | SIGDIAL |
RNN-LSTM (Hakkani-Tur et al., 2016) | 94.3 | 92.6 | Multi-Domain Joint Semantic Frame Parsing using Bi-directional RNN-LSTM [[pdf]](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/IS16_MultiJoint.pdf) | - | InterSpeech |
R-biRNN (Vu et al., 2016) | - | 95.47 | Bi-directional recurrent neural network with ranking loss for spoken language understanding [[pdf]](https://ieeexplore.ieee.org/abstract/document/7472841/) | - | IEEE |
Encoder-labeler LSTM (Kurata et al., 2016) | - | 95.4 | Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling [[pdf]](https://www.aclweb.org/anthology/D16-1223.pdf) | - | EMNLP |
Encoder-labeler Deep LSTM (Kurata et al., 2016) | - | 95.66 | Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling [[pdf]](https://www.aclweb.org/anthology/D16-1223.pdf) | - | EMNLP |
5xR-biRNN (Vu et al., 2016) | - | 95.56 | Bi-directional recurrent neural network with ranking loss for spoken language understanding [[pdf]](https://ieeexplore.ieee.org/abstract/document/7472841/) | - | IEEE |
Data Generation for SF (Kurata et al., 2016) | - | 95.32 | Labeled Data Generation with Encoder-decoder LSTM for Semantic Slot Filling [[pdf]](https://www.isca-speech.org/archive/Interspeech_2016/pdfs/0727.PDF) | - | InterSpeech |
RNN-EM (Peng et al., 2015) | - | 95.25 | Recurrent Neural Networks with External Memory for Language Understanding [[pdf]](https://arxiv.org/pdf/1506.00195.pdf) | - | InterSpeech |
RNN trained with sampled label (Liu et al., 2015) | - | 94.89 | Recurrent Neural Network Structured Output Prediction for Spoken Language Understanding [[pdf]](http://speech.sv.cmu.edu/publications/liu-nipsslu-2015.pdf) | - | - |
RNN (Ravuri et al., 2015) | 97.55 | - | Recurrent neural network and LSTM models for lexical utterance classification [[pdf]](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/RNNLM_addressee.pdf) | - | InterSpeech |
LSTM (Ravuri et al., 2015) | 98.06 | - | Recurrent neural network and LSTM models for lexical utterance classification [[pdf]](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/RNNLM_addressee.pdf) | - | InterSpeech |
Hybrid RNN (Mesnil et al., 2015) | - | 95.06 | Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding [[pdf]](https://ieeexplore.ieee.org/document/6998838) | - | IEEE/ACM-TASLP |
RecNN (Guo et al., 2014) | 95.4 | 93.22 | Joint semantic utterance classification and slot filling with recursive neural networks [[pdf]](https://www.microsoft.com/en-us/research/wp-content/uploads/2014/12/RecNNSLU.pdf) | - | IEEE-SLT |
LSTM (Yao et al., 2014) | - | 94.85 | Spoken Language Understanding Using Long Short-Term Memory Neural Networks [[pdf]](https://groups.csail.mit.edu/sls/publications/2014/Zhang_SLT_2014.pdf) | - | IEEE |
Deep LSTM (Yao et al., 2014) | - | 95.08 | Spoken Language Understanding Using Long Short-Term Memory Neural Networks [[pdf]](https://groups.csail.mit.edu/sls/publications/2014/Zhang_SLT_2014.pdf) | - | IEEE |
R-CRF (Yao et al., 2014) | - | 96.65 | Recurrent conditional random field for language understanding [[pdf]](https://ieeexplore.ieee.org/document/6854368) | - | IEEE |
RecNN+Viterbi (Guo et al., 2014) | 95.4 | 93.96 | Joint semantic utterance classification and slot filling with recursive neural networks [[pdf]](https://www.microsoft.com/en-us/research/wp-content/uploads/2014/12/RecNNSLU.pdf) | - | IEEE-SLT |
CNN-CRF (Xu et al., 2013) | 94.09 | 95.42 | Convolutional neural network based triangular CRF for joint intent detection and slot filling [[pdf]](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.642.7548&rep=rep1&type=pdf) | - | IEEE |
RNN (Yao et al., 2013) | - | 94.11 | Recurrent Neural Networks for Language Understanding [[pdf]](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/kaisheny-338_file_paper.pdf) | - | InterSpeech |
Bi-dir. Jordan-RNN (2013) | - | 93.98 | Investigation of Recurrent-Neural-Network Architectures and Learning Methods for Spoken Language Understanding [[pdf]](https://www.isca-speech.org/archive/archive_papers/interspeech_2013/i13_3771.pdf) | - | ISCA |
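The Intent Acc and Slot F1 columns in these leaderboards are conventionally computed as utterance-level intent accuracy and span-level (CoNLL-style) slot F1 over exact-match BIO chunks. A minimal sketch of both metrics, for illustration only — papers typically use the conlleval or seqeval tooling rather than this code:

```python
# Minimal sketch of the two leaderboard metrics: utterance-level intent
# accuracy and span-level slot F1 (exact match on BIO chunks).

def bio_spans(tags):
    """Extract (label, start, end) chunks from a BIO tag sequence."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing chunk
        if start is not None and (tag == "O" or tag.startswith("B-") or tag[2:] != label):
            spans.add((label, start, i))
            start = None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans

def slot_f1(gold_seqs, pred_seqs):
    """Micro-averaged F1 over exactly matching slot chunks."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        gs, ps = bio_spans(gold), bio_spans(pred)
        tp += len(gs & ps)
        fp += len(ps - gs)
        fn += len(gs - ps)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

def intent_acc(gold, pred):
    """Fraction of utterances whose intent is predicted exactly."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)
```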
Model | Intent Acc | Slot F1 | Paper / Source | Code link | Conference |
---|---|---|---|---|---|
Co-Interactive (Qin et al., 2021) | 98.0 | 96.1 | A Co-Interactive Transformer for Joint Slot Filling and Intent Detection [[pdf]](https://arxiv.org/pdf/2010.03880.pdf) | https://github.com/kangbrilliant/DCA-Net | ICASSP |
Stack Propagation+BERT (Qin et al., 2019) | 97.5 | 96.1 | A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding [[pdf]](https://arxiv.org/pdf/1909.02188.pdf) | https://github.com/LeePleased/StackPropagation-SLU | EMNLP |
BERT-Joint (Castellucci et al., 2019) | 97.8 | 95.7 | Multi-lingual Intent Detection and Slot Filling in a Joint BERT-based Model [[pdf]](https://arxiv.org/pdf/1907.02884.pdf) | - | arXiv |
BERT-SLU (Zhang et al., 2019) | 99.76 | 98.75 | A Joint Learning Framework With BERT for Spoken Language Understanding [[pdf]](https://ieeexplore.ieee.org/document/8907842) | - | IEEE |
Joint BERT (Chen et al., 2019) | 97.5 | 96.1 | BERT for Joint Intent Classification and Slot Filling [[pdf]](https://arxiv.org/pdf/1902.10909.pdf) | https://github.com/monologg/JointBERT | arXiv |
Joint BERT+CRF (Chen et al., 2019) | 97.9 | 96 | BERT for Joint Intent Classification and Slot Filling [[pdf]](https://arxiv.org/pdf/1902.10909.pdf) | https://github.com/monologg/JointBERT | arXiv |
ELMo-Light (ELMoL) (Siddhant et al., 2019) | 97.3 | 95.42 | Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents [[pdf]](https://arxiv.org/pdf/1811.05370.pdf) | - | AAAI |
Model | Intent Acc | Slot F1 | Paper / Source | Code link | Conference |
---|---|---|---|---|---|
Co-Interactive (Qin et al., 2021) | 98.8 | 95.9 | A Co-Interactive Transformer for Joint Slot Filling and Intent Detection [[pdf]](https://arxiv.org/pdf/2010.03880.pdf) | https://github.com/kangbrilliant/DCA-Net | ICASSP |
Graph LSTM (Zhang et al., 2021) | 98.29 | 95.30 | Graph LSTM with Context-Gated Mechanism for Spoken Language Understanding [[pdf]](https://ojs.aaai.org/index.php/AAAI/article/view/6499/6355) | - | AAAI |
SF-ID Network (E et al., 2019) | 97.43 | 91.43 | A Novel Bi-directional Interrelated Model for Joint Intent Detection and Slot Filling [[pdf]](https://www.aclweb.org/anthology/P19-1544.pdf) | https://github.com/ZephyrChenzf/SF-ID-Network-For-NLU | ACL |
Capsule-NLU (Zhang et al., 2019) | 97.3 | 91.8 | Joint Slot Filling and Intent Detection via Capsule Neural Networks [[pdf]](https://arxiv.org/pdf/1812.09471.pdf) | https://github.com/czhang99/Capsule-NLU | ACL |
Stack Propagation (Qin et al., 2019) | 98 | 94.2 | A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding [[pdf]](https://arxiv.org/pdf/1909.02188.pdf) | https://github.com/LeePleased/StackPropagation-SLU | EMNLP |
CM-Net (Liu et al., 2019) | 99.29 | 97.15 | CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding [[pdf]](https://www.aclweb.org/anthology/D19-1097.pdf) | https://github.com/Adaxry/CM-Net | EMNLP |
Joint Multiple (Gangadharaiah et al., 2019) | 97.23 | 88.03 | Joint Multiple Intent Detection and Slot Labeling for Goal-Oriented Dialog [[pdf]](https://www.aclweb.org/anthology/N19-1055.pdf) | - | NAACL |
Utterance Generation With Variational Auto-Encoder (Guo et al., 2019) | - | 93.18 | Utterance Generation With Variational Auto-Encoder for Slot Filling in Spoken Language Understanding [[pdf]](https://ieeexplore.ieee.org/document/8625384) | - | IEEE Signal Processing Letters |
Slot-Gated Intent Atten. (Goo et al., 2018) | 96.8 | 88.3 | Slot-Gated Modeling for Joint Slot Filling and Intent Prediction [[pdf]](https://www.aclweb.org/anthology/N18-2118.pdf) | https://github.com/MiuLab/SlotGated-SLU | NAACL |
Slot-Gated Full Atten. (Goo et al., 2018) | 97 | 88.8 | Slot-Gated Modeling for Joint Slot Filling and Intent Prediction [[pdf]](https://www.aclweb.org/anthology/N18-2118.pdf) | https://github.com/MiuLab/SlotGated-SLU | NAACL |
Joint Variational Generation + Slot-Gated Intent Atten. (Yoo et al., 2018) | 96.7 | 88.3 | Data Augmentation for Spoken Language Understanding via Joint Variational Generation [[pdf]](https://arxiv.org/pdf/1809.02305.pdf) | - | AAAI |
Joint Variational Generation + Slot-Gated Full Atten. (Yoo et al., 2018) | 97.3 | 89.3 | Data Augmentation for Spoken Language Understanding via Joint Variational Generation [[pdf]](https://arxiv.org/pdf/1809.02305.pdf) | - | AAAI |
Model | Intent Acc | Slot F1 | Paper / Source | Code link | Conference |
---|---|---|---|---|---|
Co-Interactive (Qin et al., 2021) | 98.8 | 97.1 | A Co-Interactive Transformer for Joint Slot Filling and Intent Detection [[pdf]](https://arxiv.org/pdf/2010.03880.pdf) | https://github.com/kangbrilliant/DCA-Net | ICASSP |
Stack Propagation + BERT (Qin et al., 2019) | 99 | 97 | A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding [[pdf]](https://arxiv.org/pdf/1909.02188.pdf) | https://github.com/LeePleased/StackPropagation-SLU | EMNLP |
BERT-Joint (Castellucci et al., 2019) | 99 | 96.2 | Multi-lingual Intent Detection and Slot Filling in a Joint BERT-based Model [[pdf]](https://arxiv.org/pdf/1907.02884.pdf) | - | arXiv |
BERT-SLU (Zhang et al., 2019) | 98.96 | 98.78 | A Joint Learning Framework With BERT for Spoken Language Understanding [[pdf]](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8907842) | - | IEEE |
Joint BERT (Chen et al., 2019) | 98.6 | 97 | BERT for Joint Intent Classification and Slot Filling [[pdf]](https://arxiv.org/pdf/1902.10909.pdf) | https://github.com/monologg/JointBERT | arXiv |
Joint BERT + CRF (Chen et al., 2019) | 98.4 | 96.7 | BERT for Joint Intent Classification and Slot Filling [[pdf]](https://arxiv.org/pdf/1902.10909.pdf) | https://github.com/monologg/JointBERT | arXiv |
ELMo-Light (Siddhant et al., 2019) | 98.38 | 93.29 | Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents [[pdf]](https://arxiv.org/pdf/1811.05370.pdf) | - | AAAI |
ELMo (Peters et al., 2018; Siddhant et al., 2019) | 99.29 | 93.9 | Deep contextualized word representations [[pdf]](https://arxiv.org/pdf/1802.05365.pdf) / Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents [[pdf]](https://arxiv.org/pdf/1811.05370.pdf) | - | NAACL/AAAI |
99.29 | 93.9 | Deep contextualized word representations [[pdf]](https://arxiv.org/pdf/1802.05365.pdf)Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents [[pdf]](https://arxiv.org/pdf/1811.05370.pdf) | - | NAACL/AAAI |