Why did I choose this paper? Because it uses a GAN for query expansion, an IR task that is strongly related to my research.
Main problem:
The paper aims to make existing query expansion methods faster and more accurate. Query Expansion (QE) is defined as adding new terms to a user's input query to make it more precise and so better fulfill the user's needs.
Existing work:
Existing work on QE can be divided into two categories:
Unsupervised Query Expansion (UQE)
Example: Many classical algorithms
Probability models
Relevance-based language models
Disadvantage: the expanded terms can be noisy or even harmful
Supervised Query Expansion (SQE)
Example: state-of-the-art in the QE literature
random walk
term dependency-based approach
boosting approach
Learning-based approaches
Disadvantage: higher response time (because of the feature extraction phase)
Inputs:
A set of original queries {q1, ..., qn}
A set of expanded terms {t1, ..., tM}
Outputs:
The top k terms most relevant to query q, selected from the candidate terms.
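To make the input/output contract concrete, here is a minimal sketch of the final selection step for a single query. The function name `top_k_terms` and the example scores are my own assumptions for illustration; the paper's actual scores come from its trained model.

```python
def top_k_terms(scores, k):
    """Return the k candidate terms with the highest score, best first.

    scores: dict mapping candidate term -> relevance score for one query
            (hypothetical values; any scoring model could produce them).
    """
    # Sort by score descending; break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    example = {"retrieval": 0.9, "banana": 0.1, "ranking": 0.5}
    print(top_k_terms(example, 2))
```

With the experimental setup in these notes, this would be called with 100 candidate terms per query and k = 20.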
Method:
Idea:
Response time: word embeddings can be used to avoid the feature extraction phase, which is the time-consuming part of SQE
Performance: deep learning can be used to encode the correlation between an arbitrary pair of query and expanded term
Steps:
UQE is used to get the candidate expanded terms
A word embedding technique is used to transform the terms into vectors
A GAN is trained, in which the generative and discriminative models iteratively optimize each other
Generator re-ranks the expanded terms
Discriminator calculates the score for the new ranking
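The generator/discriminator loop in the steps above can be sketched roughly as follows. This is not the paper's model: I am assuming an IRGAN-style setup with linear scorers over the element-wise product of query and term embeddings, one known relevant term per query, and exact expected gradients over the small candidate set instead of sampling. All names (`train`, `rerank`, `w_gen`, `w_dis`) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def features(q, t):
    # Assumed interaction feature: element-wise product of the embeddings.
    return [qi * ti for qi, ti in zip(q, t)]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(q, terms, relevant, steps=50, lr=0.5):
    """Alternate discriminator and generator updates for one query."""
    names = list(terms)
    feats = [features(q, terms[n]) for n in names]
    pos = feats[names.index(relevant)]
    dim = len(q)
    w_gen, w_dis = [0.0] * dim, [0.0] * dim
    for _ in range(steps):
        # Generator distribution over candidate terms.
        probs = softmax([dot(w_gen, f) for f in feats])

        # Discriminator step: ascend log D(pos) + E_{t~G}[log(1 - D(t))].
        d_pos = sigmoid(dot(w_dis, pos))
        d_all = [sigmoid(dot(w_dis, f)) for f in feats]
        grad_d = [
            (1.0 - d_pos) * pos[i]
            - sum(p * d * f[i] for p, d, f in zip(probs, d_all, feats))
            for i in range(dim)
        ]
        w_dis = [w + lr * g for w, g in zip(w_dis, grad_d)]

        # Generator step: ascend E_{t~G}[D(t)] with the policy gradient
        # sum_t p(t) * D(t) * (f(t) - E_p[f]).
        reward = [sigmoid(dot(w_dis, f)) for f in feats]
        mean_f = [sum(p * f[i] for p, f in zip(probs, feats)) for i in range(dim)]
        grad_g = [
            sum(p * r * (f[i] - mean_f[i]) for p, r, f in zip(probs, reward, feats))
            for i in range(dim)
        ]
        w_gen = [w + lr * g for w, g in zip(w_gen, grad_g)]
    return w_gen, w_dis


def rerank(q, terms, w_gen):
    """Order candidate terms by generator score, best first."""
    return sorted(terms, key=lambda n: -dot(w_gen, features(q, terms[n])))
```

On a toy query such as q = [1.0, 0.0] with one aligned term, one orthogonal term, and one opposed term, the generator learns to rank the aligned term first. A real setup would use the 100-dimensional embeddings and many queries.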
Experimental Setup:
Dataset:
TREC Robust 2004
528,000 high-quality documents
250 queries for experiments
Preprocessing:
Stemming: (Porter)
Stopwords removal: (standard InQuery)
Word embedding: Word2Vec’s Continuous Bag-of-Words (CBOW) (d=100)
Initial expanded terms: 100 terms generated by UQE
Final selected terms: 20 terms (of 100 initial terms)
40% train, 10% validate, 50% test
Basic retrieval model: TF-IDF
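The 40/10/50 split listed above can be sketched as below. I am assuming the split is made over the 250 queries and that it is a random shuffle; the notes do not say how the paper assigns queries to each part, and `split_queries` and its `seed` parameter are hypothetical.

```python
import random


def split_queries(query_ids, train=0.4, valid=0.1, seed=0):
    """Shuffle query ids and split them into train/validation/test parts."""
    ids = list(query_ids)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train)
    n_valid = round(len(ids) * valid)
    train_ids = ids[:n_train]
    valid_ids = ids[n_train:n_train + n_valid]
    test_ids = ids[n_train + n_valid:]  # whatever remains (50% here)
    return train_ids, valid_ids, test_ids


if __name__ == "__main__":
    tr, va, te = split_queries(range(250))
    print(len(tr), len(va), len(te))
```

For 250 queries this gives 100 training, 25 validation, and 125 test queries.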
Metrics:
MAP (Mean Average Precision)
Precision@k (k = [5, 10])
NDCG
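For reference, the three metrics can be computed as in this minimal sketch. Each ranking is a list of relevance labels in ranked order (1 = relevant, 0 = not). The function names are my own, average precision here divides by the number of relevant items found in the list unless `num_relevant` is given, and the NDCG cutoff is left as a parameter because the notes do not state one.

```python
import math


def precision_at_k(rels, k):
    """Fraction of the top k results that are relevant."""
    return sum(rels[:k]) / k


def average_precision(rels, num_relevant=None):
    """Mean of precision values at the rank of each relevant result."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    denom = num_relevant if num_relevant is not None else hits
    return total / denom if denom else 0.0


def mean_average_precision(rankings):
    """MAP: average precision averaged over all queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def ndcg_at_k(gains, k):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    def dcg(g):
        return sum(x / math.log2(i + 1) for i, x in enumerate(g[:k], start=1))
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0
```

For example, the ranking [1, 0, 1, 0, 0] has Precision@5 = 2/5 and average precision (1/1 + 2/3) / 2 = 5/6.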
Baselines:
State-of-the-art SQE scheme, SQE-TFS (response time)
traditional UQE, KL divergence E2
UQE (retrieval effect)
RankSVM (retrieval effect)
Deep NN (retrieval effect)
Sequence to sequence learning (retrieval effect)
BiLSTM
Query-to-Term Attention
Results:
The main contribution of this paper is a fast and high-performance model for the QE problem. The results show that the largest gain is in response time (a 37% improvement), obtained by removing the feature extraction phase and adding a word-embedding module instead. In addition, SQL-GAN improves retrieval results compared with the latest deep learning-based QE solutions.
Code:
The code of this paper is unavailable.
Presentation:
There is no available presentation for this paper.