Main problem:
This paper surveys recent progress in query expansion (QE) techniques, covering research on automatic, manual, and interactive QE.
Query Expansion Approaches:
1- Global analysis: QE techniques implicitly select expansion terms from hand-built knowledge resources or from large corpora for reformulating the initial query
Linguistic approaches: analyze expansion features such as lexical, morphological, semantic, and syntactic term relationships to reformulate or expand the initial query terms
stemming analysis: reduces words to their root form
semantic analysis: finding synonyms of words
syntactic analysis: uses the enhanced relational features of the query terms for expanding the initial query. (such as term co-occurrence)
Corpus-based approaches: examine the contents of the whole text corpus to recognize the expansion features to be utilized for QE.
term clustering: groups document terms into clusters based on their co-occurrences
concept-based terms: represent each word as an embedding vector, analyze the corpus using these word embeddings, then select the expansion terms
Search log-based approaches: analyze the search logs
user query log:
query-document relationships: features are extracted from the relational behavior of queries.
Web-based approaches: use Wikipedia and anchor texts from websites to expand the user’s original query
2- Local analysis: QE techniques select expansion terms from the collection of documents retrieved in response to the user’s initial (unmodified) query
Relevance feedback: the user’s feedback about whether or not the retrieved documents are relevant to the user’s query is collected
Pseudo-relevance feedback: the feedback collection process is automated by directly using the top-ranked documents (retrieved in response to the initial query) for QE.
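To make the local-analysis idea concrete, here is a minimal sketch of pseudo-relevance feedback. It is not the method of any particular paper: the term-overlap scoring, the toy `DOCS` collection, and the names `prf_expand`, `top_k`, and `n_terms` are all assumptions made for illustration. A real system would use a proper ranking function and stopword filtering.

```python
from collections import Counter

# Toy document collection (hypothetical data, for illustration only).
DOCS = [
    "jaguar car engine engine speed",
    "jaguar car engine dealer speed",
    "jaguar cat jungle habitat",
    "python snake jungle",
]


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def score(query_terms, doc_terms):
    """Toy relevance score: total occurrences of query terms in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def prf_expand(query, docs, top_k=2, n_terms=2):
    """Expand `query` with frequent terms from the top_k retrieved documents."""
    query_terms = tokenize(query)
    tokenized = [tokenize(d) for d in docs]
    # Initial retrieval using the unmodified query.
    ranked = sorted(tokenized, key=lambda d: score(query_terms, d), reverse=True)
    # Assume the top-ranked documents are relevant (no user feedback needed).
    feedback_docs = ranked[:top_k]
    # Candidate expansion terms: everything in the feedback docs except query terms.
    candidates = Counter(
        t for d in feedback_docs for t in d if t not in query_terms
    )
    expansion = [t for t, _ in candidates.most_common(n_terms)]
    return query_terms + expansion


if __name__ == "__main__":
    print(prf_expand("jaguar car", DOCS))
```

With explicit relevance feedback, `feedback_docs` would instead be the documents the user marked as relevant; the rest of the sketch would stay the same.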
Previous Works and their Gaps:
1- reviewed ontology based QE techniques, which are domain specific.
2- reviewed the major QE techniques, data sources, and features in an IR system
gap: covers only automatic query expansion (AQE) techniques and does not include recent research on personalized social documents, term weighting and ranking methods, and categorization of several data sources
annotation data
3- solution: proposed a query suggestion algorithm that presents labeled query suggestion clusters so that the user can make comparisons across multiple entities (e.g. company names).
Gap: a temporal point of view is not considered in these structured query suggestion methods.
Gap: did not discuss temporal QE and its challenges.
Results:
For global analysis the corpus-based approaches are more effective than linguistic-based approaches. The reason is that linguistic-based approaches require a concrete linguistic relation (based on sense, meaning, concept etc.) between a query term and a relevant term for the latter to be discovered, while corpus-based approaches can discover the same relevant term simply based on co-occurrences with the query term.
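The contrast above can be illustrated with a small sketch. Everything here is hypothetical: `THESAURUS` stands in for a hand-built knowledge resource, `CORPUS` is a toy collection, and the function names are made up for this example. The point is only that a term with no recorded linguistic relation to the query term can still be found through co-occurrence.

```python
from collections import Counter

# Hypothetical hand-built resource: only explicit linguistic relations are listed.
THESAURUS = {"car": ["automobile"]}

# Toy corpus (hypothetical data).
CORPUS = [
    "car engine repair",
    "car engine oil",
    "automobile engine repair",
    "bike repair shop",
]


def linguistic_expand(term):
    """Return only terms that have an explicit relation in the thesaurus."""
    return THESAURUS.get(term, [])


def cooccurrence_expand(term, corpus, n_terms=1):
    """Return the terms that most often appear in the same document as `term`."""
    counts = Counter()
    for doc in corpus:
        words = set(doc.split())
        if term in words:
            counts.update(words - {term})
    return [t for t, _ in counts.most_common(n_terms)]


if __name__ == "__main__":
    print(linguistic_expand("car"))            # terms related via the thesaurus
    print(cooccurrence_expand("car", CORPUS))  # terms related via co-occurrence
```

In this toy setup "engine" has no entry in the thesaurus, so the linguistic lookup cannot reach it, while the co-occurrence count surfaces it because it appears alongside "car" in two documents.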
For local analysis: relevance feedback performed better than pseudo-relevance feedback. The primary reason behind this is that pseudo-relevance feedback depends on the execution of the user’s initial query; if the initial query is poorly formulated or ambiguous, then the expansion terms extracted from the retrieved documents may not be relevant.
@ZahraTaherikhonakdar
This is a very good survey. Now, you understand the ReQue methods better, right?
Also, you see how a survey comes up with categorization and comparisons.