DelaramRajaei opened 1 year ago
@DelaramRajaei thanks for the summary. Would you please explain the entire pipeline with an example? Some parts are confusing to me, like "query change on the semantic level", etc.
@hosseinfani I added an example to the summary.
Main problem
In session search, the user's historical interactions with the search engine can improve document ranking performance. However, not all of this information is helpful, and some of it may mislead the query rewriting system. Some systems use this history but do not consider semantics. This paper focuses on rewriting a query by calculating word weights from the user's history of searching for that query, without losing the query's concept.
Related Works & Their Gaps
Gap: term-level information is lost, so the important terms in historical queries are not identified.
Proposed Method
HQCN model: Historical Query Change Aware Ranking Network
For example, a user is conducting a search session looking for information about computer science. The session context, denoted as S, includes the historical queries and their corresponding clicked documents. Here's an example of S:
S = [(Query: "Machine learning algorithms", Clicked Document: "Introduction to Machine Learning"),
(Query: "Python programming tutorials", Clicked Document: "Python for Beginners"),
(Query: "Data science tools", Clicked Document: "Top Data Science Tools 2023")]
The user enters a new query related to computer science, let's call it 𝑞𝑡: Current Query: "Deep learning frameworks comparison"
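To make the pipeline concrete, the session context S and the current query 𝑞𝑡 above could be laid out with a simple data structure like the following (a rough sketch; the field names are illustrative and not taken from the paper's code):

```python
# Hypothetical layout of the running example's session context S and current query q_t.
session_context = [
    {"query": "Machine learning algorithms",
     "clicked_doc": "Introduction to Machine Learning"},
    {"query": "Python programming tutorials",
     "clicked_doc": "Python for Beginners"},
    {"query": "Data science tools",
     "clicked_doc": "Top Data Science Tools 2023"},
]
current_query = "Deep learning frameworks comparison"  # q_t
```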
This model is divided into four parts:
Query term weighting (term-level change): Calculate the term-level query change with three sets of weights (for removed, added, and retained terms). Query change classification is used as an auxiliary learning task; the changed query falls into one of four classes: generalization, exploitation, exploration, or new task. In this example, "deep learning," "frameworks," and "comparison" are added terms, and there are no removed terms since the query starts an entirely new task.
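A minimal sketch of the term-level change idea, assuming a simple set-based split against the previous query (the actual model learns soft weights for removed, added, and retained terms rather than hard sets, and also predicts the change class):

```python
def term_level_change(prev_query: str, curr_query: str) -> dict:
    """Split the current query's terms into added / removed / retained sets
    relative to the previous query. This is a hard, set-based approximation;
    HQCN instead learns a weight per term for each of the three roles."""
    prev_terms = set(prev_query.lower().split())
    curr_terms = set(curr_query.lower().split())
    return {
        "added":    curr_terms - prev_terms,
        "removed":  prev_terms - curr_terms,
        "retained": curr_terms & prev_terms,
    }

# Running example: comparing q_t with the last historical query in S.
change = term_level_change("Data science tools", "Deep learning frameworks comparison")
```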
Representation-based matching (semantic-level change): Model the query change at the semantic level. Using the calculated query term weights, the system applies Transformers to the representations of the removed terms, added terms, retained terms, and the original query to model interactions between queries. This helps capture the evolving user intent and context. For example, it checks whether the candidate documents contain information about "Deep learning frameworks comparison", and the model assigns scores based on this matching process.
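A rough sketch of this matching component, assuming PyTorch and illustrative dimensions (the real model's architecture, sizes, and pooling strategy may differ):

```python
import torch
import torch.nn as nn

class SemanticChangeMatcher(nn.Module):
    """Toy version of representation-based matching: a Transformer encoder
    attends over the query representation together with the added / removed /
    retained term representations, and the pooled result is scored against a
    candidate-document representation."""
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.score = nn.Linear(dim, 1)

    def forward(self, change_reprs: torch.Tensor, doc_repr: torch.Tensor) -> torch.Tensor:
        # change_reprs: (batch, 4, dim) -- [query, added, removed, retained] representations
        # doc_repr:     (batch, dim)    -- candidate document representation
        mixed = self.encoder(change_reprs)        # interactions among the four views
        query_view = mixed.mean(dim=1)            # pool into a change-aware query vector
        return self.score(query_view * doc_repr)  # semantic matching score, shape (batch, 1)

# Toy usage with random embeddings standing in for real encoder outputs.
matcher = SemanticChangeMatcher()
scores = matcher(torch.randn(2, 4, 128), torch.randn(2, 128))
```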
Term-based interaction: Build fine-grained interactions between each query and the candidate document using attentive kernel pooling. HQCN also leverages the term weights to facilitate interactions between each candidate document and the historical queries in the session context. This interaction captures the relevance of the document to the previous queries in the session, i.e., how well the document aligns with the user's evolving information need.
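The sketch below shows a simplified, KNRM-style attentive kernel pooling over a query-document similarity matrix; the kernel means/width and the use of softmax term weights are common defaults assumed here, not values from the paper:

```python
import torch

def attentive_kernel_pooling(sim_matrix: torch.Tensor,
                             term_weights: torch.Tensor,
                             mus=(-0.5, 0.0, 0.5, 0.9, 1.0),
                             sigma: float = 0.1) -> torch.Tensor:
    """Pool a (q_len, d_len) cosine-similarity matrix with RBF kernels:
    each kernel softly counts similarities near its mean mu, the counts are
    log-summed over document terms, then combined over query terms using the
    term weights produced by the query-change component."""
    features = []
    for mu in mus:
        k = torch.exp(-((sim_matrix - mu) ** 2) / (2 * sigma ** 2))  # soft match counts
        per_query_term = torch.log1p(k.sum(dim=1))                   # pool over document terms
        features.append((term_weights * per_query_term).sum())       # attentive sum over query terms
    return torch.stack(features)  # one pooled feature per kernel

# Toy usage: 4 query terms, 10 document terms.
feats = attentive_kernel_pooling(torch.rand(4, 10) * 2 - 1,
                                 torch.softmax(torch.rand(4), dim=0))
```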
Document scoring: Calculate tf-idf, the number of common terms, and the similarity between the embeddings of each query and the candidate document. All the components are combined into a final ranking score for each candidate document for "Deep learning frameworks comparison." These scores determine the order in which documents appear in the search results: documents that best match the user's intent and session context receive higher scores and are ranked more prominently.
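As a simplified illustration of how these lexical signals could be combined for one candidate document (the equal weighting and the helper name are assumptions; how these features are combined with the neural matching scores is learned by the model):

```python
import math

def lexical_score(query: str, doc: str, doc_freq: dict, n_docs: int,
                  query_vec=None, doc_vec=None) -> float:
    """Combine a tf-idf sum over shared terms, the count of common terms,
    and (optionally) cosine similarity between query/document embeddings."""
    q_terms, d_terms = query.lower().split(), doc.lower().split()
    common = set(q_terms) & set(d_terms)
    tfidf = sum(
        d_terms.count(t) * math.log((n_docs + 1) / (doc_freq.get(t, 0) + 1))
        for t in common
    )
    cosine = 0.0
    if query_vec is not None and doc_vec is not None:
        dot = sum(a * b for a, b in zip(query_vec, doc_vec))
        norm = math.sqrt(sum(a * a for a in query_vec)) * math.sqrt(sum(b * b for b in doc_vec))
        cosine = dot / norm if norm else 0.0
    return tfidf + len(common) + cosine
```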
Input/Output
Input: The user's current query (together with the session context).
Output: A ranking score for each candidate document, used to sort the search results.
Data: AOL and Tiangong-ST