Why did I choose this paper? Because clarification can be easily mapped to the query expansion task.
Main problem:
The problem is to predict the user engagement level with clarification questions in web search, in order to decide when and how a clarification question should be shown so that user satisfaction increases. A sample clarification question: a user submits the query "how to set up a list in outlook" and a clarification pane pops up asking "which version of outlook?". The question is: is it important to ask this? Do the different versions of Outlook differ in how a list is set up? Does it matter? This paper addresses the problem by analyzing user engagement with these clarification questions.
Existing work:
The authors survey the literature by dividing the related work into two areas:
1. Conversational and web search clarification (multi-turn interaction)
Methods: reinforcement learning and Transformers
Gap: whether it is necessary to ask a clarification question at all is an unexplored topic
2. Engagement level prediction
Ways the engagement level has been estimated:
self-reported questionnaires
facial expressions
web analytics
user interactions with the clarification pane
Gap: lack of work based on the retrieved documents (search results).
Inputs:
initial query q,
clarification question c,
list of candidate answers A,
retrieved results R (SERP elements such as result page titles)
Outputs:
The user engagement level (y) in [0,10] for each input tuple.
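To make this input/output contract concrete, here is a minimal sketch of one example as a data structure, flattened into a single text sequence for a BERT-style encoder. The dataclass, field names, and [SEP]-separated serialization are my own assumptions for illustration, not the paper's exact format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ClarificationExample:
    """One example: (q, c, A, R) with engagement label y in [0, 10]."""
    query: str               # initial query q
    question: str            # clarification question c
    answers: List[str]       # candidate answers A shown in the clarification pane
    serp_titles: List[str]   # retrieved results R (here: result page titles only)
    engagement: int          # label y in [0, 10]

def serialize(ex: ClarificationExample) -> str:
    """Flatten the tuple into one text sequence; the separators are an assumption."""
    parts = [ex.query, ex.question, " ".join(ex.answers), " ".join(ex.serp_titles)]
    return " [SEP] ".join(parts)

example = ClarificationExample(
    query="how to set up a list in outlook",
    question="which version of outlook?",
    answers=["outlook 2016", "outlook 2013", "outlook on the web"],
    serp_titles=["Create a contact group or distribution list in Outlook"],
    engagement=4,  # made-up label, for illustration only
)
print(serialize(example))
```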
Method:
Given an input tuple (q, c, A, R), their model, called ELBERT, encodes the concatenated inputs with ALBERT and outputs a joint representation. Regression is then performed on this representation by appending two hidden layers (a small feed-forward head) to the end of the model.
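A minimal sketch of this architecture, assuming the Hugging Face albert-base-v2 checkpoint, the pooled output as the joint representation, and a head size of 256; these choices are assumptions, not necessarily the paper's exact configuration.

```python
import torch
import torch.nn as nn
from transformers import AlbertModel, AlbertTokenizerFast

class EngagementRegressor(nn.Module):
    """ALBERT encoder followed by a two-hidden-layer regression head (ELBERT-style sketch)."""
    def __init__(self, model_name: str = "albert-base-v2", hidden: int = 256):
        super().__init__()
        self.encoder = AlbertModel.from_pretrained(model_name)
        dim = self.encoder.config.hidden_size
        self.head = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar engagement prediction
        )

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.pooler_output  # joint representation of the whole input sequence
        return self.head(pooled).squeeze(-1)

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")
model = EngagementRegressor()
text = ("how to set up a list in outlook [SEP] which version of outlook? "
        "[SEP] outlook 2016 outlook 2013 [SEP] Create a contact group in Outlook")
batch = tokenizer([text], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    pred = model(batch["input_ids"], batch["attention_mask"])
print(pred)  # untrained output; training would minimize MSE against the label y
```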
Experiments:
Dataset: MIMICS, a large-scale collection of datasets for search clarification
Metrics: MSE (lower is better), MAE (lower is better), and the coefficient of determination R² (higher is better); a small sketch of computing these metrics over the trivial baselines follows the baseline list below.
Baselines:
Mean, Median, Normal sampling of the engagement levels
Linear Regression: least squares
SVR: linear and RBF kernels
Random Forests
LSTM: bidirectional
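To make the trivial baselines and the three metrics concrete, here is a small sketch using scikit-learn. The file name MIMICS-Click.tsv and the engagement_level column are assumptions about the MIMICS release and should be checked against the actual files; a real evaluation would also use a held-out test split rather than the same data the baselines are computed on.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Assumed file/column names; check the MIMICS repository for the real layout.
df = pd.read_csv("MIMICS-Click.tsv", sep="\t")
y = df["engagement_level"].to_numpy(dtype=float)

rng = np.random.default_rng(0)
baselines = {
    "Mean": np.full_like(y, y.mean()),
    "Median": np.full_like(y, np.median(y)),
    "Normal sampling": rng.normal(y.mean(), y.std(), size=len(y)).clip(0, 10),
}

for name, pred in baselines.items():
    print(f"{name:16s} MSE={mean_squared_error(y, pred):.3f} "
          f"MAE={mean_absolute_error(y, pred):.3f} "
          f"R2={r2_score(y, pred):.3f}")
```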
Results:
Performance comparison
Comparisons are done on the full dataset and on the subset where the engagement level is greater than zero. On the full dataset, the proposed method outperforms the other baselines on MSE and R², but the Median baseline wins on MAE: a large portion of the records have engagement level 0, and the Median baseline simply predicts 0, which keeps its absolute error small. On the engagement > 0 subset, however, the proposed method outperforms all baselines on every metric.
Effect of SERP elements
An ablation is run to measure the effect of adding SERP elements as inputs to the proposed model, and the results show that adding these elements consistently improves performance. The best setting combines the query with the titles of the result web pages; a rough sketch of how such an ablation loop could be organized is given below.
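This is only an illustration of the experimental loop, not the paper's code: the SERP element names, the build_input helper, and the sample data are all hypothetical, and the actual experiment would retrain and evaluate the regressor for each configuration.

```python
from itertools import chain, combinations
from typing import Dict, Iterable, List, Tuple

# Hypothetical SERP element names for illustration; the paper's exact set may differ.
SERP_ELEMENTS = ["titles", "snippets", "related_searches"]

def build_input(q: str, c: str, answers: List[str],
                serp: Dict[str, List[str]], use: Iterable[str]) -> str:
    """Append only the selected SERP elements to the clarification inputs."""
    parts = [q, c, " ".join(answers)]
    parts += [" ".join(serp[e]) for e in use if serp.get(e)]
    return " [SEP] ".join(parts)

def powerset(items: List[str]) -> Iterable[Tuple[str, ...]]:
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

serp = {"titles": ["Create a contact group in Outlook"],
        "snippets": ["A contact group, formerly a distribution list, lets you..."],
        "related_searches": ["outlook distribution list"]}

for subset in powerset(SERP_ELEMENTS):
    text = build_input("how to set up a list in outlook", "which version of outlook?",
                       ["outlook 2016", "outlook on the web"], serp, subset)
    # In the real experiment, the model would be retrained and scored
    # (MSE / MAE / R2) for each configuration; here we only show the inputs.
    print(subset or ("no SERP elements",), "->", text[:80], "...")
```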
Code:
The code of this paper is available here
Presentation:
There is no available presentation for this paper.