2021 CIKM PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling

(I need to investigate more on this paper. It contains technical concepts)

Main Problem: This paper proposes PSSL, a personalized search framework with self-supervised learning, which aims to improve data representation. The model uses two encoders:

1- Sentence encoder: learns the embeddings of queries and documents. For a given query, a user may click on several of the returned documents, so the representations of those documents should be close to each other (document pair). Conversely, a user may issue similar queries and click on the same document, in which case the representations of those queries should be close to each other (query pair).

2- Sequence encoder: learns users' representations from their long-term and short-term search behavior in order to detect users with similar behavior (user pair). The resulting model is then used for personalized document ranking.
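The idea of pulling paired representations close together can be illustrated with a contrastive loss. The summary above does not state which loss the paper uses, so the sketch below assumes a generic InfoNCE-style objective over cosine similarities. The function names `cosine` and `info_nce`, the `temperature` value, and the toy vectors are all hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not necessarily the paper's loss).

    The loss is small when the anchor is more similar to its positive
    (e.g. the other member of a document, query, or user pair) than to
    the negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denom - logits[0]


# Toy embeddings: the positive points in almost the same direction as the anchor.
anchor = [1.0, 0.0]
loss_close = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_far = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(loss_close < loss_far)
```

With this kind of objective, the same function could in principle be applied to document pairs, query pairs, and user pairs alike; only the encoder that produces the vectors would differ.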

Input-Output:

Input: users' search logs, from which the document, query, and user pairs are extracted.
Output: a personalized document ranking.
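How document pairs and query pairs might be pulled out of a search log can be sketched as follows. This is a minimal illustration under stated assumptions: the log schema `(user, query, clicked_doc)` and the function name `extract_pairs` are hypothetical, queries are paired by sharing a clicked document rather than by any similarity measure, and user pairs are left out because the summary does not say how behavioral similarity is measured.

```python
from collections import defaultdict
from itertools import combinations


def extract_pairs(log):
    """Build document pairs and query pairs from click records.

    log: iterable of (user, query, clicked_doc) tuples (hypothetical schema).
    """
    docs_by_user_query = defaultdict(set)
    queries_by_user_doc = defaultdict(set)
    for user, query, doc in log:
        docs_by_user_query[(user, query)].add(doc)
        queries_by_user_doc[(user, doc)].add(query)

    # Documents clicked by the same user under the same query form document pairs.
    doc_pairs = set()
    for docs in docs_by_user_query.values():
        doc_pairs.update(combinations(sorted(docs), 2))

    # Queries by the same user that led to a click on the same document form query pairs.
    query_pairs = set()
    for queries in queries_by_user_doc.values():
        query_pairs.update(combinations(sorted(queries), 2))

    return doc_pairs, query_pairs


# Toy log: one user, two queries, overlapping clicks.
log = [
    ("u1", "jaguar speed", "d1"),
    ("u1", "jaguar speed", "d2"),
    ("u1", "how fast is a jaguar", "d1"),
]
doc_pairs, query_pairs = extract_pairs(log)
print(doc_pairs)    # {('d1', 'd2')}
print(query_pairs)  # {('how fast is a jaguar', 'jaguar speed')}
```

Sorting before pairing keeps each pair in a canonical order, so the same pair is not counted twice with its members swapped.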

Previous Works and their Gaps: Previous personalized web search models extracted click-through features to understand users' search intent. With the advent of deep learning, personalized models evolved to build user profiles. Although these models improve user satisfaction, they have two shortcomings. First, they suffer from data sparsity, because they rely heavily on large amounts of training data. Second, they compute ranking scores based only on the relatedness between queries and documents. This paper proposes a self-supervised model that also considers the relatedness between users with similar behavior, in addition to query and document relatedness.

Result: The proposed model outperforms previous personalized web search models.

Data Set: They used the AOL dataset and a commercial dataset (they did not name the second one).

Gap of this work: This paper improves the personalized model by training on search logs only. Extracting personalized features from users' social information could further improve the model.

Code: https://github.com/smallporridge/PSSL
