hyz20 / CWM

The source code of "Counteracting Duration Bias in Video Recommendation via Counterfactual Watch Time" (KDD'24).

Question on Using D2co-defined Interest Labels #1

Open justopit opened 3 weeks ago

justopit commented 3 weeks ago

Hi,

I am curious about the performance of your model when using the interest labels originally defined by D2co. Could you provide some insights or results on this?

Thank you!

hyz20 commented 3 weeks ago

Hi justopit,

Thank you very much for your interest in our work. The interest label we defined in D2Co is

r_{\mathbf{x}} = \left\{
\begin{aligned}
    & 1,\quad \mathrm{if}~~ (d\leq18s \land w=d) \lor (d>18s \land w>18s) \mathrm{;} \\
    & 0,\quad \mathrm{else;} 
\end{aligned}
\right.

The one we defined in CWM is similar:

\begin{split}
    r_{\mathbf{x}} = \left\{
    \begin{aligned}
        & 1,\quad \mathrm{if}~~ (d\leq w_{0.7} \land w=d) \lor (d>w_{0.7} \land w>w_{0.7}) \mathrm{;} \\
        & 0,\quad \mathrm{else;} 
    \end{aligned}
    \right.
\end{split}

The only difference between these two definitions is that the fixed $18s$ threshold in D2Co is replaced with $w_{0.7}$ in CWM, where $w_{0.7}$ is the 70% quantile of watch time. In both the KuaiRand and WeChat datasets, $w_{0.7}$ is $17s$, which is close to the $18s$ we used in D2Co.
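For readers who want to compare the two labels on their own data, here is a minimal sketch of both definitions. It is not taken from the CWM or D2Co code: the function names (`interest_label`, `d2co_label`, `cwm_label`) are hypothetical, durations and watch times are assumed to be in seconds, watch time is assumed to be capped at the video duration so that $w=d$ means a complete view, and the quantile is assumed to be taken over whatever set of watch times you pass in.

```python
import numpy as np


def interest_label(duration, watch_time, threshold):
    """Binary interest label r_x for a given threshold (in seconds).

    r_x = 1 if (d <= threshold and w == d) or (d > threshold and w > threshold),
    otherwise 0. Assumes watch_time is capped at duration.
    """
    d = np.asarray(duration, dtype=float)
    w = np.asarray(watch_time, dtype=float)
    # Short videos count as positive only when watched to the end.
    short_and_completed = (d <= threshold) & (w == d)
    # Long videos count as positive when watched past the threshold.
    long_and_watched = (d > threshold) & (w > threshold)
    return (short_and_completed | long_and_watched).astype(int)


def d2co_label(duration, watch_time):
    """D2Co-style label: fixed 18-second threshold."""
    return interest_label(duration, watch_time, threshold=18.0)


def cwm_label(duration, watch_time, q=0.7):
    """CWM-style label: threshold is the q-quantile of watch time.

    Assumption: the quantile is computed over the watch times passed in.
    """
    w_q = np.quantile(np.asarray(watch_time, dtype=float), q)
    return interest_label(duration, watch_time, threshold=w_q)


if __name__ == "__main__":
    # Toy data (seconds), made up for illustration only.
    durations = [10, 10, 30, 30]
    watch_times = [10, 5, 20, 18]
    print("D2Co labels:", d2co_label(durations, watch_times))
    print("CWM labels: ", cwm_label(durations, watch_times))
```

Because the two thresholds are nearly equal on KuaiRand and WeChat ($17s$ versus $18s$), the two labels should differ only for records whose duration or watch time falls between those values.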

The $18s$ threshold is the definition of Long_view in the original KuaiRand dataset, whose authors considered this label a suitable indicator of user interest. However, during the review process of D2Co, we found that some reviewers did not approve of this explicit definition, so we replaced $18s$ with $w_{0.7}$ in CWM.

As you can see, unbiased evaluation of video watching is an open question. The definition of such an interest label that we have adopted may not be perfect either. We are also looking forward to follow-up work to address this open question.