XuZhao0 / Model-Selection-Reasoning

Model Selection with Large Language Models for Reasoning (EMNLP2023 Findings)

Question about the "Theoretical Analysis" in the paper #1


Tebmer commented 11 months ago

Hi there!

I've thoroughly read your excellent paper, and I am intrigued by the section "Theoretical Analysis". Your analysis is interesting and quite solid.

Your analysis proves that "it is possible to achieve improvement, even if we do not achieve $\rho_x > 0.5$ in some instances." This conclusion is convincing and intuitive.
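For concreteness, here is a toy sketch of how I read that claim (my own construction, not from the paper), taking $\rho_x$ as the probability that selection picks the stronger base method on instance $x$, with 0/1 per-instance accuracy. Imagine 10 instances where each base method solves 7, overlapping on 4:

```python
# Toy illustration (my own numbers, not from the paper).
# Where both methods are correct, selection scores 1 regardless of the choice.
both_correct = 4

# On the 6 disagreement instances exactly one method is correct, so the
# expected score on each is just rho_x (probability of picking the right one).
rho = [0.9, 0.9, 0.9, 0.9, 0.4, 0.4]  # rho_x < 0.5 on two instances

expected_selection = both_correct + sum(rho)  # 4 + 3.6 + 0.8 = 8.4
best_single_method = 7                        # each base method solves 7/10

print(expected_selection, best_single_method)  # 8.4 > 7: improvement holds
```

Even though $\rho_x = 0.4 < 0.5$ on two instances, the expected score (8.4) still beats the best single method (7), which matches the spirit of the theorem as I understand it.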

However, doesn't this theorem, while offering a strong foundation, primarily establish the "possibility" of improvement, rather than guaranteeing or even supporting the improvement from model selection?

As such, the critical question becomes: how can we design a method or metric that guarantees improvement from model selection?

I would like to exchange ideas and engage in a discussion with you on this topic :)

XuZhao0 commented 11 months ago

Thank you for your interest.

We provide a detailed analysis in Appendix A, where we construct such instances concretely. However, that construction is purely mathematical, so it is worth thinking about how to realize such cases from a practical point of view.

Please note the statement: "In particular, when $\alpha\ge 0.5$, we have $\rho \rightarrow 0$ as $\alpha\rightarrow 1$ and $(\delta/(n-T)) \rightarrow 0$ (with $\lambda = 1-\frac{\beta}{T\epsilon}(n-T-\delta)$)". Since $\alpha= \frac{T}{n}$ and $\alpha\rightarrow 1$, $T$ increases toward $n$, so $n-T$ shrinks. And since $\beta=\frac{\epsilon T}{n-T} \in (0,1)$, $\epsilon$ must decrease accordingly. So with $R(x) = -\epsilon$ for $x\in S[X]$, we want the two base methods to perform similarly on $S[X]$. In this case, we can have $\rho \rightarrow 0$.
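To make the scaling concrete, here is a minimal numerical sketch (illustrative values only). Note that substituting $\beta = \frac{\epsilon T}{n-T}$ into the $\lambda$ expression simplifies it to $\lambda = \frac{\delta}{n-T}$, which indeed goes to $0$ as $\delta/(n-T) \rightarrow 0$:

```python
# Minimal numerical sketch of the algebra above (illustrative values only).
# beta = eps * T / (n - T) < 1  forces  eps < (n - T) / T,
# so as alpha = T / n -> 1, the admissible eps (= |R(x)| on S[X]) shrinks.
n = 1000
for alpha in [0.6, 0.9, 0.99, 0.999]:
    T = int(alpha * n)
    eps_max = (n - T) / T  # upper bound on eps from beta in (0, 1)
    print(f"alpha={alpha}: T={T}, eps must be < {eps_max:.4f}")
```

For $n = 1000$, the bound on $\epsilon$ drops from about $0.67$ at $\alpha = 0.6$ to about $0.001$ at $\alpha = 0.999$, which is what forces the two base methods to perform similarly on $S[X]$.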

But as we discussed in Proposition 1, we want $|R(x)|$ to be large, so there is a trade-off. I am still thinking about how to better design the two base methods. We welcome your discussion and insights.