Closed dmitrysarov closed 5 months ago
Thanks for your contribution. The code looks good to me. I like that you allow the baseline to be configured in show_result.py when generating all the battles. If we configure a model as the baseline, we should also configure that model as the anchor when computing the Bradley-Terry coefficients and win rates. Could you add the code to configure that as well? I can merge it once you do. If not, I can do it instead.
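For readers following the thread, here is a minimal sketch of what anchoring Bradley-Terry scores to a configurable baseline can look like. It is not the code from this PR: the function names (fit_bradley_terry, anchor_to_baseline, win_rate_vs_baseline), the (winner, loser) battle format, the 400-point scale, and the 1000-point anchor are all assumptions chosen for illustration, and the fit uses a plain MM iteration rather than whatever show_result.py actually does.

```python
import math
from collections import defaultdict


def fit_bradley_terry(battles, iters=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs via MM iteration.

    Hypothetical helper for illustration only.
    """
    wins = defaultdict(float)
    pair_counts = defaultdict(float)
    models = set()
    for winner, loser in battles:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    strengths = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            denom = 0.0
            for j in models:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strengths[i] + strengths[j])
            # Floor keeps the logarithm below finite for models with no wins.
            updated[i] = max(wins[i] / denom, 1e-9) if denom else strengths[i]
        total = sum(updated.values())
        strengths = {m: v * len(models) / total for m, v in updated.items()}
    return strengths


def anchor_to_baseline(strengths, baseline_model, scale=400.0, anchor_rating=1000.0):
    """Convert strengths to Elo-like ratings, shifted so the baseline sits at anchor_rating."""
    raw = {m: scale * math.log10(s) for m, s in strengths.items()}
    offset = anchor_rating - raw[baseline_model]
    return {m: r + offset for m, r in raw.items()}


def win_rate_vs_baseline(ratings, model, baseline_model, scale=400.0):
    """Predicted probability that `model` beats the baseline under the fitted ratings."""
    return 1.0 / (1.0 + 10 ** ((ratings[baseline_model] - ratings[model]) / scale))


# Toy data: model names are made up for the example.
battles = (
    [("strong", "base")] * 3 + [("base", "strong")]
    + [("base", "weak")] * 3 + [("weak", "base")]
)
ratings = anchor_to_baseline(fit_bradley_terry(battles), baseline_model="base")
```

The point of the shift is that the baseline used to generate the battles and the model pinned to the anchor rating are the same, so every other model's rating and win rate is read relative to it.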
@CodingWithTim Hope I understood you correctly. I've added this part.
Thanks, you got the right idea. This is great! I think one last thing: get_bootstrap_result also needs to support the configurable baseline, because it calls compute_mle_elo internally. Could you add support for this as well? Thanks!
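As a rough illustration of the request, the sketch below shows a bootstrap loop that forwards the baseline to the scoring function on every resample. It does not reproduce the repository's get_bootstrap_result or compute_mle_elo: the names bootstrap_scores and win_rate_scorer, their signatures, and the (winner, loser) battle format are hypothetical, and the scorer is a simple stand-in (empirical win rate against the baseline) rather than a Bradley-Terry fit.

```python
import random


def win_rate_scorer(battles, baseline_model):
    """Stand-in scorer: each model's empirical win rate against the baseline."""
    wins, totals = {}, {}
    for winner, loser in battles:
        if baseline_model not in (winner, loser) or winner == loser:
            continue
        other = loser if winner == baseline_model else winner
        totals[other] = totals.get(other, 0) + 1
        if winner == other:
            wins[other] = wins.get(other, 0) + 1
    scores = {m: wins.get(m, 0) / n for m, n in totals.items()}
    scores[baseline_model] = 0.5  # the baseline is the fixed reference point
    return scores


def bootstrap_scores(battles, score_fn, baseline_model, num_rounds=100, seed=0):
    """Resample battles with replacement and score each resample.

    The same baseline_model is passed to score_fn in every round, so all
    rounds are anchored to the same reference.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(num_rounds):
        sample = rng.choices(battles, k=len(battles))
        results.append(score_fn(sample, baseline_model))
    return results


# Toy data: model names are made up for the example.
battles = [("model_a", "base")] * 6 + [("base", "model_a")] * 4
rounds = bootstrap_scores(battles, win_rate_scorer, baseline_model="base", num_rounds=50)
```

If the baseline were only applied to the point estimate and not to the bootstrap rounds, the confidence intervals would be computed against a different reference than the scores they accompany.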
@CodingWithTim yeah, overlooked that part, sorry. Now it's there
@dmitrysarov Sorry about the late review, was busy with upcoming releases. This code works wonderfully for me. We really appreciate your contributions!
slightly formatting
configurable number_of_judgment_attempts
configurable baseline_model in show_result.py