Closed Ivesfu closed 2 months ago
Thanks for your attention. Both use_vcd and use_cd have the same decoding strategy (i.e., use_cd) in the vcd_sample.py. cd_alpha is set to 1 (diffs = (1+cd_alpha)next_token_logits - cd_alphanext_token_logits_cd) and Adaptive Plausibility Constraints (v2) is the same as VCD. We add codes comments.
Thank you for your excellent work! I have a few questions after reviewing the code.
I understand that vcd_sample should support three modes: use_vcd, use_cd, and use_icd, but it seems that only the latter two are implemented. Additionally, at this line, the implementation appears to differ from the paper. Specifically, cd_alpha and cd_beta don’t seem to be utilized.
Could you please advise on how to modify the code to match the paper’s implementation? Have you considered updating the code to reflect the settings used in the paper?
Thank you.