Open dakru012 opened 15 hours ago
That's a good catch, thanks @dakru012! Do you want to submit a PR to fix it?
I think these lines within `concatenated_forward` are the culprit; the names should be [`ref_chosen_logps`, `ref_rejected_logps`] instead of [`chosen_logps`, `rejected_logps`], and then the same case needs to be handled in the `compute_ref_log_probs` function:
```python
output["chosen_logps"] = all_logps[:num_examples]
output["rejected_logps"] = all_logps[num_examples:]
```
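i.e. something like the following rename (an illustrative sketch only, reusing the variables from the snippet above):

```python
# Sketch of the proposed rename: store the reference model's log-probs
# under ref-prefixed keys so they don't collide with the policy model's
# chosen/rejected log-probs.
output["ref_chosen_logps"] = all_logps[:num_examples]
output["ref_rejected_logps"] = all_logps[num_examples:]
```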
Let me know if there is already a PR; otherwise I can include the relevant fixes in #2426 or make a new one.
@SwayamInSync I don't think that's the problem. I will take a look at it again and do a PR, but it is already midnight here so I gotta sleep first 😴
Hi, I have some questions about a potential issue, or a misunderstanding on my side. The point of `precompute_ref_log_probs` is to calculate the ref log probabilities for the whole dataset before the actual training process, so that later during training we can just load the precomputed probabilities and save the GPU memory the ref model would otherwise occupy, right? However, it seems like the precomputed log probabilities are never actually loaded.

In the corresponding part of `get_batch_loss_metrics()`:
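The relevant check looks roughly like this (a paraphrased sketch from memory, not the exact trl source):

```python
# Paraphrased sketch of the check in get_batch_loss_metrics(); the real
# code may differ in detail. The intent: reuse precomputed ref log-probs
# when the batch carries them, otherwise run the ref model forward pass.
if "ref_chosen_logps" in batch and "ref_rejected_logps" in batch:
    ref_chosen_logps = batch["ref_chosen_logps"]
    ref_rejected_logps = batch["ref_rejected_logps"]
else:
    ref_chosen_logps, ref_rejected_logps = self.compute_ref_log_probs(batch)
```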
The if condition is never true, even when the log probabilities were precomputed, resulting in unnecessary computations for the ref model. This is because the `PreferenceCollator` does not include `ref_chosen_logps` and `ref_rejected_logps` in the batch.

I made some changes to the collator to include those, but first I wanted to make sure that I understood the `precompute_ref_log_probs` argument correctly.
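For reference, the kind of change I tried is sketched below (hypothetical code, not a drop-in patch for trl's `PreferenceCollator`; it assumes each example dict carries scalar `ref_chosen_logps` / `ref_rejected_logps` values written during precomputation):

```python
import torch

def collate_with_ref_logps(examples, base_collate):
    """Hypothetical wrapper around an existing collate function that also
    batches the precomputed reference log-probs, if present on the examples."""
    batch = base_collate(examples)  # existing padding/tensorizing logic
    if "ref_chosen_logps" in examples[0]:
        batch["ref_chosen_logps"] = torch.tensor(
            [ex["ref_chosen_logps"] for ex in examples]
        )
        batch["ref_rejected_logps"] = torch.tensor(
            [ex["ref_rejected_logps"] for ex in examples]
        )
    return batch
```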