epfLLM / Megatron-LLM

distributed trainer for LLMs

Instruct loss scalar #58

Closed AleHD closed 10 months ago

AleHD commented 10 months ago

The current instruction-tuning implementation masks the losses of all "user" tokens to zero. This PR adds a --scalar_loss_mask argument that instead scales the loss on those tokens by a given value, e.g. 0.1, so that LLMs can also learn to produce "user" prompts. A minimal sketch follows.
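
As a rough illustration (not the PR's actual code; the function and tensor names here are assumptions), a scalar loss mask can be folded into a per-token cross-entropy loss like this:

```python
import torch

def instruct_lm_loss(logits, labels, loss_mask, scalar_loss_mask=0.0):
    # Per-token cross-entropy over the vocabulary, no reduction yet.
    losses = torch.nn.functional.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1), reduction="none"
    )
    # loss_mask is 1.0 on "assistant" tokens and 0.0 on "user" tokens.
    # Instead of zeroing user-token losses, weight them by scalar_loss_mask;
    # scalar_loss_mask=0.0 reproduces the original hard-masking behavior.
    weights = loss_mask + (1.0 - loss_mask) * scalar_loss_mask
    weights = weights.view(-1)
    return (losses * weights).sum() / weights.sum()
```

With scalar_loss_mask=0.1, "user" tokens contribute a tenth of the weight of "assistant" tokens rather than being dropped entirely.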

AleHD commented 10 months ago

Waiting on the metrics PR. Once #55 merges into main successfully, merging this branch will be easier.