Closed okuchaiev closed 9 months ago
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been inactive for 7 days since being marked as stale.
I propose
class TextGeneration
(from whichMegatronGPTModel
inherits) to add.chat(dict)
method. In addition to thatMegatronGPTModel
should have member method or properties which would allow to set or get model's template. That template should be serialized/retrieved frommodel._cfg
object and saved to .yaml file during model serialization.Context. Base models do not require this method. But aligned models require setting the correct template for how user, assistant turns as well as system prompt need to be presented to the model. This means that the user of aligned .nemo checkpoint need to chase model's developer or documentation to understand what template they should set.
Example usage:
If user calls
.chat
method on base model it should raise an exception saying chat template is not set.