ml-explore / mlx-examples

Examples in the MLX framework

Update LLM generation docs to use chat template #973

Closed · awni closed this 2 months ago

awni commented 2 months ago

Since the docs use an instruct model, they should really use a chat template as well to be correct.

Also added a __call__ overload to TokenizerWrapper to appease type-checking.

Closes #972
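
For context, the pattern the updated docs move to is roughly the following sketch, which wraps the user prompt with the tokenizer's chat template before generating (the model name here is only illustrative):

```python
from mlx_lm import load, generate

# Load an instruct model and its tokenizer (illustrative model id).
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

# Wrap the prompt in the model's chat template instead of passing raw text.
messages = [{"role": "user", "content": "Write a story about Einstein"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
```

Using the chat template matters because instruct models are fine-tuned on conversations in that format; passing a raw string tends to produce noticeably worse completions.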

Jonathan-Dobson commented 2 months ago

This patch does not seem to fix every type-check issue, mainly the apply_chat_template call:

Object of type "NaiveStreamingDetokenizer" is not callable
  Attribute "__call__" is unknown  Pylance[reportCallIssue]
(function) apply_chat_template: NaiveStreamingDetokenizer | Any

There are also a couple of issues with the imports:

[{
    "message": "\"load\" is not exported from module \"mlx_lm\"\n  Import from \"mlx_lm.utils\" instead",
},{
    "message": "\"generate\" is not exported from module \"mlx_lm\"\n  Import from \"mlx_lm.utils\" instead",
},{
    "message": "Object of type \"NaiveStreamingDetokenizer\" is not callable\n  Attribute \"__call__\" is unknown",
}]

I solved the import issues by modifying my imports as the lint message suggests (see the sketch below), but the NaiveStreamingDetokenizer error is still an issue even after adding the __call__ patch from this PR.
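
The import change the lint message points to looks like this (a sketch; the top-level imports still work at runtime, the diagnostic is only about what the package re-exports):

```python
# Importing from mlx_lm.utils silences the "not exported from module" warnings.
from mlx_lm.utils import load, generate
```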

awni commented 2 months ago

@Jonathan-Dobson if you put the linting / type-checking commands you use into a separate issue, that would be the best way for us to get them resolved, since then we can reproduce and fix them. (You are also welcome to send PRs yourself if you feel comfortable doing so.)