1) Seperate the structured representation and make it available at a dedicated field for people want to use it externally (e.g for using open ai api) (2) change existing formats to use this mechanism
There are downsides for using HF Tokenizer chat template - it requires access to the HF model page (e.g. sometime requires huggingface token login ).
I think we should consider a general jinja format - so people can just copy the jinja string and use it for formatting.
1) Seperate the structured representation and make it available at a dedicated field for people want to use it externally (e.g for using open ai api) (2) change existing formats to use this mechanism
There are downsides for using HF Tokenizer chat template - it requires access to the HF model page (e.g. sometime requires huggingface token login ). I think we should consider a general jinja format - so people can just copy the jinja string and use it for formatting.
Originally posted by @yoavkatz in https://github.com/IBM/unitxt/issues/988#issuecomment-2206379793