Closed: njalan closed this issue 5 months ago
@njalan could you provide some additional context on the setup (LLM, training data, etc.)?
@zainhoda I am using Baichuan-7B as the LLM, and the training data consists of question-SQL pairs plus documentation for business knowledge (a mix of Chinese and English). All the questions are asked in Chinese.
Sometimes it gives me the same query twice, separated by "-- 或者" (或者 means "or" in English). Is there a prompt that avoids this?
I think the easiest path might be to override the extract_sql method:
https://github.com/vanna-ai/vanna/blob/main/src/vanna/base/base.py#L126-L150
You can provide your own extract_sql method that removes the unnecessary characters.
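For illustration, here is a rough sketch of what such an override might look like. It is not taken from the vanna codebase: clean_sql and CleanSqlMixin are hypothetical names, and the sketch assumes extract_sql takes the raw LLM response as a string and returns the SQL string. It keeps only the text before a "-- 或者" separator, cuts anything after the first semicolon, and strips Chinese text trailing at the very end.

```python
import re

# Separator the model sometimes emits between two alternative queries.
_OR_SEPARATOR = re.compile(r"--\s*或者")

# Trailing run of whitespace, CJK punctuation, CJK ideographs and
# full-width forms at the very end of the text.
_TRAILING_CJK = re.compile(r"[\s\u3000-\u303f\u4e00-\u9fff\uff00-\uffef]+\Z")


def clean_sql(llm_response: str) -> str:
    """Hypothetical helper: keep the first query, drop trailing Chinese text."""
    # Keep only what comes before the "-- 或者" separator, if present.
    text = _OR_SEPARATOR.split(llm_response, maxsplit=1)[0]

    # If the statement is terminated, drop everything after the first ";".
    # Limitation: a ";" inside a string literal would cut the query short.
    end = text.find(";")
    if end != -1:
        text = text[: end + 1]

    # Remove Chinese commentary left at the end. Chinese inside a quoted
    # literal survives because the closing quote stops the match.
    return _TRAILING_CJK.sub("", text).strip()


class CleanSqlMixin:
    """Hypothetical mixin; list it before your Vanna class in the bases.

    Assumes the parent class defines extract_sql(llm_response) -> str.
    """

    def extract_sql(self, llm_response: str) -> str:
        # Let the parent pull the SQL out first, then clean the result.
        return clean_sql(super().extract_sql(llm_response))
```

You would then declare your class with CleanSqlMixin first in the base list so its extract_sql wraps the original one. The separator and character ranges are guesses based on the outputs described in this thread, so adjust them to whatever your model actually produces.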
Sometimes there is Chinese text at the end of the SQL: