BatsResearch / bonito

A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
BSD 3-Clause "New" or "Revised" License
702 stars 46 forks

What is the bare-bones template of Bonito? #16

Closed pacozaa closed 8 months ago

pacozaa commented 8 months ago

I would like to use this model with Ollama or llama.cpp, but first I would like a bare-bones explanation of Bonito's template. Would you mind giving a short explanation?

pacozaa commented 8 months ago

Ah ha, your paper explains it!

```
<|tasktype|>
Yes-no question answering
<|context|>
Zinedine Zidane -- After retiring as a player, Zidane
transitioned into coaching, becoming assistant coach at
Real Madrid… after the victory, he resigned as Real
Madrid coach.
<|task|>
```
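The template above can be assembled programmatically. Here is a minimal sketch; the function name is mine, and the exact whitespace may differ slightly from what the library does in `abstract.py`:

```python
def build_bonito_prompt(task_type: str, context: str) -> str:
    """Assemble a Bonito-style prompt: the task type, then the
    passage of context, ending with the <|task|> marker where the
    model begins generating the synthetic instruction."""
    return f"<|tasktype|>\n{task_type}\n<|context|>\n{context}\n<|task|>\n"


prompt = build_bonito_prompt(
    "Yes-no question answering",
    "Zinedine Zidane -- After retiring as a player, Zidane "
    "transitioned into coaching...",
)
```

You would then feed `prompt` to the model (via transformers, llama.cpp, or Ollama) and let it complete from `<|task|>`.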

Still don't mind more explanation and examples though

Cheers!

nihalnayak commented 8 months ago

You are right. We have included the template in the paper. We have also included the preprocessing step in abstract.py. Please look at the following lines of code.

https://github.com/BatsResearch/bonito/blob/0b6b23ddc1e0aaeefcd9072352e7166a281f5706/bonito/abstract.py#L26-L71
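For readers who can't follow the permalink: the paper describes the model emitting the synthetic task input and its answer separated by a `<|pipe|>` token, which the library's post-processing splits apart. A hedged sketch of that step (function name and error handling are mine, not the library's exact code):

```python
def split_generated_pair(prediction: str):
    """Split a Bonito generation into (task input, task output).

    The generated text is expected to contain the task input and
    the answer separated by the <|pipe|> token, per the paper.
    Returns None for malformed generations so the caller can drop them.
    """
    if "<|pipe|>" not in prediction:
        return None
    task_input, task_output = prediction.split("<|pipe|>", 1)
    return task_input.strip(), task_output.strip()
```

This is why the raw model output on Ollama or llama.cpp looks odd until you split on `<|pipe|>` yourself.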

Hope this helps! đŸ˜„

pacozaa commented 8 months ago

In case anyone stumbles on this, here is an Ollama model you can run: https://ollama.com/pacozaa/bonito

And here is an article on quantizing and converting the model to GGUF: https://medium.com/@sarinsuriyakoon/convert-pytorch-model-to-quantize-gguf-to-run-on-ollama-5c5dbc458208