Closed — DamianB-BitFlipper closed this issue 5 months ago
We chose Dolphin Phi over regular Phi, but you can use any tuned Phi model; just update:
https://github.com/collabora/WhisperFusion/blob/main/docker/scripts/build-dolphin-2_6-phi-2.sh#L10
with whatever model you want to use, but make sure to update the model format as well. On paper, Dolphin Phi outperforms regular Phi on certain tasks, which was the main reason we added it in the first place. In practice we couldn't see a huge difference between the two, but since both models worked, we left Dolphin as the default. We tested Mistral and Llama as well, but both models are bigger, which comes with extra latency.
We are currently evaluating whether we can fine-tune Phi on specific tasks and use a mixture of tuned Phi models, which would give us the best of both worlds.
What do you mean by “update the model format as well”?
Different models use different prompt formats:
https://github.com/collabora/WhisperFusion/blob/main/llm_service.py#L170-L188
implements:
https://huggingface.co/TheBloke/dolphin-2_6-phi-2-GGUF#prompt-template-chatml
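For reference, here is a minimal sketch of the ChatML template that dolphin-2_6-phi-2 expects (per the Hugging Face model card linked above); the helper name and example messages are illustrative, not taken from llm_service.py:

```python
# Sketch of the ChatML prompt template used by dolphin-2_6-phi-2.
# `format_chatml` is a hypothetical helper, not WhisperFusion's actual code.
def format_chatml(system: str, user: str) -> str:
    """Wrap a system prompt and user message in ChatML markers."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # model continues from here
    )

prompt = format_chatml("You are a helpful assistant.", "Hello!")
print(prompt)
```

If you swap in a model with a different template (e.g. Llama's `[INST] ... [/INST]` style), this wrapper is the part that has to change.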
Oh I understand now. Thanks for the clarification.
I was just curious to understand your decision process on this.