collabora / WhisperFusion

WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.

Why the preference for Dolphin Phi over regular Phi, Mistral or Llama2? #22

Closed DamianB-BitFlipper closed 5 months ago

DamianB-BitFlipper commented 5 months ago

Just curious to understand your decision process on this.

zoq commented 5 months ago

Dolphin Phi over regular Phi: you can use any fine-tuned Phi model; just update:

https://github.com/collabora/WhisperFusion/blob/main/docker/scripts/build-dolphin-2_6-phi-2.sh#L10

with whatever model you want to use, but make sure to update the model format as well. On paper, Dolphin Phi outperforms regular Phi on certain tasks, which was the main reason we added it in the first place. That said, we couldn't really see a huge difference between the two, but since both models worked we left Dolphin as the default. We tested Mistral and Llama as well, but both models are bigger, which comes with extra latency.

We are currently evaluating whether we can fine-tune Phi on specific tasks and use a mixture of tuned Phi models, which would give us a good balance of both worlds.

DamianB-BitFlipper commented 5 months ago

What do you mean by "update the model format as well"?

zoq commented 5 months ago

Different models use different prompt formats:

https://github.com/collabora/WhisperFusion/blob/main/llm_service.py#L170-L188

implements:

https://huggingface.co/TheBloke/dolphin-2_6-phi-2-GGUF#prompt-template-chatml
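For reference, a ChatML-style prompt (the template the Dolphin Phi model card documents) can be built roughly as follows. This is a minimal sketch based on the template shown on the model card, not WhisperFusion's actual code; the function name `format_chatml` is hypothetical, and the real implementation in `llm_service.py` may differ in detail:

```python
def format_chatml(system_message: str, prompt: str) -> str:
    """Build a ChatML-style prompt string as documented on the
    TheBloke/dolphin-2_6-phi-2-GGUF model card (hypothetical helper)."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Example: the model continues generating after the final assistant tag.
print(format_chatml("You are a helpful assistant.", "Hello!"))
```

Swapping in a model with a different prompt template (e.g. Llama-style `[INST] ... [/INST]`) would mean replacing this formatting logic to match that model's expected tags.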

DamianB-BitFlipper commented 5 months ago

Oh, I understand now. Thanks for the clarification.