microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

Python: migrate GH Action dependencies for Ollama model downloads, move to Azure assets #9543

Open moonbox3 opened 3 weeks ago

moonbox3 commented 3 weeks ago

During the SK integration tests, we pull five Ollama models:

ollama pull ${{ vars.OLLAMA_CHAT_MODEL_ID }}
ollama pull ${{ vars.OLLAMA_CHAT_MODEL_ID_IMAGE }}
ollama pull ${{ vars.OLLAMA_CHAT_MODEL_ID_TOOL_CALL }}
ollama pull ${{ vars.OLLAMA_TEXT_MODEL_ID }}
ollama pull ${{ vars.OLLAMA_EMBEDDING_MODEL_ID }}

These pulls put a lot of stress on the network, are prone to failure, and add latency. For example, this run failed during a pull: https://github.com/microsoft/semantic-kernel/actions/runs/11685739831/job/32539906324#step:10:966
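As a stopgap until the dependencies are migrated, the pulls could be wrapped in a retry. This is only a sketch and not what the workflow does today: `pull_with_retry`, its parameters, and the backoff values are hypothetical, and it assumes the `ollama` CLI is on the PATH.

```python
import subprocess
import time


def pull_with_retry(model, attempts=3, base_delay=2.0,
                    run=subprocess.run, sleep=time.sleep):
    """Run `ollama pull <model>`, retrying with exponential backoff.

    Returns the number of attempts used, or raises RuntimeError if every
    attempt fails. `run` and `sleep` are injectable (hypothetical hooks)
    so the retry logic can be exercised without network access.
    """
    for attempt in range(1, attempts + 1):
        result = run(["ollama", "pull", model])
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Back off base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"ollama pull {model} failed after {attempts} attempts")
```

A retry only hides transient failures; it does not reduce the network load or the latency, so it would not replace the migration itself.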

We do need coverage for the AI connectors. However, it may make more sense to deploy one Azure resource that Ollama can use for all chat-completion-related operations, plus one embedding model for the embedding tests.
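To make the consolidation concrete, here is a rough sketch of how sharing one model across the chat scenarios would shrink the download list. It assumes the five repository variables are exposed to the script as environment variables under the same names; `models_to_pull` is a hypothetical helper, not something in the repo.

```python
import os

# Variable names taken from the workflow; exposing them as environment
# variables under the same names is an assumption of this sketch.
MODEL_VARS = [
    "OLLAMA_CHAT_MODEL_ID",
    "OLLAMA_CHAT_MODEL_ID_IMAGE",
    "OLLAMA_CHAT_MODEL_ID_TOOL_CALL",
    "OLLAMA_TEXT_MODEL_ID",
    "OLLAMA_EMBEDDING_MODEL_ID",
]


def models_to_pull(env=None):
    """Return the distinct, non-empty model IDs in first-seen order."""
    env = os.environ if env is None else env
    unique = []
    for name in MODEL_VARS:
        model = env.get(name, "").strip()
        if model and model not in unique:
            unique.append(model)
    return unique
```

If the chat and text variables all pointed at the same model, this would return two entries (that model and the embedding model) instead of five.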

moonbox3 commented 3 weeks ago

Either of these small models may work for the tool-call scenario: https://ollama.com/library/smollm2:135m or https://ollama.com/library/moondream