Add Ollama inference mocks

Summary: This commit adds mock support for Ollama inference testing.

Use --mock-overrides during your test run:

pytest llama_stack/providers/tests/inference/test_text_inference.py -m "ollama" --mock-overrides inference=ollama --inference-model Llama3.2-1B-Instruct

The tests then run against the Ollama provider, with a mock adapter standing in for the real one.
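As a rough illustration of the idea (not the code in this commit), the sketch below shows how an `api=provider` override string such as `inference=ollama` could be parsed, and how a mock adapter could return a canned reply without contacting an Ollama server. The helper names `parse_mock_overrides` and `make_mock_adapter`, and the `chat_completion` call shape, are assumptions made for this example only.

```python
import asyncio
from unittest.mock import AsyncMock


def parse_mock_overrides(values):
    """Parse 'api=provider' strings into a dict (hypothetical helper)."""
    overrides = {}
    for value in values:
        api, sep, provider = value.partition("=")
        if not sep or not api or not provider:
            raise ValueError(f"expected api=provider, got {value!r}")
        overrides[api] = provider
    return overrides


def make_mock_adapter(canned_text):
    """Build a stand-in adapter whose async method returns canned text."""
    adapter = AsyncMock()
    # Hypothetical method name; awaiting it yields the canned reply.
    adapter.chat_completion.return_value = canned_text
    return adapter


overrides = parse_mock_overrides(["inference=ollama"])
adapter = make_mock_adapter("hello from the mock")
reply = asyncio.run(
    adapter.chat_completion(model="Llama3.2-1B-Instruct", messages=[])
)
```

Because the adapter is an `AsyncMock`, a test can also check how it was called (for example with `assert_awaited_once`), which is one reason mocked runs can finish in a fraction of a second.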

Test Plan: Run tests

pytest llama_stack/providers/tests/inference/test_text_inference.py -m "ollama" --mock-overrides inference=ollama --inference-model Llama3.2-1B-Instruct -v -s --tb=short --disable-warnings

====================================================================================================== test session starts ======================================================================================================
platform darwin -- Python 3.11.10, pytest-8.3.3, pluggy-1.5.0 -- /opt/homebrew/Caskroom/miniconda/base/envs/llama-stack/bin/python
cachedir: .pytest_cache
rootdir: /Users/vivic/Code/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.24.0, anyio-4.6.2.post1
asyncio: mode=Mode.STRICT, default_loop_scope=None
collected 56 items / 48 deselected / 8 selected

llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_model_list[-ollama] Overriding inference=ollama with mocks from inference_ollama_mocks
Resolved 4 providers
 inner-inference => ollama
 models => __routing_table__
 inference => __autorouted__
 inspect => __builtin__

Models: Llama3.2-1B-Instruct served by ollama

PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_completion[-ollama] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_completions_structured_output[-ollama] SKIPPED (This test is not quite robust)
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_non_streaming[-ollama] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_structured_output[-ollama] SKIPPED (Other inference providers don't support structured output yet)
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_streaming[-ollama] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_with_tool_calling[-ollama] PASSED
llama_stack/providers/tests/inference/test_text_inference.py::TestInference::test_chat_completion_with_tool_calling_streaming[-ollama] PASSED

==================================================================================== 6 passed, 2 skipped, 48 deselected, 6 warnings in 0.11s ====================================================================================

Stack created with Sapling. Best reviewed with ReviewStack.

-> #503
490
