parfeniukink opened 2 months ago
## Summary

The Deepsparse Backend interface is implemented:

- `pyproject.toml` file: `ruff` errors are suppressed
- `DeepsparseBackend`: gets configurations from CLI, environment, or defaults
- `settings.py::DeepsparseSettings`: includes all the settings
- `tests/unit/backend/deepsparse.py`: includes unit tests
- `TestTextGenerationPipeline`: mocks the `deepsparse.pipeline.Pipeline` (see the test sketch after this list)
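For illustration, here is a minimal sketch of how a test like `TestTextGenerationPipeline` could mock `deepsparse.pipeline.Pipeline` so no real model weights are loaded. The prompt, output shape, and test name are assumptions for the sketch, not the PR's actual code:

```python
# Hypothetical test sketch; assumes deepsparse is installed and that the
# text-generation output exposes .generations[i].text. Not the PR's code.
from unittest.mock import MagicMock, patch


def test_text_generation_pipeline_is_mocked():
    # Fake pipeline: calling it returns an object with a .generations list.
    fake_output = MagicMock()
    fake_output.generations = [MagicMock(text="mocked completion")]
    fake_pipeline = MagicMock(return_value=fake_output)

    # Patch Pipeline.create so the backend never loads a real model.
    with patch("deepsparse.pipeline.Pipeline.create", return_value=fake_pipeline):
        from deepsparse.pipeline import Pipeline

        pipeline = Pipeline.create(
            task="text_generation", model_path="/local/path/my_model"
        )
        result = pipeline(prompt="def fibonacci(n):")

    assert result.generations[0].text == "mocked completion"
```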
## Usage

This is an example of a command you can use in your terminal:

```sh
python -m src.guidellm.main --data=openai_humaneval --max-requests=1 --max-seconds=20 --rate-type=constant --rate=1.0 --backend=deepsparse --model=/local-path
```

- `--data=openai_humaneval`: determines the dataset
- `--model=/local/path/my_model`: determines the local path to the model object. If not specified, the environment variable will be used.

### Environment configuration

The model can also be set with `GUIDELLM__LLM_MODEL`. If neither the CLI value nor the environment variable is set, the default is used; currently, the default model is `mistralai/Mistral-7B-Instruct-v0.3`. A sketch of this precedence is shown below.
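As a rough sketch of that CLI > environment > default precedence, assuming a pydantic-settings style class; the field name, prefix, and helper below are assumptions for illustration, not necessarily how the PR's `DeepsparseSettings` is written:

```python
# Minimal sketch of CLI > environment > default resolution.
# Assumes pydantic-settings; names are illustrative, not the PR's code.
from typing import Optional

from pydantic_settings import BaseSettings, SettingsConfigDict


class DeepsparseSettings(BaseSettings):
    # With this prefix, the field below is populated from GUIDELLM__LLM_MODEL.
    model_config = SettingsConfigDict(env_prefix="GUIDELLM__")

    # Default used when neither the CLI flag nor the env variable is set.
    llm_model: str = "mistralai/Mistral-7B-Instruct-v0.3"


def resolve_model(cli_value: Optional[str]) -> str:
    """CLI flag wins; otherwise GUIDELLM__LLM_MODEL; otherwise the default."""
    return cli_value if cli_value is not None else DeepsparseSettings().llm_model
```

With this shape, `resolve_model("/local-path")` honours the `--model` flag, while `resolve_model(None)` falls back to `GUIDELLM__LLM_MODEL` and then to the hard-coded default.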