neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

[Pipeline Refactor] Add `Pipeline.create` method to initialize pipelines #1457

Closed: dsikka closed this 8 months ago

dsikka commented 8 months ago

Summary

Testing

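from deepsparse.v2.pipeline import Pipeline

# Model stub resolved from the Hugging Face Hub (note the "hf:" prefix)
model_path = "hf:neuralmagic/mpt-7b-chat-pruned50-quant"

# Build a text-generation pipeline through the new Pipeline.create entry point
pipeline = Pipeline.create(
    task="text_generation",
    model_path=model_path,
    engine_kwargs={"engine_type": "deepsparse"},
    internal_kv_cache=True,
    continuous_batch_sizes=[2, 4],
)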
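
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of how a `create`-style factory classmethod can dispatch on a task name. It is not the code from this PR: `SketchPipeline`, `SketchTextGeneration`, the `register` decorator, and the `_registry` dict are all hypothetical names chosen for illustration, and the real `Pipeline.create` may resolve tasks and arguments differently.

```python
from typing import Any, Callable, Dict, Type


class SketchPipeline:
    """Hypothetical stand-in for a pipeline base class (not deepsparse's Pipeline)."""

    # Maps a normalized task name to the class that implements it (assumed design).
    _registry: Dict[str, Type["SketchPipeline"]] = {}

    def __init__(self, **kwargs: Any) -> None:
        # Keep the keyword arguments so a caller can inspect what was forwarded.
        self.kwargs = kwargs

    @staticmethod
    def _normalize(task: str) -> str:
        # Treat "text-generation" and "text_generation" as the same task.
        return task.lower().replace("-", "_")

    @classmethod
    def register(
        cls, task: str
    ) -> Callable[[Type["SketchPipeline"]], Type["SketchPipeline"]]:
        """Class decorator that records a pipeline class under a task name."""

        def decorator(pipeline_cls: Type["SketchPipeline"]) -> Type["SketchPipeline"]:
            SketchPipeline._registry[cls._normalize(task)] = pipeline_cls
            return pipeline_cls

        return decorator

    @classmethod
    def create(cls, task: str, **kwargs: Any) -> "SketchPipeline":
        """Look up the class registered for `task` and instantiate it."""
        key = cls._normalize(task)
        if key not in SketchPipeline._registry:
            known = sorted(SketchPipeline._registry)
            raise ValueError(f"Unknown task {task!r}; known tasks: {known}")
        return SketchPipeline._registry[key](**kwargs)


@SketchPipeline.register("text_generation")
class SketchTextGeneration(SketchPipeline):
    """Hypothetical task-specific pipeline; does no real inference."""


# Same call shape as the testing snippet, but against the sketch classes.
pipeline = SketchPipeline.create(
    task="text_generation",
    model_path="hf:neuralmagic/mpt-7b-chat-pruned50-quant",
    continuous_batch_sizes=[2, 4],
)
```

A factory of this shape lets callers select a pipeline by task string and pass everything else through as keyword arguments, which is the calling convention the testing snippet exercises.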