neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

[Pipeline Refactor][Text Generation] Add `parse_inputs` operator to TextGeneration #1468

Closed: dsikka closed this 8 months ago

dsikka commented 8 months ago

Summary

Examples of cases which now work:

With the following pipeline:


pipeline = Pipeline.create(
    task="text_generation",
    model_path=model_path,
    engine_type="deepsparse",
    internal_kv_cache=True,
)

Not explicitly listing prompt or sequences:

output = pipeline(
    "What is your favourite snack?", num_return_sequences=2, do_sample=True
)

Listing sequences:

output = pipeline(
    sequences="What is your favourite snack?", num_return_sequences=2, do_sample=True
)

Listing prompt:

output = pipeline(
    prompt="What is your favourite snack?", num_return_sequences=2, do_sample=True
)
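The three call forms above are meant to be equivalent. As a rough illustration of the kind of normalization an input-parsing step could perform, here is a minimal standalone sketch. The helper name `normalize_prompt_input` and the returned dictionary shape are assumptions made for illustration; this is not the `parse_inputs` operator from this PR.

```python
from typing import Any, Dict


def normalize_prompt_input(*args: Any, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: map positional, prompt= and sequences= calls to one shape."""
    remaining = dict(kwargs)
    # Which of the two accepted keyword names were supplied, if any
    given = [key for key in ("prompt", "sequences") if key in remaining]

    if len(args) > 1:
        raise ValueError("expected at most one positional input")
    if len(args) + len(given) != 1:
        raise ValueError(
            "provide the text exactly once: positionally, as prompt=, or as sequences="
        )

    # Take the text from wherever it was supplied
    text = args[0] if args else remaining.pop(given[0])
    # Store a single string as a one-element list so downstream code sees one type
    sequences = [text] if isinstance(text, str) else list(text)

    # Everything left over is treated as generation parameters
    return {"sequences": sequences, "generation_kwargs": remaining}
```

Under these assumptions, calling the helper with the text positionally, as `prompt=`, or as `sequences=` yields the same result, and the remaining keyword arguments (for example `num_return_sequences` and `do_sample`) are collected separately.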

Passing generation parameters as kwargs:

output = pipeline(
    "What is your favourite snack?", num_return_sequences=2, do_sample=True
)

Passing generation_config and overriding with kwargs:

output = pipeline(
    "What is your favourite snack?", generation_config={"max_length": 50}, output_scores=True
)
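To make the "overriding with kwargs" behaviour concrete, the sketch below merges a `generation_config` dictionary with extra keyword arguments, letting the keyword arguments win on conflict. The function name `resolve_generation_config` is hypothetical and the merge is a plain dictionary update; the actual pipeline may resolve these values differently.

```python
from typing import Any, Dict, Optional


def resolve_generation_config(
    generation_config: Optional[Dict[str, Any]] = None, **kwargs: Any
) -> Dict[str, Any]:
    """Hypothetical helper: combine a base config with per-call overrides."""
    # Copy so the caller's dictionary is never mutated
    resolved = dict(generation_config or {})
    # Keyword arguments take precedence over values in generation_config
    resolved.update(kwargs)
    return resolved
```

With this sketch, `resolve_generation_config({"max_length": 50}, output_scores=True)` keeps `max_length` from the config and adds `output_scores` from the keyword arguments.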