neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

[Pipeline Refactor] Migration #1460

Closed dsikka closed 8 months ago

dsikka commented 8 months ago

Summary

Testing

  1. You can load the new pipelines using the normal Pipeline.create(...) method.
  2. If a pipeline has not yet been registered with the new registry (i.e., not migrated to the new framework), Pipeline.create(...) still works; it falls back to the legacy pipeline class under the hood.
  3. To run the legacy version of a pipeline whose task has already been migrated (old text generation and old image classification), you have to use the legacy Pipeline under legacy/pipeline.py.

All 3 examples are shown below.
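Conceptually, the create-with-fallback dispatch described in points 1 and 2 works like the following. This is a minimal, self-contained sketch with hypothetical names (`NEW_REGISTRY`, `LEGACY_REGISTRY`, `create`), not the actual deepsparse internals:

```python
# Hypothetical sketch of "new registry with legacy fallback" dispatch.
# These registries and the create() helper are illustrative only --
# not the real deepsparse implementation.

NEW_REGISTRY = {"text_generation": "NewTextGenerationPipeline"}
LEGACY_REGISTRY = {
    "text_generation": "LegacyTextGenerationPipeline",
    "sentiment-analysis": "LegacySentimentAnalysisPipeline",
}


def create(task: str) -> str:
    """Prefer a task registered with the new framework; otherwise
    fall back to the legacy pipeline class under the hood."""
    if task in NEW_REGISTRY:
        return NEW_REGISTRY[task]
    if task in LEGACY_REGISTRY:
        return LEGACY_REGISTRY[task]
    raise ValueError(f"unknown task: {task}")


# Migrated task resolves to the new framework; an unmigrated
# task transparently falls back to its legacy class.
print(create("text_generation"))
print(create("sentiment-analysis"))
```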

Example:

Run the new text generation pipeline (with continuous batching, if that's what your heart desires):


```python
from deepsparse import Pipeline
from deepsparse.transformers.schemas.text_generation_schemas import TextGenerationInput

# Same model as in the legacy example below
model_path = "hf:neuralmagic/mpt-7b-chat-pruned50-quant"
pipeline = Pipeline.create(
    task="text_generation",
    model_path=model_path,
    engine_type="deepsparse",
    internal_kv_cache=False,
    continuous_batch_sizes=[2, 4],
)

prompts = [["Hello there!", "The sun shined bright", "The dog barked"]]
for prompt in prompts:
    input_value = TextGenerationInput(
        prompt=prompt,
        generation_kwargs={
            "num_return_sequences": 4,
            "max_new_tokens": 20,
            "do_sample": True,
        },
    )
    output = pipeline(input_value)
    for generation in output.generations:
        print(generation)
        print("\n")
```

Run the old text_generation pipeline:


```python
from deepsparse.legacy.pipeline import Pipeline
from deepsparse.transformers.schemas.text_generation_schemas import TextGenerationInput

model_path = "hf:neuralmagic/mpt-7b-chat-pruned50-quant"
pipeline = Pipeline.create(
    task="text_generation",
    model_path=model_path,
    engine_type="deepsparse",
    internal_kv_cache=True,
)

prompts = [["Hello there!", "The sun shined bright", "The dog barked"]]
input_value = TextGenerationInput(
    prompt=prompts[0],
    generation_kwargs={
        "num_return_sequences": 4,
        "max_new_tokens": 20,
        "do_sample": True,
    },
)

output = pipeline(input_value)
for generation in output.generations:
    print(generation)
    print("\n")
```

Run any pipeline that has not yet been migrated to the new Pipeline class/framework:

```python
from deepsparse import Pipeline

sa_pipeline = Pipeline.create(
    task="sentiment-analysis",
    model_path="zoo:bert-large-sst2_wikipedia_bookcorpus-pruned90_quantized",
)

inference = sa_pipeline("I love it!")
```

Next Steps