neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs

https://neuralmagic.com/deepsparse/

[Text Generation][V2] End-to-end tests #1402

Closed dbogunowicz closed 11 months ago

dbogunowicz commented 11 months ago

Feature Description

Successfully migrated LLM end-to-end tests from v1 to v2. Missing elements:

- Tests for the no-kv-cache pipeline are currently disabled (they will be enabled once TextGenerationPipelineNoCache is implemented).
- We are not testing the kv cache state post-inference. To do so, we need to enable the debug flag, as we did previously in v1. I do not see this as a blocker for this PR, but I propose scoping it as part of the next sprint.
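
For reference, a post-inference kv cache check could look roughly like the sketch below. It does not use the DeepSparse API: `FakeKVCacheState`, its fields, and `check_kv_cache_state` are hypothetical stand-ins for whatever the debug flag would expose in v2, and the expected token count is passed in by the caller rather than derived from the pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeKVCacheState:
    """Hypothetical stand-in for the state a debug-enabled pipeline might expose."""

    # Number of tokens the cache reports as processed.
    num_processed_tokens: int
    # Cache entries keyed by tensor name; one item per cached token (assumption).
    entries: Dict[str, List[float]] = field(default_factory=dict)


def check_kv_cache_state(state: FakeKVCacheState, expected_tokens: int) -> List[str]:
    """Return a list of problems found; an empty list means the state is consistent."""
    problems = []
    if state.num_processed_tokens != expected_tokens:
        problems.append(
            f"processed {state.num_processed_tokens} tokens, expected {expected_tokens}"
        )
    for name, values in state.entries.items():
        if len(values) != expected_tokens:
            problems.append(
                f"{name} holds {len(values)} entries, expected {expected_tokens}"
            )
    return problems


# Example states: one consistent, one with a truncated cache entry.
good_state = FakeKVCacheState(
    num_processed_tokens=3,
    entries={"layer0.key": [0.1, 0.2, 0.3], "layer0.value": [0.4, 0.5, 0.6]},
)
bad_state = FakeKVCacheState(
    num_processed_tokens=3,
    entries={"layer0.key": [0.1, 0.2]},
)
```

In a test, the returned list would be asserted empty, so a failure message names every inconsistent cache entry at once instead of stopping at the first one.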

Testing

Both `pytest tests/deepsparse/v2/unit/` and `pytest tests/deepsparse/v2/integration_tests/` are green.