Successfully migrated LLM end-to-end tests from v1 to v2.
Missing elements:
Tests for the no-kv-cache pipeline are currently disabled (they will be enabled once TextGenerationPipelineNoCache is implemented).
We are not testing the kv cache state post-inference. To do so, we need to enable the debug flag, as we did previously in v1. I do not see it as a blocker for this PR, but I'd propose to scope it as a part of the next sprint.
Testing
Both
pytest tests/deepsparse/v2/unit/
pytest tests/deepsparse/v2/integration_tests/
are green.