Summary

Remove the check in text_generation/pipeline.py that disables continuous batching when the internal kv cache is enabled
Update NLEngineOperator to use the engine passed in as an input, if one is available, when using the internal kv cache. With continuous batching, the engine passed in holds the model compiled for the updated batch size (this holds for continuous batching in general, not only with the internal kv cache)
Update the step in continuous_batching_scheduler where engines are created so that it handles the NLEngineOperator specifically
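
The engine-selection behavior described above can be sketched roughly as follows. This is a minimal illustration only, not the code in this PR: `SketchEngine`, `SketchEngineOperator`, `select_engine`, and `create_engines_for_scheduler` are hypothetical names standing in for the real engine, NLEngineOperator, and scheduler logic, and the batch sizes are made up.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class SketchEngine:
    """Hypothetical stand-in for a compiled engine; it only records its settings."""

    batch_size: int
    internal_kv_cache: bool = True


class SketchEngineOperator:
    """Hypothetical stand-in for NLEngineOperator (not the real class)."""

    def __init__(self, internal_kv_cache: bool = True):
        self.internal_kv_cache = internal_kv_cache
        # Engine the operator builds for itself, at batch size 1.
        self.default_engine = SketchEngine(
            batch_size=1, internal_kv_cache=internal_kv_cache
        )

    def select_engine(self, provided: Optional[SketchEngine] = None) -> SketchEngine:
        # Prefer the engine supplied as an input (e.g. by the scheduler),
        # since it carries the updated batch size; otherwise fall back to
        # the operator's own engine.
        if provided is not None:
            return provided
        return self.default_engine


def create_engines_for_scheduler(
    operator: SketchEngineOperator, batch_sizes: Iterable[int]
) -> Dict[int, SketchEngine]:
    # Assumed scheduler-side step: build one engine per batch size,
    # carrying over the operator's kv cache setting.
    return {
        bs: SketchEngine(batch_size=bs, internal_kv_cache=operator.internal_kv_cache)
        for bs in batch_sizes
    }
```

In this sketch, calling `select_engine()` with no argument returns the operator's own batch-size-1 engine, while passing one of the scheduler-created engines makes the operator run with that engine's batch size.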
