Writing the step function for diffusion engine

Keeping it similar to vLLM (at least for now), the step function should basically follow this flow:

Start Step
    |
Check Parallel Config
    |
Retrieve Cached Outputs
    |
Clear Outputs
    |
Check Remaining Steps
    |
    +-----------------------------+
    |                             |
   Yes                            No
    |                             |
Schedule Next Iteration       Process Final Outputs
    |                             |
Process Outputs               Return Outputs
    |
Check for Scheduled Outputs
    |
    +-----------------------------+
    |                             |
   Yes                            No
    |                             |
Construct Execute Model       Process Final Outputs
Request
    |
Execute Model
    |
Update Cached Outputs
    |
Append Outputs
    |
Process Final Outputs
    |
Return Outputs

I'll probably change a lot of this as I figure out what fits a diffusion pipeline; for now, this is the flow I'm following, at least for the step() function.
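
A rough sketch of how the flow above could map to code. Everything in it (`DiffusionEngine`, `Request`, `ExecuteModelRequest`, `StepOutput`, the `execute_model` callback, `max_batch_size`) is a hypothetical stand-in, not vLLM's API and not the final design. "Remaining steps" is assumed to mean denoising steps left on any queued request, and the latent is a plain float instead of a tensor so the sketch runs without dependencies.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Request:
    request_id: str
    remaining_steps: int   # denoising steps still to run (assumed meaning)
    latent: float = 0.0    # stand-in for the latent tensor


@dataclass
class ExecuteModelRequest:
    batch: List[Request]


@dataclass
class StepOutput:
    request_id: str
    latent: float
    finished: bool


class DiffusionEngine:
    """Hypothetical engine; only step() mirrors the flowchart."""

    def __init__(self, execute_model: Callable[[ExecuteModelRequest], List[float]],
                 max_batch_size: int = 2, pipeline_parallel_size: int = 1):
        self.execute_model = execute_model
        self.max_batch_size = max_batch_size
        self.pipeline_parallel_size = pipeline_parallel_size
        self.requests: List[Request] = []
        self.cached_outputs: List[StepOutput] = []
        self.outputs: List[StepOutput] = []

    def add_request(self, request_id: str, num_steps: int) -> None:
        self.requests.append(Request(request_id, num_steps))

    def _process_final_outputs(self) -> List[StepOutput]:
        # Placeholder for post-processing such as decoding finished latents.
        return list(self.outputs)

    def step(self) -> List[StepOutput]:
        # Check Parallel Config
        if self.pipeline_parallel_size > 1:
            raise NotImplementedError("sketch covers the single-stage case only")

        cached = self.cached_outputs   # Retrieve Cached Outputs
        self.outputs = []              # Clear Outputs

        # Check Remaining Steps
        if not any(r.remaining_steps > 0 for r in self.requests):
            return self._process_final_outputs()

        # Schedule Next Iteration
        runnable = [r for r in self.requests if r.remaining_steps > 0]
        scheduled = runnable[: self.max_batch_size]

        # Process Outputs: retire requests the previous step finished
        finished_ids = {o.request_id for o in cached if o.finished}
        self.requests = [r for r in self.requests if r.request_id not in finished_ids]

        # Check for Scheduled Outputs (empty here only if max_batch_size is 0;
        # a real scheduler could also decline for resource reasons)
        if not scheduled:
            return self._process_final_outputs()

        # Construct Execute Model Request, then Execute Model
        new_latents = self.execute_model(ExecuteModelRequest(batch=scheduled))

        step_outputs = []
        for req, latent in zip(scheduled, new_latents):
            req.latent = latent
            req.remaining_steps -= 1
            step_outputs.append(
                StepOutput(req.request_id, latent, req.remaining_steps == 0))

        self.cached_outputs = step_outputs   # Update Cached Outputs
        self.outputs.extend(step_outputs)    # Append Outputs
        return self._process_final_outputs()


# Usage: a dummy model that adds 1.0 to each latent per denoising step.
engine = DiffusionEngine(execute_model=lambda req: [r.latent + 1.0 for r in req.batch])
engine.add_request("a", num_steps=2)
engine.add_request("b", num_steps=1)
first = engine.step()    # runs one step for both "a" and "b"
second = engine.step()   # only "a" still has steps left
third = engine.step()    # nothing left, so the early-return branch is taken
```

Each call to `step()` advances every scheduled request by one denoising step, so the caller loops until the outputs it cares about are marked `finished`. Batching one step at a time is just the simplest thing that matches the diagram; running several denoising steps per call would be a different design.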

