Open Blaizzy opened 5 months ago
Will take it for implementation! Hope to meet the standards :)
Here are some extra details: @willccbb
@willccbb done ✅
Hey @willccbb, any update on this? Would be super helpful to have
@willccbb doesn't have the bandwidth.
This feature is now open and back in the backlog.
Overview
The goal is to add efficient batch processing of inputs to the MLX-VLM library. This will allow users to process multiple images and text prompts simultaneously and generate the corresponding outputs in a single batch, improving throughput.
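As a rough illustration of what "processing in a single batch" involves: variable-length prompts must be padded to a common length before they can be stacked into one batch. The sketch below is a hypothetical helper, not MLX-VLM's actual API; the names `pad_batch` and `pad_id` are assumptions for illustration.

```python
# Hypothetical sketch of batched-input preparation; pad_batch and pad_id
# are illustrative names, not part of MLX-VLM's confirmed API.

def pad_batch(token_lists, pad_id=0):
    """Left-pad variable-length token sequences to a common length so they
    can be stacked into one batch; also return a mask marking real (1)
    versus padding (0) positions."""
    max_len = max(len(toks) for toks in token_lists)
    batch, mask = [], []
    for toks in token_lists:
        pad = [pad_id] * (max_len - len(toks))
        # Left padding keeps the final (most recent) token aligned across
        # rows, which simplifies autoregressive decoding.
        batch.append(pad + toks)
        mask.append([0] * len(pad) + [1] * len(toks))
    return batch, mask

prompts = [[5, 6], [7, 8, 9, 10]]
padded, mask = pad_batch(prompts)
# padded -> [[0, 0, 5, 6], [7, 8, 9, 10]]
# mask   -> [[0, 0, 1, 1], [1, 1, 1, 1]]
```

Left padding is shown here because it keeps the last prompt token in the same position for every row; a real implementation would also need to carry the mask through the model's attention computation.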
Use cases:
Note: Tag @Blaizzy for code reviews and questions.
Requirements
- Support batched inputs:
- Perform batch processing:
- Generate batched outputs:
- Error handling:
- API design:
- Documentation and examples:
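To make the API-design and error-handling requirements concrete, here is one possible shape for a batched entry point. The name `batch_generate`, its signature, and the single-item generator passed in are all assumptions for discussion, not the library's confirmed interface.

```python
# Illustrative sketch of one possible batch_generate API; the name and
# signature are assumptions, not MLX-VLM's actual interface.
from typing import Callable, List

def batch_generate(
    generate_one: Callable[[str, str], str],
    images: List[str],
    prompts: List[str],
) -> List[str]:
    """Validate paired inputs and return one output per (image, prompt).

    A real implementation would run the model forward pass once over the
    whole batch; the loop here is only a reference for the contract:
    outputs are positionally aligned with inputs, and mismatched input
    lengths raise an error instead of silently truncating.
    """
    if len(images) != len(prompts):
        raise ValueError("images and prompts must have the same length")
    return [generate_one(img, prompt) for img, prompt in zip(images, prompts)]

# Usage with a stand-in single-item generator:
fake_generate = lambda img, prompt: f"caption({img})"
outputs = batch_generate(fake_generate, ["a.jpg", "b.jpg"], ["Describe", "Describe"])
# outputs -> ["caption(a.jpg)", "caption(b.jpg)"]
```

Keeping outputs positionally aligned with inputs, and raising on mismatched lengths, are the two contract points the error-handling requirement would most likely pin down.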
Implementation
Testing
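A hedged sketch of the kind of tests the batch path could carry, covering the output-count and error-handling requirements above. The `batch_generate` stub below is a local stand-in so the tests are self-contained; it is not the library's actual function.

```python
# Stand-in for the batched entry point, so these test sketches run
# self-contained; not MLX-VLM's actual implementation.
def batch_generate(images, prompts):
    if len(images) != len(prompts):
        raise ValueError("images and prompts must have the same length")
    return [f"out:{prompt}" for prompt in prompts]

def test_output_count_matches_input_count():
    outputs = batch_generate(["a.jpg", "b.jpg"], ["p1", "p2"])
    assert len(outputs) == 2

def test_mismatched_lengths_raise():
    try:
        batch_generate(["a.jpg"], ["p1", "p2"])
    except ValueError:
        pass  # expected: mismatched inputs must be rejected
    else:
        raise AssertionError("expected ValueError")

test_output_count_matches_input_count()
test_mismatched_lengths_raise()
```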
Delivery
By implementing this batch processing feature, MLX-VLM will let users efficiently process multiple inputs simultaneously, improving the library's performance and usability for a range of vision-language tasks.