[X] I have searched the Inference issues and found no similar feature requests.
Question
Some LLMs/LMMs can produce output in a specific format - for instance, bounding box detections. We currently have the LMM block and the LMMForClassification block, built solely to produce the structured output required for compatibility with other blocks. This approach does not scale; we should decide what to do instead.
Initial idea:
- LLM/LMM blocks with predefined/configurable prompts that cause output in a specific format to be produced
- conversion blocks that take the text output and convert it into specific data types (example: sv.Detections)
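A minimal sketch of what such a conversion block could do, assuming a hypothetical JSON schema that a prompt would instruct the LLM/LMM to emit (the schema, field names, and function are illustrative, not an existing Inference API). It parses the raw text into parallel `xyxy` / `confidence` / `class_id` lists, which is exactly the shape needed to build an `sv.Detections` downstream:

```python
import json


def parse_llm_detections(raw_text: str, classes: list[str]):
    """Convert raw LLM/LMM text output into detection arrays.

    Assumed (hypothetical) payload schema:
        {"detections": [{"box": [x_min, y_min, x_max, y_max],
                         "class": "cat", "confidence": 0.9}, ...]}

    Returns (xyxy, confidence, class_id) lists, suitable for e.g.
    sv.Detections(xyxy=np.array(xyxy), confidence=np.array(confidence),
                  class_id=np.array(class_id)).
    """
    payload = json.loads(raw_text)
    xyxy, confidence, class_id = [], [], []
    for det in payload.get("detections", []):
        cls = det.get("class")
        if cls not in classes:
            continue  # drop classes the prompt did not ask for (hallucinations)
        box = det.get("box")
        if not (isinstance(box, list) and len(box) == 4):
            continue  # skip malformed boxes instead of failing the whole block
        xyxy.append([float(v) for v in box])
        confidence.append(float(det.get("confidence", 1.0)))
        class_id.append(classes.index(cls))
    return xyxy, confidence, class_id


raw = json.dumps({"detections": [
    {"box": [0, 0, 10, 10], "class": "cat", "confidence": 0.9},
    {"box": [1, 2, 3], "class": "cat"},                # malformed box
    {"box": [0, 0, 5, 5], "class": "unicorn"},         # class not requested
]})
boxes, scores, ids = parse_llm_detections(raw, classes=["cat", "dog"])
# boxes == [[0.0, 0.0, 10.0, 10.0]], scores == [0.9], ids == [0]
```

The key design point is that the conversion block, not the LLM block, owns validation: malformed or hallucinated entries are dropped so downstream blocks always receive well-formed detections.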