Florence2 is a vision-language model designed for a range of tasks, including region description (roughly, object detection combined with captioning for a given part of an image). As an alternative to the YOLO methods currently implemented in this repository, we will provide a Florence2 model fine-tuned on one of our available datasets, along with functions that make it easier for users to fine-tune the model on similar YOLO-format datasets. To read more about what Florence2 can do, see the following links:
Blog
Paper
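The fine-tuning helpers mentioned above would need to turn YOLO labels into the text targets Florence2 is trained on. The following is a minimal sketch of that conversion, not this repository's implementation. It assumes the usual YOLO label line (`class_id cx cy w h`, all normalized to 0..1) and Florence2's detection-style target of a label followed by four `<loc_N>` tokens over 1000 coordinate bins. The function name `yolo_line_to_florence`, the `class_names` list, and the nearest-bin rounding are illustrative assumptions; the official processor's quantizer may round differently.

```python
def yolo_line_to_florence(line, class_names, bins=1000):
    """Convert one YOLO label line into a Florence2-style target string.

    Hypothetical helper for illustration. Assumes `line` is
    "class_id cx cy w h" with coordinates normalized to [0, 1].
    """
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])

    # YOLO stores the box center and size; convert to corner coordinates.
    x1, y1 = cx - w / 2, cy - h / 2
    x2, y2 = cx + w / 2, cy + h / 2

    def quantize(v):
        # Clamp to the image, then map to a bin index in [0, bins - 1].
        # Nearest-bin rounding is an assumption made for this sketch.
        v = min(max(v, 0.0), 1.0)
        return min(round(v * bins), bins - 1)

    locs = "".join(f"<loc_{quantize(v)}>" for v in (x1, y1, x2, y2))
    return f"{class_names[class_id]}{locs}"


if __name__ == "__main__":
    print(yolo_line_to_florence("0 0.5 0.5 0.2 0.4", ["car"]))
```

Because the coordinates are already normalized, the conversion does not need the image size; a dataset loader could call this once per label line and join the results into a single target string per image.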