Create Florence 2 model using the baseball_rubber_home_glove dataset.

Florence2 is a vision-language model designed for a range of tasks, including region description (essentially object detection plus captioning for a given part of an image). As an alternative to the YOLO methods currently implemented in the repository, a Florence2 model fine-tuned on one of our available datasets will be added, along with functions that make it easier for users to fine-tune the model on similar YOLO datasets. To read more about the possible uses of Florence2, please consult the following links:

Blog

Paper
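
To make the "fine-tune on similar YOLO datasets" idea concrete, here is a minimal sketch of how one YOLO label line could be turned into a Florence2-style detection target. It assumes the target text is a class name followed by four `<loc_N>` tokens (x1, y1, x2, y2), with coordinates floor-quantized into 1000 bins; this should be checked against the processor that is actually used. The function names and the `CLASS_NAMES` list are hypothetical, not taken from the repository or the dataset.

```python
# Hypothetical class list; the real class order of the dataset may differ.
CLASS_NAMES = ["glove", "homeplate", "baseball", "rubber"]


def quantize(value: float, bins: int = 1000) -> int:
    """Map a normalized coordinate in [0, 1] to an integer bin in [0, bins - 1]."""
    value = min(max(value, 0.0), 1.0)      # clamp boxes that spill past the image edge
    return min(int(value * bins), bins - 1)


def yolo_line_to_florence(line: str, class_names: list[str]) -> str:
    """Convert one YOLO label line 'class cx cy w h' (normalized) into
    'name<loc_x1><loc_y1><loc_x2><loc_y2>' (assumed Florence2 target format)."""
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:5])
    # YOLO stores the box center and size; convert to corner coordinates.
    corners = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    tokens = "".join(f"<loc_{quantize(c)}>" for c in corners)
    return f"{class_names[class_id]}{tokens}"


def yolo_label_to_florence(label_text: str, class_names: list[str]) -> str:
    """Concatenate the targets for every non-empty line of a YOLO label file."""
    lines = [ln for ln in label_text.splitlines() if ln.strip()]
    return "".join(yolo_line_to_florence(ln, class_names) for ln in lines)


example = yolo_line_to_florence("0 0.5 0.5 0.25 0.5", CLASS_NAMES)
print(example)  # glove<loc_375><loc_250><loc_625><loc_750>
```

Each image would then be paired with a task prompt and the concatenated target string for all of its boxes. Keeping the conversion as pure string handling means it can be unit-tested without loading the model.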

dylandru / BaseballCV #16