A common use case with multimodal models is having extended text captions to go with images. For example, the dataset may be curated by scraping a Google Images search and storing (caption, title of page) or (caption, first sentence of page) pairs.
FiftyOne could naturally support these dataset types by providing a Caption(Label) field that stores an extended text string. The reason for the dedicated Label subclass would be to indicate to the App that such labels should be rendered as longer text descriptions on the image, not in the label "chin" below the image.
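A minimal sketch of what this could look like, built on FiftyOne's existing Label and StringField machinery; the Caption class and its text field are hypothetical and do not exist in FiftyOne today:

```python
import fiftyone as fo
import fiftyone.core.fields as fof
import fiftyone.core.labels as fol


class Caption(fol.Label):
    """Hypothetical label type storing a free-form text caption.

    The ``Caption`` name and ``text`` field are assumptions for this
    proposal, not part of the current FiftyOne API.
    """

    text = fof.StringField()


# Example usage, assuming the class above existed
sample = fo.Sample(filepath="/path/to/image.jpg")
sample["caption"] = Caption(text="First sentence of the page the image was scraped from.")
```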
For discussion.