vitrivr / vitrivr-engine

vitrivr's next-generation retrieval engine. It is capable of extracting and retrieving a wider range of multimedia objects such as audio, video, images or 3D models.
https://vitrivr.org
MIT License

Extending Content Author to support more finegrained tagging #114

Closed faberf closed 1 month ago

faberf commented 2 months ago

Using the DescriptorAsContentTransformer, descriptors such as FileMetadata can be fed back into the pipeline as content, for instance to create a prompt for later captioning. So far, complex struct descriptors have been ignored, since they do not map easily onto content. To support the use case of including EXIF metadata in prompts for image captioning, I have made the following main contribution:

For instance, if the name of the operator is "exif_content_transformer" and the names of the subfields are "location" and "date" then the transformer will transform a descriptor into two text content elements, tag them both with "exif_content_transformer", tag the location content with "exif_content_transformer.location" and tag the date content with "exif_content_transformer.date".
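The tagging scheme in the example above can be sketched roughly as follows. This is only an illustration in Python, not the engine's Kotlin code: `TextContent`, `struct_to_tagged_content` and `select_by_tag` are hypothetical names, and the field values are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TextContent:
    """Hypothetical stand-in for a text content element carrying a set of tags."""
    text: str
    tags: frozenset


def struct_to_tagged_content(operator_name: str, descriptor: dict) -> list:
    """Turn each subfield of a struct descriptor into one tagged text content element.

    Every element gets the operator name as a coarse tag and
    "<operator>.<subfield>" as a fine-grained tag.
    """
    contents = []
    for field_name, value in descriptor.items():
        tags = frozenset({operator_name, f"{operator_name}.{field_name}"})
        contents.append(TextContent(text=str(value), tags=tags))
    return contents


def select_by_tag(contents: list, tag: str) -> list:
    """Return the content elements that carry the given tag."""
    return [c for c in contents if tag in c.tags]


# Placeholder descriptor with the two subfields named in the example.
contents = struct_to_tagged_content(
    "exif_content_transformer",
    {"location": "placeholder location", "date": "placeholder date"},
)
all_exif = select_by_tag(contents, "exif_content_transformer")
only_location = select_by_tag(contents, "exif_content_transformer.location")
```

Selecting by the operator name returns both elements, while selecting by a namespaced tag returns only the element for that subfield, which is what a later prompt-building step would rely on.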

We should discuss whether this is the intended behaviour of our tagging system. If so, let us maybe rename ContentAuthorAttribute to ContentTagAttribute and change CONTENT_AUTHORS_KEY from contentSources (which was inconsistent anyway) to tagWhiteList or something similar. We might also want to make this kind of "operator.tag" namespace approach more rigorous.

Additionally:

faberf commented 1 month ago

@lucaro Do you have any feedback on this?

lucaro commented 1 month ago

I have not gotten around to looking at it yet, will do so as soon as I'm able.

faberf commented 1 month ago
[image]

Can somebody help me out with this test? Why am I getting failed tests on functionality I didn't touch?

faberf commented 1 month ago

Ah ok upon rerunning the tests they passed... spooky