vitrivr / vitrivr-engine

vitrivr's next-generation retrieval engine. It is capable of extracting and retrieving a wide range of multimedia objects such as audio, video, images and 3D models.
https://vitrivr.org
MIT License

Redesign of Persistence & Pipeline Design #55

ppanopticon closed this 6 months ago

ppanopticon commented 7 months ago

This branch is used to track changes related to the new persistence model (see #52), in which Operators no longer persist information themselves. Instead, a specialized PersistingSink takes care of persistence.
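To illustrate the idea, here is a minimal sketch of the separation between extracting operators and a terminal persisting sink. All names (`Descriptor`, `PersistingSink`, the record fields) are hypothetical and chosen for illustration; they are not the engine's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical descriptor emitted downstream by an extraction operator. */
record Descriptor(String retrievableId, String field, float[] values) {}

/**
 * Sketch of the new model: operators only transform and emit descriptors,
 * while a dedicated sink at the end of the pipeline handles persistence.
 */
class PersistingSink implements Consumer<Descriptor> {
    final List<Descriptor> written = new ArrayList<>();

    @Override
    public void accept(Descriptor d) {
        // In the real engine this would delegate to a database writer;
        // here we simply record the descriptor to illustrate the flow.
        written.add(d);
    }
}
```

The point of the design is that extraction logic stays free of storage concerns; swapping the storage backend only means swapping the sink.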

Changes required for adjustments to the general design:

ppanopticon commented 7 months ago

@lucaro and @sauterl: In light of our planned get-together I invited you as reviewers so that we can discuss the design changes.

ppanopticon commented 6 months ago

Since the two issues of persistence and pipeline design cannot be disentangled completely, this PR now addresses both (#51 & #52). The following changes have been made to how extraction pipelines can be generated:

ppanopticon commented 6 months ago
{
  "schema": "V3C1",
  "context": {
    "contentFactory": "InMemoryContentFactory",
    "resolverName": "disk",
    "local": {
      "enumerator": {
        "path": "/Volumes/V3C1/V3C1/videos",
        "depth": "1"
      },
      "thumbs": {
        "path": "/Users/rgasser/Downloads/vitrivr-engine/images",
        "maxSideResolution": "350",
        "mimeType": "JPG"
      },
      "filter": {
        "type": "SOURCE:VIDEO" 
      }
    }
  },
  "operators": {
    "enumerator": { "type": "ENUMERATOR", "factory": "FileSystemEnumerator", "mediaTypes": ["VIDEO"]},
    "decoder": { "type": "DECODER", "factory": "VideoDecoder"  },
    "selector": { "type": "TRANSFORMER", "factory": "LastContentAggregator" },
    "avgColor": { "type": "EXTRACTOR", "fieldName": "averagecolor"},
    "file_metadata": { "type": "EXTRACTOR", "fieldName": "file" },
    "time_metadata": { "type": "EXTRACTOR", "fieldName": "time" },
    "video_metadata": { "type": "EXTRACTOR", "fieldName": "video" },
    "thumbs": { "type": "EXPORTER", "exporterName": "thumbnail" },
    "filter": { "type": "TRANSFORMER", "factory": "TypeFilterTransformer"}
  },
  "operations": {
    "enumerator": { "operator": "enumerator" },
    "decoder": { "operator": "decoder", "inputs": [ "enumerator" ] },
    "selector": { "operator": "selector", "inputs": [ "decoder" ] },
    "averagecolor": { "operator": "avgColor","inputs": ["selector"]},
    "thumbnails": {  "operator": "thumbs", "inputs": ["selector"] },
    "time_metadata": {  "operator": "time_metadata", "inputs": ["selector"] },
    "filter": {  "operator": "filter", "inputs": ["averagecolor", "thumbnails", "time_metadata"], "merge": "COMBINE" },
    "video_metadata": {  "operator": "video_metadata", "inputs": ["filter"] },
    "file_metadata": {  "operator": "file_metadata", "inputs": ["video_metadata"] }
  },
  "output": ["file_metadata"]
}
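The `operations` map in the configuration above wires the named operators into a directed acyclic graph: each entry lists the operations it consumes as `inputs`. A valid execution order can be derived with a topological sort. The following sketch (class and method names are hypothetical, not the engine's actual API) uses Kahn's algorithm:

```java
import java.util.*;

/** Sketch: the "operations" map defines a DAG; each entry names its inputs. */
class OperationGraph {
    final Map<String, List<String>> inputs = new LinkedHashMap<>();

    void add(String name, String... ins) { inputs.put(name, List.of(ins)); }

    /** Returns a valid execution order via Kahn's algorithm. */
    List<String> order() {
        Map<String, Integer> indeg = new HashMap<>();
        Map<String, List<String>> out = new HashMap<>();
        for (var e : inputs.entrySet()) {
            indeg.putIfAbsent(e.getKey(), 0);
            for (String in : e.getValue()) {
                indeg.merge(e.getKey(), 1, Integer::sum);
                out.computeIfAbsent(in, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (var e : indeg.entrySet())
            if (e.getValue() == 0) ready.add(e.getKey());
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String n = ready.poll();
            result.add(n);
            for (String m : out.getOrDefault(n, List.of()))
                if (indeg.merge(m, -1, Integer::sum) == 0) ready.add(m);
        }
        return result; // shorter than inputs.size() if the graph has a cycle
    }
}
```

Fed with the operations from the example configuration, the sort places `enumerator` first and `file_metadata` last, matching the declared `output`.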

To get a feeling for the main features:

This basic example works on my machine. It iterates a folder of videos, decodes them with a 500ms multiplex window, selects the last content element for each emitted element, extracts some features, filters for the "source" element (representing the file), extracts some file-related features and then emits the result to the sink (where it is persisted).
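The `filter` operation above declares `"merge": "COMBINE"` over its three input branches. A plausible reading of that semantics (this is an assumption for illustration; class and method names are hypothetical) is that the downstream operator only sees an item once every input branch has delivered its result for it:

```java
import java.util.*;

/**
 * Sketch of a COMBINE-style merge (assumed semantics): an item is only
 * emitted downstream once all input branches have delivered it.
 */
class CombineMerge {
    final int branches;
    final Map<String, Integer> seen = new HashMap<>();
    final List<String> emitted = new ArrayList<>();

    CombineMerge(int branches) { this.branches = branches; }

    /** Called when one branch finishes processing the item with the given id. */
    void deliver(String itemId) {
        if (seen.merge(itemId, 1, Integer::sum) == branches) emitted.add(itemId);
    }
}
```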

All generated Descriptors and Relationships are kept in memory until the persistence operation concludes.
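In other words, the sink acts as a buffer that accumulates everything and releases it in one step. A minimal sketch of that behaviour (names hypothetical, storage reduced to strings for brevity):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical in-memory buffer: descriptors and relationships accumulate
 * and are only released once a single flush persists them all.
 */
class PersistenceBuffer {
    final List<String> descriptors = new ArrayList<>();
    final List<String> relationships = new ArrayList<>();
    int persisted = 0;

    void stage(String descriptor) { descriptors.add(descriptor); }
    void relate(String relationship) { relationships.add(relationship); }

    /** Persist everything at once; buffers are cleared only afterwards. */
    void flush() {
        persisted += descriptors.size() + relationships.size();
        descriptors.clear();
        relationships.clear();
    }
}
```

The trade-off is the one the comment implies: memory usage grows with the amount of extracted data until the flush completes.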