eluv-io / elv-clip-search

Empower content owners to search and generate clips based on machine learning tags. Collect user feedback on the quality of the search results and ML tags.
https://core.v3.contentfabric.io/#/apps/Clip%20Search

Remove select #61

Open elv-nickB opened 1 year ago

elv-nickB commented 1 year ago

Our use cases don't depend on the select option, so we're better off using display_fields.

  1. Replace select=text with display_fields=all (or set display_fields to the same value as search_fields to get just the tags)...
  2. The fields will then be in sources.fields instead of sources.document.text.
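A minimal sketch of the two steps above in Python (the query assembly is a placeholder, not the real search endpoint; the response shapes are taken from the examples later in this thread):

```python
from urllib.parse import urlencode

# Step 1: swap the query parameter (search terms and endpoint are placeholders).
old_query = urlencode({"terms": "shoe", "select": "text"})
new_query = urlencode({"terms": "shoe", "display_fields": "all"})

# Step 2: read the values from their new location in each result source.
def indexed_fields(source: dict) -> dict:
    # display_fields=all puts the indexed values directly under "fields" ...
    return source["fields"]

def selected_text(source: dict) -> dict:
    # ... whereas select=text nested them under "document" -> "text".
    return source["document"]["text"]
```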
elv-serban commented 1 year ago

We use 'select' in the content fabric APIs (e.g. GET .../meta/public/asset_metadata?select=title&select=title_type). I think that's an intuitive keyword for APIs, though I agree that in this case it's not selecting but rather 'including' some extras, in which case maybe include is a better option. In any case, both select and include are commonly used conventions.

elv-nickB commented 1 year ago

We'll still have it as an option; display_fields is just a way to get back indexed fields directly. For the clip use cases we don't really need select like we do in Roar. For the ML use cases, select is a lot slower, I think because the metadata we're selecting comes from files instead of the content metadata.

elv-haoyu commented 1 year ago

We are not ready to change to display_fields=all immediately, because the tags QA component intrinsically requires timestamps. Using select=text, the clip info is returned as:

```json
"sources": [
  {
    "document": {
      "end_time": 7106224,
      "start_time": 7099384,
      "text": {
        "Action Detection": [],
        "Celebrity Detection": [],
        "Landmark Recognition": [],
        "Logo Detection": [],
        "Object Detection": [
          {
            "end_time": 7101136,
            "start_time": 7101094,
            "text": [
              "a woman's shoe sits on a rug with a pair of shoes on it."
            ]
          },
          {
            "end_time": 7104806,
            "start_time": 7104764,
            "text": [
              "a woman's shoe is on a rug ."
            ]
          }
        ],
        "Optical Character Recognition": [],
        "Segment Labels": [
          {
            "end_time": 7102080,
            "start_time": 7097080,
            "text": ["Shoe", "Sneakers"]
          },
          {
            "end_time": 7107080,
            "start_time": 7102080,
            "text": ["Shoe", "Sneakers"]
          }
        ],
        "Speech to Text": []
      }
    },
    "prefix": "/video_tags/metadata_tags/0011/metadata_tags/shot_tags/tags[33]",
    "fields": {
      "f_end_time": [7106224],
      "f_start_time": [7099384]
    }
  }
]
```

While with the current display_fields=all, we get:

```json
"sources": [
  {
    "prefix": "/video_tags/metadata_tags/0011/metadata_tags/shot_tags/tags[33]",
    "fields": {
      "f_asset_type": ["primary"],
      "f_asset_type_as_string": ["primary"],
      "f_display_title": ["Till"],
      "f_display_title_as_string": ["Till"],
      "f_end_time": [7106224],
      "f_object": [
        "a woman's shoe sits on a rug with a pair of shoes on it.",
        "a woman's shoe is on a rug ."
      ],
      "f_object_as_string": [
        "a woman's shoe sits on a rug with a pair of shoes on it.",
        "a woman's shoe is on a rug ."
      ],
      "f_segment": ["Shoe", "Sneakers"],
      "f_segment_as_string": ["Shoe", "Sneakers"],
      "f_start_time": [7099384],
      "f_title_type": ["feature"],
      "f_title_type_as_string": ["feature"]
    }
  }
]
```

in which the per-tag timestamps and the model track names are missing. We're waiting for display_fields=all (or another parameter) to be extended to return the timestamps.
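To make the gap concrete, here is a sketch of what the tags QA component relies on: each tag under sources[i].document.text carries its own start/end time and track name, which the flat fields shape above does not preserve (the response shape is taken from the select=text example in this thread; the helper name is hypothetical):

```python
def tags_with_timestamps(source: dict) -> list[dict]:
    """Flatten a select=text result source into per-tag records,
    keeping the track name and the tag-level timestamps."""
    out = []
    for track, tags in source["document"]["text"].items():
        for tag in tags:
            out.append({
                "track": track,                  # e.g. "Object Detection"
                "start_time": tag["start_time"], # per-tag, not per-clip
                "end_time": tag["end_time"],
                "text": tag["text"],
            })
    return out
```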