Open elv-haoyu opened 1 year ago
For this, I'm tempted to go back to the old solution using select=text. Otherwise we will need to store a separate start/end field for EACH ml tag, which I think is messy.
If we need to optimize we can publish the ml tags as metadata instead of a file. The fabric metadata retrieval is much more efficient than loading the json files.
There might be another solution that covers both these bases and doesn't require changing the tag to metadata, but I would need to think about it. Instinctively I'm gravitating towards using the select.
The plan is to add new "json" fields for each of the ml tracks with suffix "_tag" e.g. "f_object_tag". These fields will not be searchable but will contain the start/end time and the associated text. Like so,
"fields":{
...
"f_object_tag":[{"end_time":1630464,"start_time":1630464,"text":["Dress"]},{"end_time":1633050,"start_time":1633050,"text":["Human face"]},{"end_time":1633050,"start_time":1633050,"text":["Suit"]},{"end_time":1633050,"start_time":1633050,"text":["Chair"]},{"end_time":1644061,"start_time":1634092,"text":["Suit"]},{"end_time":1635052,"start_time":1635052,"text":["Dress"]}]
...
}
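To illustrate how a client such as the clip search side panel might consume one of these "_tag" fields, here is a minimal sketch. It assumes the field arrives as a list of objects shaped like the example above; the helper name tags_in_range and the clip window values are hypothetical and not part of the search API, and the thread does not state the time units.

```python
# Sample entries copied from the "f_object_tag" example above.
fields = {
    "f_object_tag": [
        {"end_time": 1630464, "start_time": 1630464, "text": ["Dress"]},
        {"end_time": 1633050, "start_time": 1633050, "text": ["Human face"]},
        {"end_time": 1644061, "start_time": 1634092, "text": ["Suit"]},
        {"end_time": 1635052, "start_time": 1635052, "text": ["Dress"]},
    ]
}


def tags_in_range(fields, field_name, clip_start, clip_end):
    """Return (start_time, end_time, text) for every tag overlapping the clip window.

    Hypothetical helper: the name and signature are illustrative only.
    """
    hits = []
    for tag in fields.get(field_name, []):
        # Two intervals overlap when each one starts before the other ends.
        if tag["start_time"] <= clip_end and tag["end_time"] >= clip_start:
            for text in tag["text"]:
                hits.append((tag["start_time"], tag["end_time"], text))
    return hits


matches = tags_in_range(fields, "f_object_tag", 1634000, 1636000)
```

Because each entry carries its own start/end time next to the text, no separate per-tag start/end field is needed.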
More generally, the paths specified in the crawl config no longer need to terminate at a string/int/float value if the field is indexed as json. These _tag fields will be obtained by setting the index mode to "json" and removing the ".text" at the end of the crawl path so that the start/end times get picked up, e.g.
site_map.searchables.*.video_tags.metadata_tags.*.metadata_tags.shot_tags.tags.text.Object Detection.text
-> site_map.searchables.*.video_tags.metadata_tags.*.metadata_tags.shot_tags.tags.text.Object Detection
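The effect of dropping the trailing ".text" can be sketched with a toy path resolver. This is not the crawler's actual implementation: resolve_path, the shortened paths, and the sample document are assumptions made up for illustration, and the real crawler may treat lists and wildcards differently.

```python
def resolve_path(doc, path):
    """Resolve a dot-separated path with "*" wildcards against nested dicts.

    Hypothetical sketch: returns whatever sits at the end of the path,
    whether that is a scalar or a whole subtree.
    """
    nodes = [doc]
    for segment in path.split("."):
        matched = []
        for node in nodes:
            # Lists are traversed transparently: the segment applies to each element.
            candidates = node if isinstance(node, list) else [node]
            for item in candidates:
                if not isinstance(item, dict):
                    continue
                if segment == "*":
                    matched.extend(item.values())
                elif segment in item:
                    matched.append(item[segment])
        nodes = matched
    return nodes


# Made-up document with a shortened version of the path from the thread.
doc = {
    "shot_tags": {
        "tags": {
            "text": {
                "Object Detection": [
                    {"start_time": 1630464, "end_time": 1630464, "text": ["Dress"]}
                ]
            }
        }
    }
}

# Path ending in ".text": only the tag text survives.
text_only = resolve_path(doc, "shot_tags.tags.text.Object Detection.text")

# Path without the trailing ".text": the whole tag objects are kept,
# including start_time and end_time, ready to be indexed as json.
full_tags = resolve_path(doc, "shot_tags.tags.text.Object Detection")
```

In this sketch the only difference between the two calls is the final path segment, which mirrors the config change described above.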
Team: We really need to add the timestamp where the tag is found to the advanced side panel in clip search