vitrivr's next-generation retrieval engine. It is capable of extracting and retrieving a wide range of multimedia objects, such as audio, video, images, or 3D models.
In #94 we discussed features that should be ported over from Cineast. One of these features was the SkeletonPose feature that can be used to estimate poses in images and use them for retrieval.
However, the current implementation has several shortcomings:
It relies on TensorFlow models that must be bundled with the binary.
It relies on an arcane distance function that cannot currently be supported by vitrivr-engine's query execution engine, let alone by some of the storage engines.
The goal of this issue is therefore to replace the SkeletonPose feature, with the following objectives:
Use state-of-the-art techniques to estimate and encode skeleton poses in a vector such that this vector can be used for similarity search.
Assuming that such a technique relies on machine learning models, integrate these into the feature extraction server (FES) or TorchServe (whichever is more suitable).
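To illustrate what "encode skeleton poses in a vector" could mean for similarity search, here is a minimal sketch. It is not the Cineast implementation and not a proposal for the final technique: it assumes poses arrive as a fixed-order list of 2D keypoints from some upstream pose estimator, and it uses a simple hand-crafted normalisation (centre on the centroid, scale to unit norm) rather than a learned embedding. The names `encode_pose` and `euclidean` are hypothetical.

```python
import math


def encode_pose(keypoints):
    """Flatten (x, y) keypoints into a translation- and scale-normalised vector.

    Assumption: keypoints is a non-empty list of (x, y) pairs in a fixed joint order.
    """
    n = len(keypoints)
    # Centre the pose on its centroid so that position in the image does not matter.
    cx = sum(x for x, _ in keypoints) / n
    cy = sum(y for _, y in keypoints) / n
    centred = [(x - cx, y - cy) for x, y in keypoints]
    # Scale to unit norm so that the size of the person does not matter.
    # Fall back to 1.0 for a degenerate pose where all points coincide.
    scale = math.sqrt(sum(x * x + y * y for x, y in centred)) or 1.0
    return [c / scale for point in centred for c in point]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


if __name__ == "__main__":
    pose = [(0, 0), (2, 0), (2, 2), (0, 2)]
    # Same pose, shifted and enlarged: should be (almost) identical after encoding.
    moved = [(10 + 3 * x, 20 + 3 * y) for x, y in pose]
    other = [(0, 0), (4, 0), (4, 1), (0, 1)]
    print(euclidean(encode_pose(pose), encode_pose(moved)))
    print(euclidean(encode_pose(pose), encode_pose(other)))
```

The point of this design is that the resulting vectors can be compared with a standard distance such as Euclidean, which avoids the need for a custom distance function in the query execution engine or the storage layer.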
Dependencies
In light of #19, we ought to check whether this should be part of the FES infrastructure or TorchServe.
Boundary Conditions
None