google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

legacy vs. new task-based/v0.10.0 hands models and pipeline #4488

Open matanox opened 1 year ago

matanox commented 1 year ago

Description of issue (what needs changing)

No response

Clear description

The release notes need elaboration regarding the hands pipeline.

Correct links

No response

Parameters defined

No response

Returns defined

No response

Raises listed and defined

No response

Usage example

No response

Request visuals, if applicable

No response

Submit a pull request?

No response

matanox commented 1 year ago

The release notes mention that the hand models have been updated. It would be nice to say what has changed, since the model card looks much the same as the one published for the versions of this pipeline preceding version 0.10.0 and the new tasks-oriented API. The model card is still dated 2 March 2020, so are the models included in this pipeline actually the same ones as in earlier versions?

As a specific point, the statement there that world landmarks now use world coordinates also needs some explanation, as this kind of detail has been the topic of long threads in the past. At first glance it seems this property already existed prior to version 0.10.0 and may just have been carried over into the latest release notes by mistake (?).

There is also no longer a choice between the light and the full model in the new v0.10.0 predictor instantiation API. I gather that the "task" file, which needs to be downloaded separately from the pip install, contains just one model, but I can't be sure, as its content seems opaque.
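One way to look inside the task file, as a minimal sketch: this assumes the bundle is an ordinary ZIP archive, which I have not seen stated in the release notes. The helper name `list_task_bundle` is my own, and the function returns `None` rather than failing if the assumption does not hold.

```python
import zipfile


def list_task_bundle(path):
    """Return the member names of a .task bundle, or None if it is not a ZIP.

    Assumption: the bundle is a plain ZIP archive. If it is not,
    zipfile.is_zipfile() reports False and nothing is opened.
    """
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as bundle:
        return bundle.namelist()
```

Calling `list_task_bundle("hand_landmarker.task")` on the downloaded file (the file name here is a placeholder for wherever it was saved) would show how many model files the bundle actually carries.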

Please also consider describing how the detection, presence and tracking confidences are applied in the overall process, and how they relate to the "legacy" versions' smaller set of only two threshold arguments. Is the new API just exposing an extra threshold which was hard-wired to some value in the older versions? How do the detection and presence thresholds actually interrelate?
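For concreteness, the arguments being compared are sketched below. The parameter names are as I understand them from the legacy `mp.solutions.hands.Hands` constructor and the new `HandLandmarkerOptions`. The 0.5 values and the `hand_landmarker.task` path are placeholders, and the helper function names are mine, not part of either API.

```python
# Constructor arguments of the two APIs, as I understand them.
# The 0.5 values are illustrative placeholders, not tuned settings.
LEGACY_KWARGS = {
    "model_complexity": 1,  # legacy choice between the light (0) and full (1) model
    "min_detection_confidence": 0.5,
    "min_tracking_confidence": 0.5,
}

TASKS_KWARGS = {
    "min_hand_detection_confidence": 0.5,
    "min_hand_presence_confidence": 0.5,  # no counterpart among the legacy arguments
    "min_tracking_confidence": 0.5,
}


def make_legacy_hands():
    """Build the legacy solution object (requires mediapipe to be installed)."""
    import mediapipe as mp

    return mp.solutions.hands.Hands(**LEGACY_KWARGS)


def make_tasks_landmarker(model_path="hand_landmarker.task"):
    """Build the tasks-API landmarker (requires mediapipe and the task file)."""
    from mediapipe.tasks import python
    from mediapipe.tasks.python import vision

    options = vision.HandLandmarkerOptions(
        base_options=python.BaseOptions(model_asset_path=model_path),
        **TASKS_KWARGS,
    )
    return vision.HandLandmarker.create_from_options(options)


def threshold_names(kwargs):
    """Return the sorted names of the confidence-threshold arguments."""
    return sorted(name for name in kwargs if name.endswith("_confidence"))
```

Here `threshold_names(LEGACY_KWARGS)` yields two names and `threshold_names(TASKS_KWARGS)` yields three, which is the difference the questions above are about. The tasks side also has no `model_complexity` equivalent in this sketch.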

I also notice that the legacy Python API, which preceded the tasks-oriented API, still works when installing and importing this release.

Thanks for adding the new asynchronous callback API option.


Some of the questions here may actually relate to the introduction of the tasks API over the last few releases rather than to version 0.10.0 per se. Even so, it would be very nice to have a clear answer on the functional upgrades to the hands pipeline across these last few releases.