Open adelavega opened 5 years ago
Makes sense 👍, should be a semi-straightforward fix.
On this note, I was comparing the results of FRAME_MODE to using FrameSamplingFilter at 1 Hz and feeding the frames to VisionAPI. The results are about the same, except that VideoIntelligence returns more features overall (probably a different threshold).
So the main advantage is that VideoIntelligence can be much faster (if you feed it a video file in a manageable codec / size).
Actually, another advantage is that VideoIntelligence returns category entities for each tag. This could be really useful, as many tags are super specific, but we might want to analyze at a slightly broader level (e.g. furniture instead of chair). We don't seem to currently extract that information.
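To make the category idea concrete, here is a rough sketch of flattening one label annotation into per-frame rows that carry the broader category along. The `annotation` dict is hypothetical data, shaped like what I assume the JSON response looks like (`entity`, `categoryEntities`, `frames` with `timeOffset` strings); `rows_with_categories` is an illustrative helper, not something that exists in pliers.

```python
# Hypothetical annotation; the field names are an assumption about the
# JSON shape of a frame-level label annotation, and the values are made up.
annotation = {
    "entity": {"description": "chair"},
    "categoryEntities": [{"description": "furniture"}],
    "frames": [
        {"timeOffset": "0s", "confidence": 0.91},
        {"timeOffset": "1s", "confidence": 0.88},
    ],
}


def rows_with_categories(annotation):
    """Flatten one annotation into one row per frame, keeping its categories."""
    label = annotation["entity"]["description"]
    # A label may have zero or more category entities; join them into one field.
    categories = [c["description"] for c in annotation.get("categoryEntities", [])]
    category = ";".join(categories) if categories else None

    rows = []
    for frame in annotation.get("frames", []):
        rows.append({
            "label": label,
            "category": category,
            # Assumes offsets are strings such as "1s" or "1.500s".
            "onset": float(frame["timeOffset"].rstrip("s")),
            "confidence": frame["confidence"],
        })
    return rows


rows = rows_with_categories(annotation)
```

With rows like these, grouping by `category` instead of `label` would give the broader-level analysis.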
In FRAME_MODE, GoogleVideoIntelligence returns results sampled at 1 Hz (or so it seems; their docs don't say anything). However, pliers is attempting to add durations to these events based on the distances between offsets.
For example, for chair, the raw results look something like:
But the df looks like:
The durations should all be 1 in this case.
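As a minimal illustration of the difference (not the actual pliers code), the sketch below compares durations derived from gaps between offsets with a fixed duration per frame. The onsets are hypothetical, `SAMPLE_PERIOD` assumes the roughly 1 Hz sampling described above, and both function names are made up for this example.

```python
SAMPLE_PERIOD = 1.0  # assumption: FRAME_MODE samples at roughly 1 Hz


def gap_based_durations(onsets):
    """Illustration only: duration = distance to the next onset."""
    durations = [nxt - cur for cur, nxt in zip(onsets, onsets[1:])]
    if onsets:
        # The last event has no successor; fall back to the sample period.
        durations.append(SAMPLE_PERIOD)
    return durations


def fixed_durations(onsets, period=SAMPLE_PERIOD):
    """Proposed behaviour: every frame-level event lasts one sample period."""
    return [period for _ in onsets]


# Hypothetical onsets (seconds): the label is missing between 3 s and 6 s.
onsets = [0.0, 1.0, 2.0, 7.0, 8.0]

gap = gap_based_durations(onsets)    # [1.0, 1.0, 5.0, 1.0, 1.0]
fixed = fixed_durations(onsets)      # [1.0, 1.0, 1.0, 1.0, 1.0]
```

The gap-based version stretches the event at 2 s across the stretch where the label was not detected, which is the kind of inflated duration described here.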