@alpizano
I have successfully extracted position data from the YoloV3 Network. You can see it in this
'cropped_short.mp4.txt' text file
The format of the data is as follows
Point 1 x-value
Point 1 y-value
Point 2 x-value
Point 2 y-value
Class Index
Confidence score
782
136
872
191
1
0.919052
12
299
49
373
0
0.760167
745
130
831
182
1
0.836964
47
199
98
259
0
0.753106
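A minimal parser for this six-values-per-detection format could look like the following (Python sketch; `parse_detections` is a hypothetical helper name, not part of the YoloV3 code):

```python
def parse_detections(path):
    """Parse the flat text format described above: one value per line,
    six lines per detection -- x1, y1, x2, y2, class index, confidence.
    Returns a list of (x1, y1, x2, y2, cls, conf) tuples."""
    with open(path) as f:
        values = [line.strip() for line in f if line.strip()]
    detections = []
    for i in range(0, len(values), 6):
        x1, y1, x2, y2 = (int(v) for v in values[i:i + 4])
        cls = int(values[i + 4])
        conf = float(values[i + 5])
        detections.append((x1, y1, x2, y2, cls, conf))
    return detections
```

For example, `parse_detections('cropped_short.mp4.txt')` would return the four detections listed above as tuples.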
Since Point 1 and Point 2 define a rectangle, find the box's center by taking the midpoint of the line segment from P1 to P2.
Then find the displacement between this midpoint and the previously identified midpoint for the same class; this gives you the relative velocity between frames of the video. The difference between successive velocities gives the acceleration.
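The midpoint, velocity, and acceleration steps above can be sketched like this (Python; velocity is expressed here as a per-frame displacement vector, an assumption on my part):

```python
def midpoint(x1, y1, x2, y2):
    """Center of the box defined by P1 = (x1, y1) and P2 = (x2, y2)."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def velocity(prev_mid, curr_mid):
    """Per-frame displacement of a class's midpoint (relative velocity)."""
    return (curr_mid[0] - prev_mid[0], curr_mid[1] - prev_mid[1])

def acceleration(prev_vel, curr_vel):
    """Per-frame change in velocity."""
    return (curr_vel[0] - prev_vel[0], curr_vel[1] - prev_vel[1])
```

Using the first class-1 detections above, `midpoint(782, 136, 872, 191)` gives `(827.0, 163.5)` and `midpoint(745, 130, 831, 182)` gives `(788.0, 156.0)`, so the velocity between those frames would be `(-39.0, -7.5)`.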
Define a file format for this data and then encode it into polar coordinates, or transform into polar coordinates first and then perform the calculations above. These values will then be fed into another network (i.e. a predictive network) for training.
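The Cartesian-to-polar transform is straightforward with the standard library (sketch; the choice of origin and angle convention, radians measured from the positive x-axis, are assumptions):

```python
import math

def to_polar(x, y):
    """Cartesian (x, y) -> polar (r, theta), theta in radians."""
    return (math.hypot(x, y), math.atan2(y, x))

def to_cartesian(r, theta):
    """Inverse transform, useful for sanity-checking round trips."""
    return (r * math.cos(theta), r * math.sin(theta))
```

Whichever order is chosen, applying the same convention consistently to midpoints, velocities, and accelerations keeps the training data for the predictive network uniform.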
@alpizano
Note, I changed the format. Now each line corresponds to a JSON array containing all detections from a specific frame. You can see an example here:
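Since the example file isn't reproduced here, the exact layout is an assumption, but if each line is a JSON array of detections and each detection keeps the six values from the old format (`[x1, y1, x2, y2, class_index, confidence]`), a reader for the new format could be as simple as:

```python
import json

def parse_frames(path):
    """One JSON array per line -> list of frames, each frame a list of
    detections. Assumes each detection keeps the six values from the
    old flat format: [x1, y1, x2, y2, class_index, confidence]."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]
```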