ruiyan1995 / Group-Activity-Recognition

A novel Participation-Contributed Temporal Dynamic Model for Group Activity Recognition

question #7

Closed NaifahNurya closed 5 years ago

NaifahNurya commented 5 years ago

Hello, thank you for the updated version of the code.

Sorry for this question (it is not an issue). I found that you use the folder of the video plus the frame id of the target frame, because the other frames are not labelled (no bounding boxes and their corresponding labels).

If I have the bounding boxes and the corresponding action labels for all frames, I thought there might be a possibility of using all frames instead of only the frame id of the target frame. Which part should I consider for this kind of modification in Processing.py and VDTrack.py, or in any other file?

ruiyan1995 commented 5 years ago

> Hello, thank you for the updated version of the code.
>
> Sorry for this question (it is not an issue). I found that you use the folder of the video plus the frame id of the target frame, because the other frames are not labelled (no bounding boxes and their corresponding labels).
>
> If I have the bounding boxes and the corresponding action labels for all frames, I thought there might be a possibility of using all frames instead of only the frame id of the target frame. Which part should I consider for this kind of modification in Processing.py and VDTrack.py, or in any other file?

The main functions of Pre-Processing are tracking, ranking by MI, and generating the train/test records. In your case you don't need to track persons with OpenCV; just use the existing bounding boxes, as you mentioned. Do you mean that the action labels and group activity labels vary from frame to frame? That would be hard to handle with the existing framework.
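For illustration only, here is a minimal sketch (not code from this repository) of how existing ground-truth boxes could stand in for the tracking step; the annotation layout and function name are assumptions based on the format described later in this thread:

```python
# Sketch: build per-person box sequences from existing ground-truth annotations,
# i.e. the kind of output the OpenCV tracking step would otherwise produce.
# Assumed layout per line: frame_id activity_label x y w h action x y w h action ...
from collections import defaultdict

def tracks_from_groundtruth(annotation_path):
    """Return {person_idx: {frame_id: (x, y, w, h)}} without any tracking."""
    tracks = defaultdict(dict)
    with open(annotation_path) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            frame_id = fields[0]
            person_fields = fields[2:]  # skip frame_id and the activity label
            # Person identity is assumed to be the position within the line,
            # i.e. the players are listed in the same order in every frame.
            for p, i in enumerate(range(0, len(person_fields), 5)):
                x, y, w, h = map(int, person_fields[i:i + 4])
                tracks[p][frame_id] = (x, y, w, h)
    return tracks
```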

NaifahNurya commented 5 years ago

Thank you for the reply,

> Do you mean that the action labels and group activity labels vary from frame to frame? That would be hard to handle with the existing framework.

For the above question, what I mean is that the data are in the same format as the VD dataset, except that these datasets have the bounding box and action (for each player) and the group activity label for every frame in the video. Each frame has an activity label and, for each player, a bounding box and an action, i.e. frame_id, activity_label, (x,y,w,h)1, action1, (x,y,w,h)2, action2, ...
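For concreteness, a small sketch (mine, not from the repository) of how one such per-frame line could be decoded, assuming the fields are whitespace-separated in the order shown above:

```python
# Sketch: decode one annotation line of the assumed form
#   frame_id activity_label x y w h action x y w h action ...
def parse_frame_line(line):
    fields = line.split()
    frame_id, activity_label = fields[0], fields[1]
    persons = []
    rest = fields[2:]
    for i in range(0, len(rest), 5):  # 5 fields per player: box (x, y, w, h) + action
        x, y, w, h = map(int, rest[i:i + 4])
        action = rest[i + 4]
        persons.append(((x, y, w, h), action))
    return frame_id, activity_label, persons

# Example with made-up values:
# parse_frame_line("0035 l-spike 12 40 60 120 spiking 200 45 58 118 blocking")
```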

ruiyan1995 commented 5 years ago

> Thank you for the reply,
>
> > Do you mean that the action labels and group activity labels vary from frame to frame? That would be hard to handle with the existing framework.
>
> For the above question, what I mean is that the data are in the same format as the VD dataset, except that these datasets have the bounding box and action (for each player) and the group activity label for every frame in the video. Each frame has an activity label and, for each player, a bounding box and an action, i.e. frame_id, activity_label, (x,y,w,h)1, action1, (x,y,w,h)2, action2, ...

I see, but in VD each person shares one label over the T frames. If the action label of a person changes within a short clip (T=10) in your dataset, how would you process it?

NaifahNurya commented 5 years ago

The idea is the same as what you wrote: each person, e.g. for T=10, shares the same action label, i.e. player 1 with bounding box (x,y,w,h) has the same action label for all 10 frames. The same applies to player 2: its bounding box (x,y,w,h) carries the same action label for all ten frames.

The difference is that here we have the bounding boxes for every frame, and the action label of each player in each frame.

However, in VD we have 20 frames before the target frame + the target frame + 20 frames after the target frame, and only the target frame is labelled (frame id, activity label, bounding boxes, action labels).

ruiyan1995 commented 5 years ago

> The idea is the same as what you wrote: each person, e.g. for T=10, shares the same action label, i.e. player 1 with bounding box (x,y,w,h) has the same action label for all 10 frames. The same applies to player 2: its bounding box (x,y,w,h) carries the same action label for all ten frames.
>
> The difference is that here we have the bounding boxes for every frame, and the action label of each player in each frame.
>
> However, in VD we have 20 frames before the target frame + the target frame + 20 frames after the target frame, and only the target frame is labelled (frame id, activity label, bounding boxes, action labels).

I think you can just crop the person images using your annotations, without any other tracking algorithm. The labels for action and activity are handled the same way as before.
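As a rough sketch of that suggestion (the paths and the (x, y, w, h) box format here are assumptions, not the repository's actual interface), cropping person patches directly from the annotated boxes could look like:

```python
# Sketch: crop person images from a frame using existing ground-truth boxes,
# instead of running the tracker in Pre/. Box format (x, y, w, h) is assumed.
import os
from PIL import Image

def crop_persons(frame_path, boxes, out_dir):
    """Save one cropped image per (x, y, w, h) box in `boxes`."""
    os.makedirs(out_dir, exist_ok=True)
    frame = Image.open(frame_path)
    for idx, (x, y, w, h) in enumerate(boxes):
        patch = frame.crop((x, y, x + w, y + h))  # PIL expects (left, top, right, bottom)
        patch.save(os.path.join(out_dir, f"person_{idx}.jpg"))

# Usage with hypothetical paths and boxes:
# crop_persons("videos/0/3596/3596.jpg", [(12, 40, 60, 120), (200, 45, 58, 118)], "crops/0/3596")
```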

NaifahNurya commented 5 years ago

So I have to ignore Track.py under the Pre folder.