weirme / FCSN

A PyTorch reimplementation of FCSN in paper "Video Summarization Using Fully Convolutional Sequence Networks"

implementation of get_oracle_summary function #15

Open pcshih opened 5 years ago

pcshih commented 5 years ago

https://github.com/weirme/Video_Summary_using_FCSN/blob/0895cccbb2a488369b1bfc7d2c087b3050250898/make_dataset.py#L70

What is the meaning of this function?

weirme commented 5 years ago

This function generates a single summary from the summaries of the 20 users in the dataset.

pcshih commented 5 years ago

Which paragraph of the FCSN paper, or of another paper, is the implementation based on?

weirme commented 5 years ago

Chapter 3.1 of this paper: "Diverse sequential subset selection for supervised video summarization". In my implementation, the greedy algorithm selects, at each step, the frame marked by the most users.

pcshih commented 5 years ago

After reading Chapter 3.1, I still cannot understand the process. Given 3 human summaries over 5 frames:

A: [1,0,1,1,0]
B: [0,0,1,0,0]
C: [0,0,0,1,0]

How do I get the final summary? First: count how many times each frame was selected -> [1,0,2,2,0]. Second: I have no idea...

weirme commented 5 years ago

In my implementation, the oracle summary is initialized as [0, 0, 0, 0, 0]. The most-selected frame is picked first (here the third), so the oracle summary becomes [0, 0, 1, 0, 0]. Then I check whether the F-score between the oracle summary and the user summaries increases after adding this frame. If it does, the next frame is selected; otherwise the procedure ends. But this is just my implementation: I didn't find a specific description of the greedy algorithm used in the paper, so I'm not sure the algorithm really works like this.
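The greedy procedure described in this comment can be sketched roughly as follows. This is an illustrative sketch, not the code from `make_dataset.py`: the names `f_score`, `mean_f_score`, and `greedy_oracle_summary` are hypothetical, and two details are assumptions, namely that the F-score is averaged over all users and that the frame which fails to improve the score is dropped before stopping.

```python
import numpy as np


def f_score(pred, target):
    """F-score between two binary frame-selection vectors."""
    overlap = np.sum(pred * target)
    if overlap == 0:
        return 0.0
    precision = overlap / pred.sum()
    recall = overlap / target.sum()
    return float(2 * precision * recall / (precision + recall))


def mean_f_score(summary, user_summaries):
    """Assumption: the score of a candidate is its mean F-score over users."""
    return float(np.mean([f_score(summary, u) for u in user_summaries]))


def greedy_oracle_summary(user_summaries):
    """Greedily add the most-voted frames while the mean F-score improves."""
    user_summaries = np.asarray(user_summaries)
    votes = user_summaries.sum(axis=0)  # how many users picked each frame
    # Descending vote count; the stable sort keeps earlier frames first on ties.
    order = np.argsort(-votes, kind="stable")

    oracle = np.zeros(user_summaries.shape[1], dtype=int)
    best = 0.0
    for idx in order:
        oracle[idx] = 1
        score = mean_f_score(oracle, user_summaries)
        if score <= best:
            oracle[idx] = 0  # assumption: undo the frame that did not help
            break
        best = score
    return oracle, best


users = [[1, 0, 1, 1, 0],  # A
         [0, 0, 1, 0, 0],  # B
         [0, 0, 0, 1, 0]]  # C
oracle, best = greedy_oracle_summary(users)
print(oracle.tolist(), round(best, 4))
```

On the three-user example from this thread, the vote counts are [1,0,2,2,0], the third and fourth frames are added in turn, and adding the first frame lowers the mean F-score, so the sketch stops with [0, 0, 1, 1, 0].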

pcshih commented 5 years ago

Where does the FCSN paper mention that it uses "Diverse sequential subset selection for supervised video summarization" to generate a single summary from the users' summaries?

weirme commented 5 years ago

This method is mentioned in the supplementary materials of the paper "Video Summarization with Long Short-term Memory".

pcshih commented 5 years ago

After reading the paragraph, I implemented it.

https://github.com/pcshih/pytorch-FCSN/blob/7d4f874f6c71d5b279b6e26a6ee4882460230fc9/make_dataset.py#L84

Is my understanding identical to yours?

But the performance is quite bad...

weirme commented 5 years ago

Have you printed the final F-score between the generated oracle summary and the user summaries?

pcshih commented 5 years ago

Did you mean the parameter "best_fscore"?

pcshih commented 5 years ago

[Screenshots: best_fscore_1, best_fscore_2]

They seem slightly different.

pcshih commented 5 years ago

https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164

I found that TVSum uses avg but SumMe uses max when evaluating. After I changed SumMe to max, my result got better.

But I do not know why this method is used...

[Screenshot: FCSN_1D_summe_eval_max]

pcshih commented 5 years ago

Could you share the TVSum videos via your Google Drive? TVSum requires authorization...

weirme commented 5 years ago

> https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164
>
> I found that TVSum uses avg but SumMe uses max when evaluating. After I changed SumMe to max, my result got better.
>
> But I do not know why this method is used...
>
> [Screenshot: FCSN_1D_summe_eval_max]

Is this result on SumMe? It seems close to the one in the paper!

weirme commented 5 years ago

> Could you share the TVSum videos via your Google Drive? TVSum requires authorization...

Wait a moment, I'm now uploading it...

pcshih commented 5 years ago

> > https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164
> >
> > I found that TVSum uses avg but SumMe uses max when evaluating. After I changed SumMe to max, my result got better.
> >
> > But I do not know why this method is used...
> >
> > [Screenshot: FCSN_1D_summe_eval_max]
>
> Is this result on SumMe? It seems close to the one in the paper!

Yes, it is SumMe.

pcshih commented 5 years ago

> > Could you share the TVSum videos via your Google Drive? TVSum requires authorization...
>
> Wait a moment, I'm now uploading it...

Thank you

weirme commented 5 years ago

Here is the link.

pcshih commented 5 years ago

Got it. Thank you very much. Did you figure this out? https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164

weirme commented 5 years ago

Maybe it is a default setting in the evaluation? I also think it's strange... I also noticed that the key frames selected for SumMe videos differ greatly from user to user: the F-score between the generated oracle summary and the user summaries is only about 50%, while it is nearly 70% on TVSum. In that case, producing a summary close to every user seems difficult. Could that be a reason to take the max?
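To make the avg-versus-max difference discussed here concrete, the following minimal sketch scores one machine summary against several user summaries and reduces the per-user F-scores either by mean or by maximum. The function names and the `eval_metric` argument are illustrative assumptions, not the API of the linked repository.

```python
import numpy as np


def f_score(pred, target):
    """F-score between two binary frame-selection vectors."""
    overlap = np.sum(pred * target)
    if overlap == 0:
        return 0.0
    precision = overlap / pred.sum()
    recall = overlap / target.sum()
    return float(2 * precision * recall / (precision + recall))


def evaluate(machine_summary, user_summaries, eval_metric="avg"):
    """Reduce per-user F-scores with 'avg' (mean) or 'max' (best match)."""
    machine_summary = np.asarray(machine_summary)
    scores = [f_score(machine_summary, np.asarray(u)) for u in user_summaries]
    if eval_metric == "avg":
        return float(np.mean(scores))
    if eval_metric == "max":
        return float(np.max(scores))
    raise ValueError(f"unknown eval_metric: {eval_metric!r}")


users = [[1, 0, 1, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 1, 0]]
machine = [0, 0, 1, 1, 0]
print("avg:", round(evaluate(machine, users, "avg"), 4))
print("max:", round(evaluate(machine, users, "max"), 4))
```

When users disagree strongly, the mean is pulled down by every user the summary fails to match, whereas the max only rewards agreement with the closest user, which is consistent with the higher numbers reported after switching to max.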

pcshih commented 5 years ago

I agree with your opinion. Let's take this evaluation method for granted. I am also implementing this paper, whose architecture is based on FCSN, but there are some problems...

weirme commented 5 years ago

I have not read this paper yet; its architecture looks complicated.

pcshih commented 5 years ago

Do you have any ideas about the unsupervised version of FCSN?

weirme commented 5 years ago

No... I skipped that part when reading the paper...

pcshih commented 5 years ago

Shall we implement that part?

weirme commented 5 years ago

I will try to implement it after reading that part, but there may be some problems because my computer at home doesn't have an NVIDIA GPU :sweat_smile::sweat_smile:

pcshih commented 5 years ago

I am counting on you.

Pager07 commented 4 years ago

> Here is the link.

Thanks for this.