e-apostolidis / PGL-SUM

A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization" (IEEE ISM 2021)

user summary missing #2

Closed Harsha0621 closed 2 years ago

Harsha0621 commented 2 years ago

Hello, I cannot find the user-summary binary vectors mentioned in the README in the h5 file.

mpalaourg commented 2 years ago

Hello @Harsha0621 and thank you for your interest in our work.

I'm not sure I fully understand your question, but I'll guess you are referring to either gtsummary or user_summary. To read them, you must download the data folder and unrar the two datasets. Then you can access them via the following script:

import h5py
import numpy as np

dataset = "SumMe"  # specify "TVSum" or "SumMe"
file_path = f"../data/{dataset}/eccv16_dataset_{dataset.lower()}_google_pool5.h5"  # your file path to the datasets
data = h5py.File(file_path, "r")
for video in data.keys():
    # gtsummary: the ground-truth binary summary of the video
    gtsummary = np.array(data[video]["gtsummary"])
    # user_summary: one binary summary per human annotator
    user_summary = np.array(data[video]["user_summary"])

    print(gtsummary.shape, user_summary.shape)

data.close()
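
For these datasets, you should see that user_summary is 2-D (one binary vector per human annotator, over the original video frames), while gtsummary is a single binary vector per video.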

If I have misunderstood your question, feel free to ask again!

Harsha0621 commented 2 years ago

Thank you for your reply. Actually, I wanted to know how the annotations are converted into the binary vectors in user_summary.


mpalaourg commented 2 years ago

If you have your annotations as frame-level importance scores (as in our work), you first need a temporal segmentation of the video (i.e. KTS shots). In the h5 files, the KTS shots are stored under the change_points key. The shot-level importance score is then the average of the importance scores of the frames inside each shot. Finally, you select the shots that form the summary (this produces the binary vector) subject to a time budget; this selection step is an instance of the 0/1 knapsack problem.

Everything I described in words above, you can find in code here. The time budget is 15% of the full video duration. Most of that code translates the predicted frame-level importance scores from the sub-sampled (2 fps) video back to the original video length; a sketch of the selection step itself follows below.
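
In case it helps, the selection step could look roughly like this. This is a minimal illustration of my own, not the repo's code; it assumes change_points holds inclusive [start, end] frame indices and that frame_scores is already at the original video length:

import numpy as np

def knapsack(values, weights, capacity):
    # Classic 0/1 knapsack via dynamic programming; returns the indices
    # of the selected items (here: shots).
    n = len(values)
    dp = np.zeros((n + 1, capacity + 1))
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for c in range(capacity + 1):
            dp[i, c] = dp[i - 1, c]
            if w <= c:
                dp[i, c] = max(dp[i, c], dp[i - 1, c - w] + v)
    selected, c = [], capacity
    for i in range(n, 0, -1):  # backtrack to recover the chosen shots
        if dp[i, c] != dp[i - 1, c]:
            selected.append(i - 1)
            c -= int(weights[i - 1])
    return selected

def scores_to_binary_summary(frame_scores, change_points, n_frames, budget=0.15):
    # Shot-level score = average of the frame-level scores inside each shot
    shot_scores = [frame_scores[s:e + 1].mean() for s, e in change_points]
    shot_lengths = [e - s + 1 for s, e in change_points]
    # Knapsack: maximize the total shot score under a 15% duration budget
    capacity = int(budget * n_frames)
    selected = knapsack(shot_scores, shot_lengths, capacity)
    # Binary vector: 1 for every frame of a selected shot, 0 otherwise
    summary = np.zeros(n_frames, dtype=np.int8)
    for idx in selected:
        s, e = change_points[idx]
        summary[s:e + 1] = 1
    return summary

The DP is O(num_shots x capacity), which is cheap for the few dozen shots per video in SumMe/TVSum.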

mpalaourg commented 2 years ago

Closing due to inactivity. Feel free to open a new issue for any further questions or queries.