Closed vinaysworld closed 6 years ago
Hi, I think you are using Python 3, where dictionary.keys() returns a view object that cannot be indexed. What you can do is replace line 63 with
#dataset_keys = dataset.keys()
dataset_keys = list(dataset.keys())
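A minimal sketch of the Python 3 behavior, using a plain dict in place of the h5py file object (the view types differ slightly, but the fix is the same):

```python
# In Python 3, dict.keys() returns a view object, not a list,
# so it cannot be indexed directly.
dataset = {"video_1": "features_1", "video_2": "features_2"}

keys_view = dataset.keys()
# keys_view[0]  # would raise: TypeError: 'dict_keys' object is not subscriptable

# Wrapping the view in list() restores index access:
dataset_keys = list(dataset.keys())
print(dataset_keys[0])  # prints "video_1"
```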
Thanks for spotting this. I have updated this in the scripts. Let me know if you have further issues.
Hi, I have completed the whole process and got the log file with the final outcome. But as I read in your paper, you used video 18 in the TVSum dataset for generating the summary / importance scores. Can I also do this? As I am a student and new to neural networks, can you explain how to give a video as input and get the important frames or summarized frames, and all the necessary graphs? Thank you
If you want to get the raw output of the summarization network (i.e. importance scores, Fig.3 in our paper), you would need to save the output of probs = net.model_inference(data_x)
in this line (e.g. to a new h5 file).
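For example, a hypothetical sketch of saving the scores to a new h5 file; the file name "importance_scores.h5", the key "video_18/probs", and the random array standing in for the network output are all placeholders, not part of the repo:

```python
import h5py
import numpy as np

# probs stands in for the output of net.model_inference(data_x):
# one importance score per sampled frame.
probs = np.random.rand(320).astype(np.float32)

# Save the raw importance scores to a new h5 file, keyed by video name.
with h5py.File("importance_scores.h5", "w") as f:
    f.create_dataset("video_18/probs", data=probs)

# Later, read the scores back, e.g. for plotting Fig.3-style curves.
with h5py.File("importance_scores.h5", "r") as f:
    scores = f["video_18/probs"][...]
print(scores.shape)  # (320,)
```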
To get the summarized frames (Fig.2 in our paper), you can save machine_summary = vsum_tools.generate_summary(probs, cps, n_frames, nfps, positions)
(this line). machine_summary
is a binary vector indicating which frames are included in the summary.
Hope this helps.
I'm not sure I understand what you mean by saving machine_summary = vsum_tools.generate_summary(probs, cps, n_frames, nfps, positions).
Is there a provision to give one video file as input via the command line? Could you please elaborate on this?
@AdarshMJ
For example, the input video has 5 frames, machine_summary = [0, 0, 1, 1, 0]
where positions with value 1 mean these frames are keyframes (summary). You need to manually pick those frames (in this example, frame3 and frame4).
Thank you so much, I got it. I wanted to know whether those indices actually represent the frame numbers, or whether they are just indices?
@AdarshMJ
The values represent whether the frames are selected. To find those indices, you can do something like summary.nonzero(). I have just updated the code to include the visualization tool, so you can visualize the score vs. gtscore.
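A small numpy sketch of picking keyframes with nonzero(); the 5-frame machine_summary here is a toy placeholder:

```python
import numpy as np

# Toy output for a 5-frame video: frame3 and frame4 (1-indexed) are keyframes.
machine_summary = np.array([0, 0, 1, 1, 0])

# nonzero() gives the 0-based indices of the selected frames.
keyframe_indices = machine_summary.nonzero()[0]
print(keyframe_indices)  # [2 3]

# To save the actual frames, you would index the decoded video with these,
# e.g. keyframes = [frames[i] for i in keyframe_indices]
```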
Thank you so much, I will check it out. I wanted to know how to create the h5 file for my own dataset. I checked the links and readme.txt, but there are many parameters that have to be included for the h5 file to be created, like gt scores and so on. Is it possible to update your code to generate this data? That would be helpful.
It might be unnecessary to do this, because the way you generate the image features and ground-truth scores/summaries will be different and dependent on your purpose. You can follow this once you have those data.
Okay got it.
f.create_dataset(name + '/features', data=data_of_name)
f.create_dataset(name + '/gtscore', data=data_of_name)
f.create_dataset(name + '/user_summary', data=data_of_name)
f.create_dataset(name + '/change_points', data=data_of_name)
f.create_dataset(name + '/n_frame_per_seg', data=data_of_name)
f.create_dataset(name + '/n_frames', data=data_of_name)
f.create_dataset(name + '/picks', data=data_of_name)
f.create_dataset(name + '/n_steps', data=data_of_name)
f.create_dataset(name + '/gtsummary', data=data_of_name)
f.create_dataset(name + '/video_name', data=data_of_name)
Of all these parameters, can I just have the video name, number of frames, and features as part of the data? Or are all the parameters necessary? Which parameters have to be included?
If you just want to train the policy network, you will need the features only. Please double-check vsum_train.
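A hedged sketch of a minimal h5 file with features only; the video name "video_1", the frame count, and the feature dimensionality are placeholders, not the authors' exact pipeline (the paper uses CNN features, e.g. 1024-dim GoogLeNet pool5):

```python
import h5py
import numpy as np

n_frames = 300   # number of sampled frames from your video
feat_dim = 1024  # feature dimensionality, e.g. GoogLeNet pool5
features = np.random.rand(n_frames, feat_dim).astype(np.float32)

with h5py.File("my_dataset.h5", "w") as f:
    f.create_dataset("video_1/features", data=features)

# Sanity check: the training loop reads dataset[key]['features'][...]
with h5py.File("my_dataset.h5", "r") as f:
    data_x = f["video_1"]["features"][...]
print(data_x.shape)  # (300, 1024)
```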
I checked out your vsum_train.py code. In these lines
for index in indices:
    key = dataset_keys[index]
    data_x = dataset[key]['features'][...].astype(_DTYPE)
    L_distance_mat = cdist(data_x, data_x, 'euclidean')
    L_dissim_mat = 1 - np.dot(data_x, data_x.T)
    if ignore_distant_sim:
        inds = np.arange(data_x.shape[0])[:,None]
        inds_dist = cdist(inds, inds, 'minkowski', 1)
        L_dissim_mat[inds_dist > distant_sim_thre] = 1
    rewards = net.model_train(data_x, learn_rate, L_dissim_mat, L_distance_mat, blrwds[key])
    blrwds[key] = 0.9 * blrwds[key] + 0.1 * rewards.mean()
    epoch_reward += rewards.mean()
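To illustrate the two matrices built in the loop above, here is a numpy-only sketch that replaces scipy's cdist with equivalent numpy code; the random features and the threshold value are placeholders:

```python
import numpy as np

data_x = np.random.rand(6, 4).astype(np.float32)  # 6 frames, 4-dim features

# Euclidean distance matrix, what cdist(data_x, data_x, 'euclidean') computes.
diff = data_x[:, None, :] - data_x[None, :, :]
L_distance_mat = np.sqrt((diff ** 2).sum(-1))

# Dissimilarity matrix: 1 minus the inner product between feature vectors.
L_dissim_mat = 1 - np.dot(data_x, data_x.T)

# ignore_distant_sim: frame pairs more than distant_sim_thre apart in time
# are forced to maximal dissimilarity (1), so only temporally close frames
# count as similar.
distant_sim_thre = 2
inds = np.arange(data_x.shape[0])
inds_dist = np.abs(inds[:, None] - inds[None, :])  # Minkowski p=1 on indices
L_dissim_mat[inds_dist > distant_sim_thre] = 1

print(L_distance_mat.shape, L_dissim_mat.shape)  # (6, 6) (6, 6)
```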
This means the training is done using only the features extracted from the videos, right? It does not take into account the rest of the parameters like gtscore, user_summary, and so on?
@AdarshMJ Yes, only the features are needed for training.
Hi, can you help me with this line? What modification should I do?

Traceback (most recent call last):
  File "vsum_train.py", line 155, in <module>
    train_dataset_path=args.dataset)
  File "vsum_train.py", line 83, in train
    key = dataset_keys[index]
TypeError: 'KeysView' object does not support indexing