hx173149 / C3D-tensorflow

C3D is a modified version of BVLC tensorflow to support 3D ConvNets.
MIT License

What is crop_mean.npy's role in the program? How to compute it? #25

Closed perfectFeng closed 7 years ago

perfectFeng commented 7 years ago

Thanks for sharing this code. I always get an error when downloading crop_mean.npy from Dropbox, so I want to generate it myself. As its name suggests, I think it represents a mean, perhaps over RGB? But I don't know how to compute it. I hope you can help.

LiangXu123 commented 7 years ago

It's just a common way to normalize data. If you are familiar with Caffe, there is a corresponding mean.binaryproto file that is used to normalize the input to [0,1] or [-1,1]. In TensorFlow we usually use tf.image.per_image_standardization(image) to do the job, but in this particular case we use a C3D pre-trained model converted from Caffe, so that is why the mean file is used. The download link still works fine for me so far (2017.10.14). If you are in China and cannot access Google or Dropbox, you can try Lantern or search Baidu for "老D博客".
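To make the two normalization styles mentioned above concrete, here is a minimal NumPy sketch. It is not code from this repository: the function names `subtract_mean` and `per_clip_standardize` are hypothetical, the clip shape (16, 112, 112, 3) is assumed from the mean array built later in this thread, and the constant mean of 128 is only a stand-in for a real crop_mean.npy.

```python
import numpy as np


def subtract_mean(clip, crop_mean):
    """Center a clip by subtracting a precomputed per-pixel mean (Caffe-style)."""
    clip = np.asarray(clip, dtype=np.float32)
    crop_mean = np.asarray(crop_mean, dtype=np.float32)
    if clip.shape != crop_mean.shape:
        raise ValueError("clip and mean must have the same shape")
    return clip - crop_mean


def per_clip_standardize(clip):
    """NumPy analogue of per-image standardization: zero mean, unit variance."""
    clip = np.asarray(clip, dtype=np.float32)
    # Lower bound on the std avoids division by zero for constant inputs.
    std = max(float(clip.std()), 1.0 / np.sqrt(clip.size))
    return (clip - clip.mean()) / std


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic clip: 16 frames of 112 x 112 pixels with 3 channels (assumed shape).
    clip = rng.integers(0, 256, size=(16, 112, 112, 3)).astype(np.float32)
    # Stand-in for np.load("crop_mean.npy"); the real file holds dataset statistics.
    crop_mean = np.full(clip.shape, 128.0, dtype=np.float32)
    print(subtract_mean(clip, crop_mean).shape)
    print(per_clip_standardize(clip).shape)
```

The first function depends on statistics of the training data, which is why a model converted from Caffe needs the matching mean file. The second uses only the statistics of the clip itself.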

hx173149 commented 7 years ago

Yes, @cc786537662 is right. My crop_mean.npy was converted from the Caffe version's mean file.

LiangXu123 commented 7 years ago

Hey @hx173149, thank you for your work. As I mentioned in "Pretrained model on sport1m" #1, from the link https://www.dropbox.com/sh/8wcjrcadx4r31ux/AAAkz3dQ706pPO8ZavrztRCca?dl=0 I can only get the model that was pre-trained on Sports-1M and then fine-tuned on UCF101. How can I get the pure Sports-1M pre-trained model that has not been fine-tuned on UCF101? Can you help me with that? Thank you so much!

hx173149 commented 7 years ago

Hi @cc786537662, I am sorry, #19 explains why I cannot share the Sports-1M model. If you want more detail about our transfer method, you can add me on QQ: 458728037.

b-hakim commented 6 years ago

Hello,

I have written a small Python script to calculate the mean of the dataset for training split 1, as follows:

import os

import cv2
import numpy as np


def calculate_ucf_101_mean(ucf_train_lst, num_frames=16, new_w_h_size=112):
    # Running sum over all clips; divided by the clip count at the end.
    mean = np.zeros((num_frames, new_w_h_size, new_w_h_size, 3))
    count = 0

    with open(ucf_train_lst) as f:
        for line in f:
            # Fields read from each line: frame directory, start frame, label.
            vid_path = line.split()[0]
            start_pos = int(line.split()[1])
            lbl = int(line.split()[2])  # parsed but not needed for the mean

            stack_frames = []

            for i in range(start_pos, start_pos+num_frames):
                # Frames are expected as zero-padded files such as 000001.jpg.
                img = cv2.imread(os.path.join(vid_path, "{:06}.jpg".format(i)))
                img = cv2.resize(img, (new_w_h_size, new_w_h_size))
                stack_frames.append(img)

            stack_frames = np.array(stack_frames)
            mean += stack_frames
            count += 1
    mean /= float(count)
    print(mean)
    return mean

mean_ucf101_16 = calculate_ucf_101_mean("trainlist01.txt")
np.save("crop_mean_16.npy", mean_ucf101_16)

I compared it to the crop_mean that you provided, and it is different. Any reason why?

laura-wang commented 6 years ago

@b-safwat I guess "new_w_h_size=112" is wrong. You should first resize the images to [128, 171] and then crop them to 112 x 112. Hope this is helpful.

b-hakim commented 6 years ago

@laura-wang Why is it 128x171?

laura-wang commented 6 years ago

@b-safwat In the original C3D paper, the authors say that they first resize the frames to 128 x 171 and then randomly crop them to 112 x 112. If you check the C3D-caffe code, it does the same thing when computing the mean.
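The resize-then-crop order described above can be sketched as follows. This is an illustration under assumptions, not the C3D-caffe implementation: the helper names `center_crop` and `mean_of_cropped_clips` are hypothetical, and a center crop is assumed for the mean, because the thread does not say which crop position the provided crop_mean.npy used. Frames are taken as already resized to 128 rows by 171 columns. With OpenCV that would be `cv2.resize(img, (171, 128))`, since `dsize` is given as (width, height).

```python
import numpy as np

RESIZE_H, RESIZE_W = 128, 171  # frame size after resizing, per the thread
CROP = 112                     # spatial size of the network input


def center_crop(frames, crop=CROP):
    """Crop the central crop x crop window from frames shaped (..., H, W, C)."""
    h, w = frames.shape[-3], frames.shape[-2]
    if h < crop or w < crop:
        raise ValueError("frames are smaller than the crop size")
    top = (h - crop) // 2
    left = (w - crop) // 2
    return frames[..., top:top + crop, left:left + crop, :]


def mean_of_cropped_clips(clips, crop=CROP):
    """Average center-cropped clips, each shaped (num_frames, H, W, 3)."""
    total = None
    count = 0
    for clip in clips:
        cropped = center_crop(np.asarray(clip, dtype=np.float64), crop)
        total = cropped.copy() if total is None else total + cropped
        count += 1
    if count == 0:
        raise ValueError("no clips given")
    return total / count


if __name__ == "__main__":
    # Two synthetic clips stand in for resized UCF101 frames.
    clips = [
        np.zeros((16, RESIZE_H, RESIZE_W, 3)),
        np.full((16, RESIZE_H, RESIZE_W, 3), 2.0),
    ]
    print(mean_of_cropped_clips(clips).shape)
```

Resizing straight to 112 x 112 squashes the whole frame into the input, while resizing to 128 x 171 and cropping keeps the aspect ratio closer to the original and covers only the central region. That difference alone can make the computed mean differ from the provided file.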