Open 123liluky opened 4 years ago
Yes, in order to have fixed-size tensors as inputs for the CNNs.
Hi Tseng, thanks for your repo. For X, y: what is the size of X here? According to the conv1 structure in EncoderCNN, the input should have 4 dimensions, but you are passing a 5-dimensional tensor. I don't understand this:
for t in range(x_3d.size(1)):
    # CNNs
    x = self.conv1(x_3d[:, t, :, :, :])
    x = self.conv2(x)
    x = self.conv3(x)
    x = self.conv4(x)
    x = x.view(x.size(0), -1)
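A minimal sketch of why the loop above can feed a 5-D input to 4-D conv layers. It assumes the input layout is (batch, time, channels, height, width), which is inferred from the loop over `x_3d.size(1)` and is not confirmed in this thread. NumPy arrays stand in for torch tensors here, since this kind of integer indexing behaves the same way in both. The sizes are hypothetical.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch, frames, channels, height, width = 2, 28, 3, 16, 16

# Assumed 5-D clip batch: (batch, time, channels, height, width).
x_3d = np.zeros((batch, frames, channels, height, width), dtype=np.float32)

frame_shapes = []
for t in range(x_3d.shape[1]):
    # Indexing the time axis with an integer drops that axis,
    # so each slice is 4-D: (batch, channels, height, width).
    frame = x_3d[:, t, :, :, :]
    frame_shapes.append(frame.shape)

print(x_3d.ndim)        # number of dims of the whole clip batch
print(frame_shapes[0])  # shape of one per-frame slice
print(len(frame_shapes))  # number of slices, one per time step
```

So the 2-D conv layers would only ever see one 4-D frame batch per time step, even though the tensor passed into the encoder is 5-D.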
In UCF101_ResNetCRNN.py:
begin_frame, end_frame, skip_frame = 1, 29, 1
selected_frames = np.arange(begin_frame, end_frame, skip_frame).tolist()
train_set, valid_set = Dataset_CRNN(data_path, train_list, train_label, selected_frames, transform=transform), \
                       Dataset_CRNN(data_path, test_list, test_label, selected_frames, transform=transform)
So you only use the first 28 images in each video folder to train the model, and the remaining images are not used. Am I right?
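For reference, a small check of what those settings produce. `np.arange` excludes its stop value, so `(1, 29, 1)` gives the indices 1 through 28. Whether `Dataset_CRNN` loads exactly these frames and nothing else is an assumption here. The `skip_frame = 2` variant is a hypothetical alternative, not something from the repo.

```python
import numpy as np

# Values quoted from UCF101_ResNetCRNN.py in the post above.
begin_frame, end_frame, skip_frame = 1, 29, 1
selected_frames = np.arange(begin_frame, end_frame, skip_frame).tolist()

# The stop value (29) is excluded, so the list is 1..28.
print(len(selected_frames), selected_frames[0], selected_frames[-1])

# Hypothetical alternative: a larger skip_frame covers the same span
# of the video with fewer frames.
sparse_frames = np.arange(begin_frame, end_frame, 2).tolist()
print(len(sparse_frames), sparse_frames[0], sparse_frames[-1])
```

To sample frames beyond index 28, `end_frame` would have to be raised, which in turn requires every video folder to contain at least that many images.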