theSha1do1w commented 5 years ago

def load_frames(self, file_dir):
    frames = sorted([os.path.join(file_dir, img) for img in os.listdir(file_dir)])
    frame_count = len(frames)
    buffer = np.empty((frame_count, self.resize_height, self.resize_width, 3), np.dtype('float32'))
    for i, frame_name in enumerate(frames):
        frame = np.array(cv2.imread(frame_name)).astype(np.float64)
        frame -= np.array([[[90.0, 98.0, 102.0]]])
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        buffer[i] = frame

    # convert from [T, H, W, C] format to [C, T, H, W] (what PyTorch uses)
    # T = Time, H = Height, W = Width, C = Channels
    buffer = buffer.transpose((3, 0, 1, 2))

    return buffer

In load_frames, why do you subtract [90, 98, 102] from each frame?

jfzhang95 commented 5 years ago

It is the per-channel BGR mean of the dataset.
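For anyone wondering how such a mean could be obtained and applied, here is a minimal sketch. It assumes the frames are already stacked into a [T, H, W, 3] array in BGR order (the order cv2.imread returns); the helper name bgr_mean and the tiny synthetic array are hypothetical, made up for illustration and not taken from the repository.

```python
import numpy as np

def bgr_mean(frames):
    """Average each channel over every frame and pixel.

    frames: array of shape [T, H, W, 3], channels in BGR order.
    Returns an array of 3 values: (mean_B, mean_G, mean_R).
    """
    return frames.reshape(-1, 3).mean(axis=0)

# Hypothetical stand-in for a dataset: two 2x2 frames with constant channels.
frames = np.zeros((2, 2, 2, 3), dtype=np.float32)
frames[..., 0] = 90.0   # blue
frames[..., 1] = 98.0   # green
frames[..., 2] = 102.0  # red

mean = bgr_mean(frames)

# Broadcasting subtracts the matching channel mean from every pixel,
# which centers each channel around zero.
centered = frames - mean.reshape(1, 1, 1, 3)
```

The subtraction has to happen while the channels are still in BGR order, since the three values line up with the last axis position by position.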

theSha1do1w commented 5 years ago

thx

jfzhang95 / pytorch-video-recognition

the problem in the load_frames #1
