vidpanos / vidpanos.github.io


Inference Code Release? #2

Open Jaykumaran opened 1 week ago

Jaykumaran commented 1 week ago

Hi team,

Great paper. I'm looking forward to the code release, both for a personal project and for a blog post I'm writing on LearnOpenCV.

Jaykumaran commented 1 week ago

Could you share a tentative date for the code release?

Jaykumaran commented 1 week ago

I'm trying to run inference using the files from the shared Drive link:

import os

import cv2
import numpy as np
import torch
import torch.nn.functional as F


def read_video_from_frames(path: str, num_frames: int) -> torch.Tensor:
    """Load frame_0000.jpg, frame_0001.jpg, ... from `path` as a float tensor of shape (1, C, T, H, W)."""
    frames = []
    for i in range(num_frames):
        full_path = os.path.join(path, f'frame_{i:04d}.jpg')
        im = cv2.imread(full_path)  # returns None if the file is missing or unreadable
        if im is None:
            raise FileNotFoundError(f"Frame {full_path} not found.")

        # cv2.imread returns a 3-channel BGR image by default, so convert it to RGB
        im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)

        frames.append(im)

    # Stack frames into a numpy array with shape (T, H, W, C)
    video = np.stack(frames, axis=0)

    # Transpose to (C, T, H, W) and add a batch dimension -> (1, C, T, H, W)
    video_tensor = torch.from_numpy(np.transpose(video, (3, 0, 1, 2))[np.newaxis, ...]).float()
    # video_tensor = video_tensor.to('cuda')

    return video_tensor


from hpvaegan_code.evaluate_sifid_our_samples import (
    calculate_activation_statistics, calculate_frechet_distance
)
from hpvaegan_code.C3D_model import C3D

dims = 256
block_idx = C3D.BLOCK_INDEX_BY_DIM[dims]
model = C3D(block_idx)
model.load_state_dict(torch.load(os.path.join(os.getcwd(), "c3d.pickle"), weights_only=True))
# model = model.cuda().eval()
model = model.eval()

frames_path = "data/data/real/VID_20160326_121131_02/original"
# Count only the files that match the frame_XXXX.jpg pattern read above
num_frames = len([f for f in os.listdir(frames_path) if f.startswith("frame_") and f.endswith(".jpg")])
print("Total frames in dir", num_frames)
video = read_video_from_frames(frames_path, num_frames)
print("Video rendered from frames", video.shape)  # B, C, T, H, W
# Resample the clip to 16 frames of 112x112 pixels
video = F.interpolate(video, size=(16, 112, 112), mode='trilinear', align_corners=False)
with torch.no_grad():  # inference only, no gradients needed
    pred = model(video)[-1]

# Move the channel axis last: (B, C, T, H, W) -> (B, T, H, W, C)
pred_arr = pred.cpu().numpy().transpose(0, 2, 3, 4, 1)

pred_arr = pred_arr / 1000

How to proceed further?
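
The two helpers imported above suggest that the next step is a Fréchet distance between activation statistics of real and generated videos. Below is a minimal NumPy sketch of that computation, not the `hpvaegan_code` implementation: the names `activation_statistics` and `frechet_distance` are hypothetical, and it assumes the network activations have already been flattened or pooled into an array of shape (N, D).

```python
import numpy as np


def activation_statistics(features: np.ndarray):
    """Return the mean (D,) and covariance (D, D) of features shaped (N, D)."""
    mu = features.mean(axis=0)
    sigma = np.cov(features, rowvar=False)
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2) -> float:
    """Squared Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # tr(sqrt(sigma1 @ sigma2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative for
    # positive semi-definite inputs (up to numerical noise, hence the clip).
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 8))            # stand-in for real-video features
    fake = rng.normal(loc=0.5, size=(500, 8))   # stand-in for generated-video features
    d = frechet_distance(*activation_statistics(real), *activation_statistics(fake))
    print(f"Frechet distance: {d:.3f}")
```

With identical statistics the distance is zero, and it grows as the means or covariances of the two feature sets drift apart. Whether the repository's own helpers expect the same input shapes is an open question until the official code is released.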

Jaykumaran commented 1 week ago

Hey team,

Are there any updates on the release date or the inference code?