cvlab-kaist / GaussianTalker

Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko, Sangjun Ahn and Seungryong Kim

Bug: inference custom audio #20

Open luosiwu opened 6 months ago

luosiwu commented 6 months ago

When I run inference on a target audio file, the result is missing some frames. Is this a bug?

kyusuncho commented 6 months ago

Hi, could you provide some details of which frames you are missing?

luosiwu commented 6 months ago

iterations = process_until // batch_size
if process_until % batch_size != 0:
    iterations += 1
total_time = 0
# render image
for idx in tqdm(range(iterations), desc="Rendering progress",total = iterations):
    try:
        viewpoint_cams = next(loader)
        output = render_from_batch(viewpoint_cams, gaussians, pipeline, 
                                random_color= False, stage='fine',
                                batch_size=batch_size, visualize_attention=False, only_infer=True)
    except:
        break
    total_time += output["inference_time"]
    image.append(output["rendered_image_tensor"].cpu())
    gt.append(output["gt_tensor"].cpu())

I debugged it, and batch_size looks like the cause: with batch_size=1 the problem does not occur. Of course, a better fix would be to improve the code.
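One way a loop like this could lose frames, sketched here as an assumption rather than a confirmed diagnosis: if rendering the final, smaller batch raises an error, the bare `except: break` hides it and the loop ends early, so up to `process_until % batch_size` frames go missing. The names `iter_batches`, `fake_render`, and `render_all` below are hypothetical stand-ins, not functions from the GaussianTalker repository, and `fake_render` only imitates code that expects full batches.

```python
import math
from typing import Iterator, List, Sequence


def iter_batches(frames: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive batches, keeping the final partial batch."""
    for start in range(0, len(frames), batch_size):
        yield list(frames[start:start + batch_size])


def fake_render(batch: List[int], batch_size: int) -> List[int]:
    """Hypothetical stand-in for render_from_batch.

    It fails when the batch is smaller than batch_size, imitating code
    that assumes every batch is full (an assumption, not a known fact).
    """
    if len(batch) != batch_size:
        raise ValueError(f"expected {batch_size} frames, got {len(batch)}")
    return batch


def render_all(frames: Sequence[int], batch_size: int,
               swallow_errors: bool) -> List[int]:
    loader = iter_batches(frames, batch_size)
    iterations = math.ceil(len(frames) / batch_size)  # same count as the // and % logic
    rendered: List[int] = []
    for _ in range(iterations):
        try:
            batch = next(loader)
            rendered.extend(fake_render(batch, batch_size))
        except StopIteration:
            break  # the loader is exhausted, which is the only expected stop
        except Exception:
            if swallow_errors:
                break  # same effect as the bare `except: break` in the snippet
            raise  # surface the real error instead of silently dropping frames
    return rendered


if __name__ == "__main__":
    frames = list(range(10))
    print(len(render_all(frames, 4, swallow_errors=True)))  # 8: last 2 frames lost
    print(len(render_all(frames, 1, swallow_errors=True)))  # 10: nothing lost
```

Under this assumption, catching only `StopIteration` and letting other exceptions propagate would show what actually fails on the last batch, and comparing the number of rendered frames with `process_until` after the loop would flag any shortfall.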

lokvke commented 5 months ago

May I ask how the inference quality and speed are?