ashawkey / RAD-NeRF

Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition
MIT License

Fix potential bug of array index overflow #39

Closed fondoger closed 1 year ago

fondoger commented 1 year ago

Phenomenon

When training on an audio WAV file, the run sometimes fails with the following error:

RuntimeError: The expanded size of the tensor (50) must match the existing size (54) at non-singleton dimension 0. Target sizes: [50, 44]. Tensor sizes: [54, 44]


# record the feats efficiently.. (no concat, constant memory)
start = self.feat_buffer_idx * self.context_size
end = start + feats.shape[0]
self.feat_queue[start:end] = feats   # <<<<<<<<<<<<<<<<<<<<<<<<<< ERROR STACK
self.feat_buffer_idx = (self.feat_buffer_idx + 1) % self.feat_buffer_size

Root cause:

When self.terminated is true, the right stride is no longer cut off, so frame_to_text can occasionally return more frames than context_size (50). In the error above it returned 54.

def frame_to_text(self, frame):
    inputs = self.processor(frame, sampling_rate=self.sample_rate, return_tensors="pt", padding=True)
    with torch.no_grad():
        result = self.model(inputs.input_values.to(self.device))
        logits = result.logits # [1, N - 1, 32]

    # cut off stride
    left = max(0, self.stride_left_size)
    right = min(logits.shape[1], logits.shape[1] - self.stride_right_size + 1) # +1 to make sure output is the same length as input.
    # do not cut right if terminated.  
    if self.terminated:
        right = logits.shape[1]   # <<<<<<<<<<<<<<<<<<<<<<<<<<  ROOT CAUSE 
    logits = logits[:, left:right]

    ...
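A possible explanation for why the failure is only occasional, sketched under assumptions: if feat_queue holds feat_buffer_size * context_size rows, an oversized chunk only overruns the buffer when it is written to the last slot, because a slice that runs past the end is clamped. The value feat_buffer_size = 4 and the helper target_rows are hypothetical and not taken from this thread; plain Python ranges model the slice arithmetic.

```python
context_size = 50       # from the error message above
feat_buffer_size = 4    # assumed value, for illustration only
queue_rows = feat_buffer_size * context_size  # assumed number of rows in feat_queue


def target_rows(feat_buffer_idx, n_feats):
    """Rows selected by feat_queue[start:end]; slices clamp at the buffer end."""
    start = feat_buffer_idx * context_size
    end = start + n_feats
    return len(range(queue_rows)[start:end])


# An oversized final chunk of 54 rows, tried at every buffer slot.
for idx in range(feat_buffer_size):
    print(idx, target_rows(idx, 54))
# slots 0, 1, 2 select 54 rows; slot 3 selects only 50
```

Under these assumptions, only the last slot yields a 50-row target for 54 rows of features, which matches the sizes in the error message. Earlier slots would accept the write but spill into the neighbouring slot.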

Fix:

Once the iteration has finished (self.terminated is true), neither self.feat_queue nor self.feat_buffer_idx is used again, so we simply add an if/else to avoid the issue.
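A minimal sketch of how such a guard might look, not the actual patch. The FeatBuffer class is a hypothetical stand-in for the relevant part of the ASR object, feat_buffer_size = 4 is an assumed value, and plain Python lists replace the torch tensor so the sketch runs without dependencies. The explicit length check only mimics the size-mismatch error that tensor slice assignment raises.

```python
class FeatBuffer:
    """Hypothetical stand-in for the feature ring buffer (lists instead of tensors)."""

    def __init__(self, context_size=50, feat_buffer_size=4):
        self.context_size = context_size
        self.feat_buffer_size = feat_buffer_size  # assumed value
        self.feat_queue = [None] * (context_size * feat_buffer_size)
        self.feat_buffer_idx = 0
        self.terminated = False

    def record(self, feats):
        """Store one chunk of feature rows; return True if it was written."""
        if self.terminated:
            # The buffer is not used again after termination, so skip the write.
            return False
        start = self.feat_buffer_idx * self.context_size
        end = start + len(feats)
        if end > len(self.feat_queue):
            # Mimics the error a tensor slice assignment would raise here.
            raise RuntimeError("chunk does not fit in feat_queue")
        self.feat_queue[start:end] = feats
        self.feat_buffer_idx = (self.feat_buffer_idx + 1) % self.feat_buffer_size
        return True


buf = FeatBuffer()
for _ in range(3):
    buf.record([[0.0] * 44] * 50)      # three normal 50-row chunks
buf.terminated = True
wrote = buf.record([[0.0] * 44] * 54)  # oversized final chunk is skipped
print(wrote, buf.feat_buffer_idx)
```

With the guard in place, the oversized final chunk never reaches the slice assignment, and the buffer index stays where the last normal chunk left it.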

ashawkey commented 1 year ago

Thanks for your contribution!

2453995079 commented 5 months ago

  File "/data/character/digital_people/RAD-NeRF/nerf/asr.py", line 420, in <module>
    asr.run()
  File "/data/character/digital_people/RAD-NeRF/nerf/asr.py", line 362, in run
    self.run_step()
  File "/data/character/digital_people/RAD-NeRF/nerf/asr.py", line 223, in run_step
    self.feat_queue[start:end] = feats
RuntimeError: The expanded size of the tensor (32) must match the existing size (44) at non-singleton dimension 1. Target sizes: [30, 32]. Tensor sizes: [30, 44]

I have the same problem. The code has been updated, but it still reports this error.