TaeSoo-Kim / TCNActionRecognition

Skeleton based action recognition models with TCN variants for learning interpretable representation.
123 stars, 33 forks

Don't understand the process of MEAN SUBTRACTION #5

Closed FesianXu closed 6 years ago

FesianXu commented 6 years ago

Hi Kim, I don't really understand the MEAN SUBTRACTION step in train.py, especially this code:

## THIS IS MEAN SUBTRACTION
x = value.reshape((max_len,feat_dim))
nonzeros = np.where(np.array([np.sum(x[i])>0 for i in range(0,x.shape[0])])==False)[0]
# I don't understand what nonzeros stands for.
if len(nonzeros) == 0:
  last_time = 0
else:
  last_time = nonzeros[0]
x.setflags(write=1)
x[:last_time] = x[:last_time] - train_x_mean

Would you please describe how it works? Thanks.

FesianXu commented 6 years ago

Also, I don't understand what max_len stands for. Is it the temporal length of the video clip?

TaeSoo-Kim commented 6 years ago

Hi Fesian,

The mean subtraction and max_len are actually related. max_len is the length of the longest sample in the dataset. The samples have variable length, so every sample is padded to the same length, max_len (I think I set it to 300).

Then, for example, if you have a sample of length 200, time steps 0:200 are filled with data and 200:300 are zero-padded. The mean subtraction only subtracts the mean skeleton from 0:200 and leaves the zero-padded part as is. Does this make sense?
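
As a rough illustration of that idea, here is a minimal sketch with toy sizes. It is not the code from train.py: the function name `subtract_mean_from_valid_frames`, the variable `valid_len`, and the mean value are all made up for the example, and padding is detected with `np.any` rather than with the row-sum test from the quoted snippet.

```python
import numpy as np

# Toy sizes for illustration only (the real max_len is reportedly around 300).
max_len, feat_dim = 6, 4
valid_len = 4  # hypothetical length of this sample before padding

rng = np.random.default_rng(0)
sample = np.zeros((max_len, feat_dim))
# Frames 0:valid_len hold data; frames valid_len:max_len stay zero-padded.
sample[:valid_len] = rng.uniform(1.0, 2.0, size=(valid_len, feat_dim))

# Stand-in for the mean skeleton computed over the training set.
train_x_mean = np.full(feat_dim, 1.5)


def subtract_mean_from_valid_frames(x, mean):
    """Subtract `mean` from the leading data frames, leaving padding at zero."""
    x = x.copy()
    # A frame counts as padding here when every feature in it is exactly zero.
    padded_idx = np.where(~np.any(x != 0, axis=1))[0]
    # Assumption of this sketch: with no padded frame, the whole sequence is data.
    # (The quoted snippet uses 0 in that case, which leaves such a sample unchanged.)
    last_time = padded_idx[0] if len(padded_idx) > 0 else x.shape[0]
    x[:last_time] -= mean
    return x, last_time


centered, last_time = subtract_mean_from_valid_frames(sample, train_x_mean)
print(last_time)            # index of the first padded frame
print(centered[last_time:]) # padded frames are still all zeros
```

Here `last_time` plays the same role as in the quoted snippet: it is the index of the first padded frame, so `x[:last_time]` selects only the frames that carry data.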

Thanks, TK

FesianXu commented 6 years ago

Thanks for your response. Also, regarding this line of the code:

nonzeros = np.where(np.array([np.sum(x[i])>0 for i in range(0,x.shape[0])])==False)[0]

So I guess nonzeros holds the indices of the zero-padded frames? If that is true, then I think the comparison should be '!=' instead of '>'. Is this a bug?
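
To make the question concrete, here is a small sketch on made-up data (not taken from the repository) comparing three ways of finding the padded frames: the `> 0` test from the line above, the `!=` variant, and a per-element check with `np.any`. The array `x` and the names `padding_gt`, `padding_ne`, and `padding_any` are hypothetical. Whether the difference matters in practice depends on whether the real skeleton features can sum to zero or less, which this thread does not establish.

```python
import numpy as np

# Toy sequence: three real frames followed by two zero-padded frames.
x = np.array([
    [ 0.5,  0.2],   # real frame, sum > 0
    [-0.4, -0.1],   # real frame, sum < 0
    [ 0.3, -0.3],   # real frame, sum == 0
    [ 0.0,  0.0],   # padding
    [ 0.0,  0.0],   # padding
])

row_sums = x.sum(axis=1)

# Test from the quoted line: a frame is treated as padding unless its sum is > 0.
padding_gt = np.where(~(row_sums > 0))[0]
# Suggested variant: a frame is treated as padding when its sum equals 0.
padding_ne = np.where(~(row_sums != 0))[0]
# Per-element check: a frame is padding only if every value in it is 0.
padding_any = np.where(~np.any(x != 0, axis=1))[0]

print(padding_gt.tolist())   # [1, 2, 3, 4]
print(padding_ne.tolist())   # [2, 3, 4]
print(padding_any.tolist())  # [3, 4]
```

On this toy input, `> 0` flags the frame with a negative sum as padding, `!=` still flags the frame whose values cancel out, and only the per-element check recovers exactly the two padded frames. Since `last_time` is taken from the first flagged index, the first two variants would stop the mean subtraction early on data like this.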