bill9800 / speech_separation

Include some core functions and model to handle speech separation
MIT License
153 stars 61 forks

fixed indexing issue, in model_AV_new.py #17

Open vitrioil opened 4 years ago

vitrioil commented 4 years ago

When slicing the video input, we need to slice the last dimension, i.e. people_num. During training, batch_size is added as the first dimension, so the slice [:, :, :, index] no longer targets the last axis. This can be fixed by using an ellipsis, so that the index always applies to the last axis regardless of how many dimensions precede it. Without the fix the model trains incorrectly, because the slice selects from the embedding dimension rather than the input face dimension.
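To illustrate the difference, here is a minimal NumPy sketch. The shapes and variable names (`batch_size`, `frames`, `emb_dim`, `people_num`, `video`) are assumptions chosen for illustration and are not taken from model_AV_new.py; the same indexing behaviour should apply to tensors sliced inside the model.

```python
import numpy as np

# Hypothetical shapes, for illustration only (not read from the repository).
batch_size, frames, emb_dim, people_num = 4, 75, 1792, 2

# Batched video input: (batch, frames, 1, embedding, people_num).
video = np.random.rand(batch_size, frames, 1, emb_dim, people_num)

index = 1  # which person's face embedding to select

# Positional slice written for an unbatched 4-D input.
# On the 5-D batched array it indexes axis 3, the embedding axis.
wrong = video[:, :, :, index]

# Ellipsis slice: always indexes the last axis (people_num).
right = video[..., index]

print(wrong.shape)  # (4, 75, 1, 2)
print(right.shape)  # (4, 75, 1, 1792)
```

The positional slice keeps both people and drops all but one embedding component, while the ellipsis slice keeps the full embedding for a single person, which is the behaviour the description above asks for.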