pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

Fix vad to return zero output for zero input #3685

Status: Closed (wasd96040501 closed this 8 months ago)

wasd96040501 commented 8 months ago

Fixes #3668. Return an empty result when `has_triggered` is false. Added zeroed WAV audio to the unit test.
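The control flow described above can be sketched with a toy energy-based trimmer. This is a simplified stand-in, not torchaudio's actual `vad` algorithm: the function name `toy_trim_leading_silence`, the frame length, and the energy threshold are all assumptions made for illustration. It only shows the idea of returning an empty tensor when the detector never triggers.

```python
import torch


def toy_trim_leading_silence(
    waveform: torch.Tensor,
    frame_len: int = 400,      # hypothetical frame size, in samples
    threshold: float = 1e-4,   # hypothetical mean-energy threshold
) -> torch.Tensor:
    """Toy front-trim: drop leading frames until one exceeds the threshold.

    NOT torchaudio's algorithm; it only mirrors the `has_triggered` idea.
    """
    has_triggered = False
    start = 0
    num_frames = waveform.shape[-1] // frame_len
    for i in range(num_frames):
        frame = waveform[..., i * frame_len:(i + 1) * frame_len]
        # Mean energy of this frame across all channels.
        if frame.pow(2).mean().item() > threshold:
            has_triggered = True
            start = i * frame_len
            break

    if not has_triggered:
        # Nothing was detected: return an empty result that keeps the
        # leading (channel) dimensions, instead of an arbitrary tail.
        return waveform[..., :0]
    return waveform[..., start:]


silence = torch.zeros(16000)
print(toy_trim_leading_silence(silence).shape)  # empty along the time axis
```

Slicing with `[..., :0]` keeps the dtype, device, and channel dimensions of the input, so downstream code can still check `shape[-1] == 0` to detect the "never triggered" case.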

pytorch-bot[bot] commented 8 months ago

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/audio/3685

Note: Links to docs will display an error until the docs builds have been completed.

:x: 13 New Failures, 2 Unrelated Failures

As of commit 451ee170d0ea20573de574472f497eb5ee19d8af with merge base 36f5010b9b61092cf03d2e5f39708b1ec2f0caeb:

NEW FAILURES - The following jobs have failed:

* [Build Linux Conda / pytorch/audio / conda-py3_8-cpu](https://hud.pytorch.org/pr/pytorch/audio/3685#18487811893) ([gh](https://github.com/pytorch/audio/actions/runs/6800069780/job/18487811893))
* [Build Linux Conda / pytorch/audio / conda-py3_8-cuda11_8](https://hud.pytorch.org/pr/pytorch/audio/3685#18487812287) ([gh](https://github.com/pytorch/audio/actions/runs/6800069780/job/18487812287))
* [Build Linux Conda / pytorch/audio / conda-py3_8-cuda12_1](https://hud.pytorch.org/pr/pytorch/audio/3685#18487812831) ([gh](https://github.com/pytorch/audio/actions/runs/6800069780/job/18487812831))
* [Build M1 Conda / pytorch/audio / conda-py3_8-cpu](https://hud.pytorch.org/pr/pytorch/audio/3685#18487812288) ([gh](https://github.com/pytorch/audio/actions/runs/6800069759/job/18487812288))
* [Build MacOS Conda / pytorch/audio / conda-py3_8-cpu](https://hud.pytorch.org/pr/pytorch/audio/3685#18487815824) ([gh](https://github.com/pytorch/audio/actions/runs/6800069763/job/18487815824))
* [Build Windows Conda / pytorch/audio / conda-py3_8-cpu](https://hud.pytorch.org/pr/pytorch/audio/3685#18487812290) ([gh](https://github.com/pytorch/audio/actions/runs/6800069782/job/18487812290))
* [Build Windows Conda / pytorch/audio / conda-py3_8-cuda12_1](https://hud.pytorch.org/pr/pytorch/audio/3685#18487813356) ([gh](https://github.com/pytorch/audio/actions/runs/6800069782/job/18487813356))
* [Unit-tests on Linux CPU / tests (3.10) / linux-job](https://hud.pytorch.org/pr/pytorch/audio/3685#18529504747) ([gh](https://github.com/pytorch/audio/actions/runs/6800069756/job/18529504747))
* [Unit-tests on Linux CPU / tests (3.8) / linux-job](https://hud.pytorch.org/pr/pytorch/audio/3685#18529504078) ([gh](https://github.com/pytorch/audio/actions/runs/6800069756/job/18529504078))
* [Unit-tests on Linux CPU / tests (3.9) / linux-job](https://hud.pytorch.org/pr/pytorch/audio/3685#18529504426) ([gh](https://github.com/pytorch/audio/actions/runs/6800069756/job/18529504426))
* [Unit-tests on Linux GPU / tests (3.10, 11.8) / linux-job](https://hud.pytorch.org/pr/pytorch/audio/3685#18529451810) ([gh](https://github.com/pytorch/audio/actions/runs/6800069800/job/18529451810))
* [Unit-tests on Linux GPU / tests (3.8, 11.8) / linux-job](https://hud.pytorch.org/pr/pytorch/audio/3685#18529452140) ([gh](https://github.com/pytorch/audio/actions/runs/6800069800/job/18529452140))
* [Unit-tests on Linux GPU / tests (3.9, 11.8) / linux-job](https://hud.pytorch.org/pr/pytorch/audio/3685#18529452492) ([gh](https://github.com/pytorch/audio/actions/runs/6800069800/job/18529452492))

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

* [Build Windows Conda / pytorch/audio / conda-py3_8-cuda11_8](https://hud.pytorch.org/pr/pytorch/audio/3685#18487812850) ([gh](https://github.com/pytorch/audio/actions/runs/6800069782/job/18487812850))
* [Unit-tests on Macos CPU / tests / macos-job](https://hud.pytorch.org/pr/pytorch/audio/3685#18529508596) ([gh](https://github.com/pytorch/audio/actions/runs/6800069749/job/18529508596))

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot commented 8 months ago

Hi @wasd96040501!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

wasd96040501 commented 8 months ago

Hi @mthrok, I have implemented the changes. Please take a look 😊

wasd96040501 commented 8 months ago

Hi, @mthrok It seems build conda failed. What can I do to fix this?

mthrok commented 8 months ago

> Hi, @mthrok It seems build conda failed. What can I do to fix this?

@wasd96040501 This PR has nothing to do with build, and build jobs often fail for external reasons, so we don't need to worry about it.

FYI: I edited the test you added a bit.

trangham283 commented 4 months ago

Hi, I think the current handling of `not has_triggered` is wrong: it returns an empty tensor even when every sample is non-zero. Here is a reproducible example, similar to the one in https://github.com/pytorch/audio/issues/3668:

import torch
import torchaudio.functional as F
x = 10*torch.ones([16000])
y = F.vad(x, 16000)
print(f"input_size={x.size()}, ysize={y.size()}")

output: input_size=torch.Size([16000]), ysize=torch.Size([0])

I realize this is not the best example, as there is no speech in that array of all ones. Here is a sample that didn't trigger the flag even though there is speech in it, which led me to discover this.
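Until the behavior is settled, a caller could guard against this case on their side. The sketch below is one possible workaround, not part of torchaudio: the helper name `vad_with_fallback` and its `(output, fell_back)` return convention are assumptions made for illustration. It takes the VAD function as a parameter and falls back to the untrimmed input when the result is empty even though the input contains non-zero samples.

```python
from typing import Callable, Tuple

import torch


def vad_with_fallback(
    waveform: torch.Tensor,
    sample_rate: int,
    vad_fn: Callable[[torch.Tensor, int], torch.Tensor],
) -> Tuple[torch.Tensor, bool]:
    """Hypothetical wrapper: keep the input if VAD drops everything.

    Returns (output, fell_back). `fell_back` is True only when `vad_fn`
    returned an empty tensor for an input that has non-zero samples.
    """
    trimmed = vad_fn(waveform, sample_rate)
    output_is_empty = trimmed.shape[-1] == 0
    input_has_signal = torch.count_nonzero(waveform).item() > 0

    if output_is_empty and input_has_signal:
        # The detector never triggered on non-silent audio, so return
        # the original waveform rather than losing it.
        return waveform, True
    return trimmed, False


# Stub standing in for a detector that never triggers (assumption for the demo).
def never_triggers(w: torch.Tensor, sr: int) -> torch.Tensor:
    return w[..., :0]


x = 10 * torch.ones([16000])
y, fell_back = vad_with_fallback(x, 16000, never_triggers)
print(f"input_size={x.size()}, ysize={y.size()}, fell_back={fell_back}")
```

With torchaudio installed, `torchaudio.functional.vad` can be passed as `vad_fn`. A truly silent (all-zero) input still yields an empty output, which matches the intent of the original fix.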