pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

https://pytorch.org/audio

BSD 2-Clause "Simplified" License

Migrate Kaldi tests #597

Open vincentqb opened 4 years ago

vincentqb commented 4 years ago

We used to test against saved Kaldi output here, but we now have the infrastructure to run Kaldi in our test environment here.

We would like to migrate the tests from "saved kaldi" to "live kaldi".

[ ] test_spectrogram
[ ] test_fbank
[ ] test_mfcc
[ ] test_mfcc_empty
[ ] test_resample_waveform
[ ] test_resample_waveform_upsample_size
[ ] test_resample_waveform_downsample_size
[ ] test_resample_waveform_identity_size
[ ] test_resample_waveform_downsample_accuracy
[ ] test_resample_waveform_upsample_accuracy
[ ] test_resample_waveform_multi_channel

vincentqb commented 4 years ago

We should also have CI catch errors like #613 on GPU.

bhargavkathivarapu commented 4 years ago

@vincentqb, for the Kaldi compliance tests below, do we need to write a compatibility test case for each saved-file configuration (the function arguments encoded in the file names) in test/kaldi, or is one test configuration enough for each?

test_spectrogram
test_fbank (one configuration is already covered in the compatibility tests)
test_mfcc
test_resample_waveform

The remaining tests do not depend on the saved Kaldi files in test/kaldi (they reference an external .wav file, but they do not compare its contents), so we can move them directly to the compatibility tests without many changes.

vincentqb commented 4 years ago

The tests should cover the same cases. Is that what you meant?

bhargavkathivarapu commented 4 years ago

> The tests should cover the same cases. Is that what you meant?

@vincentqb yes. For example, test_fbank has 97 saved Kaldi files, and each file name encodes one variation of the Kaldi fbank function arguments.

Two example file names out of the 97:
- fbank-0.0939-4.5062-1.0625-0.6875-1841-true-479-5-0.84-true-true-true-true-true-true-true-false-1832-1824-1.0000-hanning.ark

- fbank-0.1660-1.7875-1.1250-0.5000-4999-true-1740-6-0.29-true-false-true-false-false-true-false-true-4587-2289-1.0000-povey.ark
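As a rough illustration of how such a file name could be turned back into a configuration, here is a minimal sketch. It assumes the fields are separated by hyphens and that no value is negative (true for the two names above, not verified for all 97). `parse_config` is a hypothetical helper, not part of torchaudio, and it deliberately does not name the fields, since their meaning is not stated here.

```python
def _convert(field):
    """Convert one file-name field to bool, int, float, or leave it as str."""
    if field in ("true", "false"):
        return field == "true"
    for cast in (int, float):
        try:
            return cast(field)
        except ValueError:
            pass
    return field


def parse_config(filename):
    """Split a saved-output file name into (function name, argument values).

    Assumption: fields are hyphen-separated and none of them is negative.
    """
    if filename.endswith(".ark"):
        filename = filename[: -len(".ark")]
    name, *fields = filename.split("-")
    return name, [_convert(f) for f in fields]


name, args = parse_config(
    "fbank-0.0939-4.5062-1.0625-0.6875-1841-true-479-5-0.84-"
    "true-true-true-true-true-true-true-false-1832-1824-1.0000-hanning.ark"
)
print(name, len(args), args[-1])
```

Running this once over the existing file names would give the list of configurations to store in a text file.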

So we need to keep all 97 configurations (function-argument variations) in a text file. In the test, we read this text file and, for each configuration, generate the Kaldi output on the fly and compare it with the torch output. Is that right? (Or, instead of all 97, can we test just one configuration?)

vincentqb commented 4 years ago

> So we need to keep all 97 configurations (function-argument variations) in a text file. In the test, we read this text file and, for each configuration, generate the Kaldi output on the fly and compare it with the torch output.

Yes, that is what I mean. We need to maintain the same coverage.
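
For reference, the loop described above could look roughly like the sketch below. It is only an outline under stated assumptions: `run_kaldi` and `run_torch` are hypothetical callables standing in for "run the Kaldi binary" and "call the torchaudio function", outputs are treated as flat lists of floats, and the tolerances are placeholders rather than the values the real tests use.

```python
import math


def find_mismatches(configs, run_kaldi, run_torch, rel_tol=1e-5, abs_tol=1e-8):
    """Return every configuration for which the two outputs disagree.

    run_kaldi and run_torch are hypothetical callables that take one
    configuration and return a flat list of floats.
    """
    mismatches = []
    for config in configs:
        expected = run_kaldi(config)  # generated on the fly, not loaded from disk
        actual = run_torch(config)
        same = len(expected) == len(actual) and all(
            math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
            for e, a in zip(expected, actual)
        )
        if not same:
            mismatches.append(config)
    return mismatches


# Stand-in functions, only to show how the loop is driven.
configs = [("hanning", 1.0), ("povey", 0.5)]
reference = lambda cfg: [cfg[1], cfg[1] * 2]
candidate = lambda cfg: [cfg[1], cfg[1] * 2 + (0.1 if cfg[0] == "povey" else 0.0)]
print(find_mismatches(configs, reference, candidate))
```

Collecting all mismatches instead of stopping at the first one keeps the report useful when many configurations are checked in a single run.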