bagustris closed this issue 7 months ago
I guess it would really help if nkululeko printed the exact name of the audio file that caused the error. I'll try to add that.
Can you send a wav file that causes the error?
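To illustrate the idea of reporting the failing file, here is a minimal sketch. It is not nkululeko's actual code; `extract_with_file_report` and the `extract_one` callback are hypothetical names used only for this example. The loop catches per-file errors, prints the path, and continues with the remaining files.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def extract_with_file_report(
    files: Iterable[str],
    extract_one: Callable[[str], Any],
) -> Tuple[Dict[str, Any], List[Tuple[str, str]]]:
    """Run a per-file extractor and report which files fail.

    `extract_one` is a hypothetical stand-in for whatever computes
    the features of a single audio file.
    """
    results: Dict[str, Any] = {}
    failures: List[Tuple[str, str]] = []
    for path in files:
        try:
            results[path] = extract_one(path)
        except Exception as exc:  # broad on purpose: we only want to log and go on
            print(f"error on file {path}: {exc}")
            failures.append((path, str(exc)))
    return results, failures
```

Collecting the failures in a list, in addition to printing them, makes it easy to inspect or exclude the problematic files afterwards.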
I used the ASVP-ESD dataset: https://zenodo.org/records/7132783
Instructions are available in data/asvp-esd.
Please try again with version 0.68.1.
It now reports which files are causing errors. Here is the output with the INI file above.
...
100%|██████████████████████████████████████▉| 8780/8786 [32:28<00:01, 4.17it/s]error on file ././data/asvp-esd//ASVP-ESD-Update/Audio/actor_99/03-02-02-01-03-99-02-02-02-02.wav: mean requires at least one data point
100%|███████████████████████████████████████| 8786/8786 [32:29<00:00, 4.51it/s]
Warning: 3870 Nans in x, replacing with 0
Warning: 96646 infinite in x
Now extracting speech rate parameters...
0%| | 8/8786 [00:00<01:58, 74.39it/s]caught zero division
caught zero division
caught zero division
0%| | 20/8786 [00:00<01:30, 96.50it/s]/home/bagus/github/nkululeko/nkululeko/feat_extract/feinberg_praat.py:514: PraatWarning: The loudest and softest part in your sound differ by only 9.009522902670497 dB.
textgrid = call(
caught zero division
0%|▏ | 30/8786 [00:00<02:10, 67.15it/s]
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/bagus/github/nkululeko/nkululeko/nkululeko.py", line 72, in <module>
    main(
  File "/home/bagus/github/nkululeko/nkululeko/nkululeko.py", line 57, in main
    expr.extract_feats()
  File "/home/bagus/github/nkululeko/nkululeko/experiment.py", line 344, in extract_feats
    self.feats_train = self.feature_extractor.extract()
  File "/home/bagus/github/nkululeko/nkululeko/feature_extractor.py", line 164, in extract
    self.featExtractor.extract()
  File "/home/bagus/github/nkululeko/nkululeko/feat_extract/feats_praat.py", line 34, in extract
    self.df = feinberg_praat.compute_features(self.data_df.index)
  File "/home/bagus/github/nkululeko/nkululeko/feat_extract/feinberg_praat.py", line 453, in compute_features
    df_speechrate = get_speech_rate(file_index)
  File "/home/bagus/github/nkululeko/nkululeko/feat_extract/feinberg_praat.py", line 485, in get_speech_rate
    speechrate_dictionary = speech_rate(sound)
  File "/home/bagus/github/nkululeko/nkululeko/feat_extract/feinberg_praat.py", line 567, in speech_rate
    currenttime = timepeaks[0]
IndexError: list index out of range
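The traceback ends with `timepeaks[0]` being read from an empty list, which can happen when no intensity peaks are found in a file. A minimal sketch of a guard against this (and against the zero divisions seen earlier in the log) could look as follows. This is an illustration under assumptions, not the change made in nkululeko; `summarize_peaks` and its return keys are hypothetical.

```python
import math
from typing import Dict, Sequence


def summarize_peaks(timepeaks: Sequence[float], duration_s: float) -> Dict[str, float]:
    """Summarize detected peak times without failing on empty input.

    Hypothetical helper: returns NaN placeholders when no peaks were
    found or when the duration is not positive.
    """
    if len(timepeaks) == 0:
        # No peaks: indexing timepeaks[0] would raise IndexError.
        return {"n_peaks": 0.0, "first_peak": math.nan, "rate": math.nan}
    # A non-positive duration would cause a ZeroDivisionError or a meaningless rate.
    rate = len(timepeaks) / duration_s if duration_s > 0 else math.nan
    return {"n_peaks": float(len(timepeaks)), "first_peak": timepeaks[0], "rate": rate}
```

Returning NaN instead of raising keeps the extraction loop running; the NaN values can then be handled in one place, as the "Nans in x" warning above suggests is already done.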
Please try again with version 0.68.3.
It now works, with the output below.
...
DEBUG feature_extractor: praat: shape : (2214, 39)
DEBUG experiment: All features: train shape : (8786, 39), test shape:(2214, 39)
DEBUG experiment: scaler: False
DEBUG runmanager: value for runs not found, using default: 1
DEBUG runmanager: run 0
DEBUG model: value for C_val not found, using default: 0.001
DEBUG modelrunner: value for epochs not found, using default: 1
DEBUG modelrunner: run: 0 epoch: 0: result: test: 0.266 UAR
DEBUG modelrunner: plotting confusion matrix to train_test_svm_praat__0_000_cnf
DEBUG reporter: result per class (F1 score): [0.0, 0.0, 0.119, 0.047, 0.627, 0.394, 0.4, 0.223, 0.0, 0.468, 0.421]
DEBUG experiment: Done, used 51.386 seconds
DONE
It seems many files in this dataset have problems.
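One way to check this before feature extraction is to scan the dataset for files that are unreadable or extremely short. The sketch below uses only Python's standard `wave` module and assumes the files are plain PCM WAV. It is not nkululeko's `check_size` option, and `find_problem_wavs` and the `min_seconds` threshold are hypothetical.

```python
import wave
from pathlib import Path
from typing import List, Tuple


def find_problem_wavs(root: str, min_seconds: float = 0.1) -> List[Tuple[str, str]]:
    """List WAV files under `root` that cannot be read or are very short."""
    problems: List[Tuple[str, str]] = []
    for path in sorted(Path(root).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav:
                duration = wav.getnframes() / wav.getframerate()
        except (wave.Error, EOFError, ZeroDivisionError) as exc:
            # Not a valid PCM WAV file, an empty file, or a zero sample rate.
            problems.append((str(path), f"unreadable: {exc}"))
            continue
        if duration < min_seconds:
            problems.append((str(path), f"too short: {duration:.3f} s"))
    return problems
```

The returned list pairs each path with a short reason, so the files can be reviewed or excluded before running the experiment.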
Even though I was already using check_size for file checking, I still got the following error when using praat features. The error did not show when I changed the feature to os. This could be related to how the features inside Praat are calculated. An example is the ASVP-ESD dataset with the following INI file.