vocalpy / hybrid-vocal-classifier

a Python machine learning library for animal vocalizations and bioacoustics
http://hybrid-vocal-classifier.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Bug with hvc.predict() for Koumura data format #65

Closed URSUroman closed 6 years ago

URSUroman commented 6 years ago

Hello,

I ran hvc.extract() and hvc.select() on the attached data set in the Koumura format, and both completed without problems. When I then ran hvc.predict() on the same data set that I used for training (I use svm throughout), I got the following output in the terminal:

...
Processing audio file 19 of 21.
Processing audio file 20 of 21.
C:\ProgramData\Anaconda3\envs\hvc\lib\site-packages\hvc\audiofileIO.py:814: UserWarning: Segment 65 in 8.wav with label - not long enough for window function set with current spect_params. spect will be set to nan.
  .format(ind, self.filename, label))
Processing audio file 21 of 21.
C:\ProgramData\Anaconda3\envs\hvc\lib\site-packages\hvc\audiofileIO.py:814: UserWarning: Segment 0 in 9.wav with label - not long enough for window function set with current spect_params. spect will be set to nan.
  .format(ind, self.filename, label))
predicting labels for features in file: features_from_HVC_Koumura_shortcreated 180220_173117
Traceback (most recent call last):
  File "", line 1, in
  File "C:\ProgramData\Anaconda3\envs\hvc\lib\site-packages\hvc\labelpredict.py", line 93, in predict
    features_scaled = scaler.transform(features)
  File "C:\ProgramData\Anaconda3\envs\hvc\lib\site-packages\sklearn\preprocessing\data.py", line 681, in transform
    estimator=self, dtype=FLOAT_DTYPES)
  File "C:\ProgramData\Anaconda3\envs\hvc\lib\site-packages\sklearn\utils\validation.py", line 453, in check_array
    _assert_all_finite(array)
  File "C:\ProgramData\Anaconda3\envs\hvc\lib\site-packages\sklearn\utils\validation.py", line 44, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

I assume the warnings appear because some segments of the song are too short for the window function and are therefore discarded, which seems fine. However, it looks as if NaN values are set somewhere while those segments are being discarded, and these NaNs later reach scaler.transform() and raise the ValueError above, so the prediction step does not work for me.
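To illustrate the suspected failure mode: the traceback shows the scaler rejecting a feature array that contains NaN. A minimal sketch of one possible workaround is to drop rows with non-finite values before scaling. This is not hvc code and not necessarily how the library handles it; `drop_nonfinite_rows` and the toy `features` array are hypothetical names used only for illustration.

```python
import numpy as np


def drop_nonfinite_rows(features):
    """Return the rows of `features` that contain only finite values.

    Hypothetical helper, not part of hvc. Also returns the boolean
    mask of kept rows so predictions can be mapped back to segments.
    """
    features = np.asarray(features, dtype=float)
    # A row is kept only if every feature in it is finite (no NaN, no inf).
    keep = np.isfinite(features).all(axis=1)
    return features[keep], keep


# Toy feature matrix: the middle row stands in for a segment that was
# too short, so its features were set to NaN (an assumption for this demo).
features = np.array([
    [1.0, 2.0],
    [np.nan, np.nan],
    [3.0, 4.0],
])

clean, keep = drop_nonfinite_rows(features)
print(clean.shape)    # shape of the array that is safe to pass to a scaler
print(keep.tolist())  # which original segments survived
```

The returned mask matters because the predicted labels then line up with only the kept segments; any dropped segment would need a placeholder label if the output has to match the original segment list.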

Best regards, Roman

Annotation.zip HVC_Koumura_short.zip Koumura.extract.svm.config.zip Koumura.select.svm2.config.zip Koumura.predict.svm.config.zip

NickleDave commented 6 years ago

Fixed by #66