bzamecnik / ml

Machine learning projects, often on audio datasets
MIT License

extract_features.py gives very weird error. #45

Closed JBloodless closed 6 years ago

JBloodless commented 6 years ago

Hi again! I tried to run the whole preprocessing pipeline, but got stuck on feature extraction. I prepared the dataset in FLAC and ran extract_features.py {AUDIO_DIR} {FEATURE_DIR} without any changes to the code, and got this:

Traceback (most recent call last):
  File "C:/dipl0m/ml-master/instrument-classification/extract_features.py", line 68, in <module>
    args.block_size, args.hop_size, args.bin_range, args.bin_division)
  File "C:/dipl0m/ml-master/instrument-classification/extract_features.py", line 36, in extract_pitchgrams
    for i in range(len(dataset.samples)))
  File "C:\Users\jackb\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\shape_base.py", line 421, in dstack
    return _nx.concatenate([atleast_3d(_m) for _m in tup], 2)
  File "C:\Users\jackb\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\shape_base.py", line 421, in <listcomp>
    return _nx.concatenate([atleast_3d(_m) for _m in tup], 2)
  File "C:/dipl0m/ml-master/instrument-classification/extract_features.py", line 36, in <genexpr>
    for i in range(len(dataset.samples)))
  File "C:\Users\jackb\AppData\Roaming\Python\Python36\site-packages\tfr\sklearn.py", line 32, in transform
    bin_division=self.bin_division)
  File "C:\Users\jackb\AppData\Roaming\Python\Python36\site-packages\tfr\reassignment.py", line 298, in pitchgram
    output_frame_size, PitchTransform(bin_range, bin_division), magnitudes=magnitudes)
  File "C:\Users\jackb\AppData\Roaming\Python\Python36\site-packages\tfr\reassignment.py", line 144, in reassigned
    self.signal_frames.sample_rate)
  File "C:\Users\jackb\AppData\Roaming\Python\Python36\site-packages\tfr\reassignment.py", line 50, in transform_freqs
    output_bin_count = (self.bin_range[1] - self.bin_range[0]) * self.bin_division
TypeError: 'int' object is not subscriptable

I checked reassignment.py in tfr, and I suspect the problem is around line 35, def __init__(self, bin_range = (-48, 67), bin_division=1, tuning=Tuning()): print(type(bin_range)) gives me <class 'int'>, even though the default value is a tuple...

P.S. I'm running this on Windows, but I don't think that this is the root of evil.

bzamecnik commented 6 years ago

Yeah, it should be a tuple. It's the range of pitches that should be included in the pitchgram. The question is where it was assigned an integer :)
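As an illustration of why the integer only surfaces deep inside the call stack, here is a minimal sketch of an early check that would report the bad value at construction time instead. The helper names `validate_bin_range` and `output_bin_count` are hypothetical and not part of tfr; only the formula `(bin_range[1] - bin_range[0]) * bin_division` and the default `(-48, 67)` come from the traceback and signature quoted above.

```python
def validate_bin_range(bin_range):
    """Return bin_range as a (low, high) tuple or raise a descriptive error.

    Hypothetical helper, not part of tfr.
    """
    try:
        low, high = bin_range
    except (TypeError, ValueError):
        # An int cannot be unpacked (TypeError); a sequence of the wrong
        # length raises ValueError. Report both the same way.
        raise TypeError(
            "bin_range must be a (low, high) pair, got %r" % (bin_range,)
        ) from None
    if low >= high:
        raise ValueError("bin_range must satisfy low < high, got %r" % (bin_range,))
    return low, high


def output_bin_count(bin_range, bin_division=1):
    """Number of output bins, using the formula shown in the traceback."""
    low, high = validate_bin_range(bin_range)
    return (high - low) * bin_division


print(output_bin_count((-48, 67), 1))  # 115

try:
    output_bin_count(1, 1)
except TypeError as error:
    print("rejected:", error)
```

With a check like this the failure would name `bin_range` and show the offending value, rather than raising `'int' object is not subscriptable` several frames later.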

bzamecnik commented 6 years ago

Oh, when I ran it with the default args it failed the same way. I'd guess it's an error from refactoring in the tfr library or similar. Sorry for not testing that.

bzamecnik$ python extract_features.py data/working/some-chords/ tmp-chord-features
loading dataset from: data/working/some-chords/
dataset shape: (10, 88200)
Traceback (most recent call last):
  File "extract_features.py", line 67, in <module>
    args.block_size, args.hop_size, args.bin_range, args.bin_division)
  File "extract_features.py", line 35, in extract_pitchgrams
    for i in range(len(dataset.samples)))
  File "/Users/bzamecnik/anaconda/lib/python3.4/site-packages/numpy/lib/shape_base.py", line 421, in dstack
    return _nx.concatenate([atleast_3d(_m) for _m in tup], 2)
  File "/Users/bzamecnik/anaconda/lib/python3.4/site-packages/numpy/lib/shape_base.py", line 421, in <listcomp>
    return _nx.concatenate([atleast_3d(_m) for _m in tup], 2)
  File "extract_features.py", line 35, in <genexpr>
    for i in range(len(dataset.samples)))
  File "/Users/bzamecnik/odrive/dropbox/Documents/dev/repos/tfr/tfr/sklearn.py", line 32, in transform
    bin_division=self.bin_division)
  File "/Users/bzamecnik/odrive/dropbox/Documents/dev/repos/tfr/tfr/reassignment.py", line 298, in pitchgram
    output_frame_size, PitchTransform(bin_range, bin_division), magnitudes=magnitudes)
  File "/Users/bzamecnik/odrive/dropbox/Documents/dev/repos/tfr/tfr/reassignment.py", line 143, in reassigned
    self.signal_frames.sample_rate)
  File "/Users/bzamecnik/odrive/dropbox/Documents/dev/repos/tfr/tfr/reassignment.py", line 49, in transform_freqs
    output_bin_count = (self.bin_range[1] - self.bin_range[0]) * self.bin_division

bzamecnik commented 6 years ago

So the params are passed to PitchgramTransformer in the wrong order:

{"bin_division": 1, "bin_range": 1, "frame_size": 4096, "hop_size": 2048, "sample_rate": 44100, "output_frame_size": [-48, 67]}

🤦‍♂️

bzamecnik commented 6 years ago

output_frame_size was added to the middle of the argument list, and a call that passed the arguments positionally was not updated, which led to a mess.

PitchgramTransformer(sample_rate=44100, frame_size=4096, hop_size=2048, bin_range=list, bin_division=1)
->
PitchgramTransformer(sample_rate=44100, frame_size=4096, hop_size=2048, output_frame_size=None, bin_range=list, bin_division=1)
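The shift can be reproduced with a small sketch. The classes below are simplified stand-ins that only mirror the two argument lists shown above; they are not tfr's actual code, and the `(-48, 67)` default for `bin_range` and the `KeywordOnlyTransformer` variant are assumptions added for illustration. Passing the old positional arguments to the new signature yields the same values as in the parameter dump above: the pitch range lands in `output_frame_size` and the bin division lands in `bin_range`.

```python
class OldTransformer:
    """Stand-in for the signature before output_frame_size was added."""

    def __init__(self, sample_rate=44100, frame_size=4096, hop_size=2048,
                 bin_range=(-48, 67), bin_division=1):
        self.sample_rate = sample_rate
        self.frame_size = frame_size
        self.hop_size = hop_size
        self.bin_range = bin_range
        self.bin_division = bin_division


class NewTransformer:
    """Stand-in for the signature after output_frame_size was inserted."""

    def __init__(self, sample_rate=44100, frame_size=4096, hop_size=2048,
                 output_frame_size=None, bin_range=(-48, 67), bin_division=1):
        self.sample_rate = sample_rate
        self.frame_size = frame_size
        self.hop_size = hop_size
        self.output_frame_size = output_frame_size
        self.bin_range = bin_range
        self.bin_division = bin_division


class KeywordOnlyTransformer(NewTransformer):
    """Hypothetical variant: the bare * forces callers to name every argument."""

    def __init__(self, *, sample_rate=44100, frame_size=4096, hop_size=2048,
                 output_frame_size=None, bin_range=(-48, 67), bin_division=1):
        super().__init__(sample_rate=sample_rate, frame_size=frame_size,
                         hop_size=hop_size, output_frame_size=output_frame_size,
                         bin_range=bin_range, bin_division=bin_division)


# A caller written against the old signature, passing everything by position.
legacy_args = (44100, 4096, 2048, [-48, 67], 1)

old = OldTransformer(*legacy_args)
new = NewTransformer(*legacy_args)

print(old.bin_range)          # [-48, 67]
print(new.output_frame_size)  # [-48, 67]  (the pitch range landed here)
print(new.bin_range)          # 1          (the bin division landed here)

# With keyword-only parameters the stale positional call fails immediately.
try:
    KeywordOnlyTransformer(*legacy_args)
except TypeError as error:
    print("rejected:", error)

# Naming the arguments is unaffected by where new parameters are inserted.
safe = KeywordOnlyTransformer(bin_range=[-48, 67], bin_division=1)
print(safe.bin_range)         # [-48, 67]
```

Calling with keyword arguments, or declaring the parameters keyword-only as in the hypothetical variant, keeps existing call sites valid when a parameter is added in the middle of the list.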

Lesson learned:

bzamecnik commented 6 years ago

Fixed. Seems to work now. Thanks for pointing that out!