OSU-Bee-Lab / buzzdetect

a machine learning tool to detect and classify bee buzzes in audio

analyze_mp3_in_place gives error when one chunk covers entire file #6

Closed by LukeHearon 12 months ago

LukeHearon commented 1 year ago

Traceback:

Traceback (most recent call last):
  File "/home/luke/r projects/BuzzDetect/environment/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3526, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-712912f8d1e8>", line 1, in <module>
    analyze_mp3_batch(modelname = "fullwav", directory_in = "./audio_in", directory_out = "./output", frameLength = 1500, frameHop = 750, chunklength_hr = 0.25)
  File "<ipython-input-2-ad1f2c0febdd>", line 103, in analyze_mp3_batch
    analysis_df = analyze_mp3_in_place(model = model, classes=classes, mp3_in = file, frameLength = frameLength, frameHop = frameHop, chunklength_hr = chunklength_hr)
  File "<ipython-input-2-ad1f2c0febdd>", line 63, in analyze_mp3_in_place
    full_analysis = pd.concat(analysis_list)
  File "/home/luke/r projects/BuzzDetect/environment/lib/python3.9/site-packages/pandas/core/reshape/concat.py", line 380, in concat
    op = _Concatenator(
  File "/home/luke/r projects/BuzzDetect/environment/lib/python3.9/site-packages/pandas/core/reshape/concat.py", line 443, in __init__
    objs, keys = self._clean_keys_and_objs(objs, keys)
  File "/home/luke/r projects/BuzzDetect/environment/lib/python3.9/site-packages/pandas/core/reshape/concat.py", line 505, in _clean_keys_and_objs
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

My guess as to why this happens (untested at the moment): when the chunk length exceeds the file duration, the chunking logic in analyze_mp3_in_place apparently produces no chunks at all, so analysis_list ends up empty and pd.concat raises ValueError: No objects to concatenate. (pd.concat handles a single-element list fine, so the list must be empty rather than length 1.) I would rather the function just return the one DataFrame directly in the single-chunk case, but I'm having trouble implementing that at the moment (what's the Python equivalent of R's unlist()?).
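
For reference, a minimal sketch of a guard that would avoid the crash. The helper name concat_analyses is hypothetical (the real code in analyze_mp3_in_place may be structured differently), but the pandas behavior it works around is real:

```python
import pandas as pd

def concat_analyses(analysis_list):
    # pd.concat raises ValueError("No objects to concatenate") only when
    # given an empty sequence, so handle the zero-chunk case explicitly.
    if not analysis_list:
        return pd.DataFrame()  # or raise a clearer error for the caller
    if len(analysis_list) == 1:
        # Nothing to flatten: the closest Python analogue of R's unlist()
        # for a one-element list is simply taking the single element.
        return analysis_list[0]
    return pd.concat(analysis_list)
```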

LukeHearon commented 12 months ago

Resolved by writing results per-chunk (a change I made for other reasons, but this is a nice side effect!)
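
For anyone hitting the same error, a rough sketch of what writing results per-chunk can look like. All names here (analyze_and_write_chunks, analyze_chunk, chunks, path_out) are hypothetical stand-ins, not the actual buzzdetect API:

```python
import os

def analyze_and_write_chunks(chunks, analyze_chunk, path_out):
    # Append each chunk's results to disk as they are produced instead of
    # accumulating DataFrames for a final pd.concat. A file that yields
    # zero or one chunks never reaches the "No objects to concatenate"
    # path, and partial results survive a crash mid-file.
    for chunk in chunks:
        df = analyze_chunk(chunk)  # assumed to return a pandas DataFrame
        df.to_csv(
            path_out,
            mode="a",                             # append per chunk
            header=not os.path.exists(path_out),  # write header only once
            index=False,
        )
```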