vocalpy / vak

A neural network framework for researchers studying acoustic communication
https://vak.readthedocs.io
BSD 3-Clause "New" or "Revised" License
78 stars 16 forks source link

BUG: predict raises KeyError when config file is missing `[PREDICT.transform_kwargs]` with `window_size` option #725

Closed wendtalexander closed 1 year ago

wendtalexander commented 1 year ago

Describe the bug Im getting a KeyError in /vak/transforms/defaults/frame_classification.py in the function get_default_frame_classification_transform()

line 260 get_default_frame_classification_transform
    window_size=transform_kwargs["window_size"],
                ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 'window_size'

As far I can tell is window_size never added to the transform_kwargs in the /vak/predict/frame_classification.py starting-line 149

I added a pseudo code section of what I think is missing!

  if transform_params is None:
      transform_params = {}
  transform_params.update({"spect_standardizer": spect_standardizer})

  #----- adding pseudo code ------#
  transform_params.update(window_size)
  #--------------------------------#

  item_transform = transforms.defaults.get_default_transform(
      model_name, "predict", transform_params
  )

To Reproduce Prediction without omitting transform_kwargs in the toml file.

vak version is 1.0.0a3 Operation System is WSL2 Ubuntu

NickleDave commented 1 year ago

Hi @wendtalexander!
Sorry you're running into this error, and thank you for raising a detailed bug report.

I'm not quite clear on what's causing it, though.

You suggest to replicate the bug by running

Prediction without omitting transform_kwargs in the toml file.

but when I run in a development environment using a test config that includes [PREDICT.transform_params], I do not get a bug. The transform_params table in that config looks like this:

[PREDICT.transform_params]
window_size = 176

Can you please

vak predict tests/data_for_tests/generated/configs/TweetyNet_predict_audio_cbin_annot_notmat.toml

(That last one is what I'm doing to test whether I get the same bug.)

I have a feeling the issue here is that there's something we haven't made clear in the docs, and/or we need to raise a clearer error message

wendtalexander commented 1 year ago

Sorry that I cant test this properly myself, I am kinda new checking large Projects.

Can you @NickleDave do me a favor and check if the prediction runs without errors, if you would delete the PREDICT.transform_params? Or are they mandatory in the new Version of vak?

In the code these transform_params are always optional (as far as I can follow), and getting calculated / assigned in this file /vak/predict/frame_classification.py

In my toml file i did not write any [PREDICT.transform_params].

NickleDave commented 1 year ago

Sorry that I cant test this properly myself, I am kinda new checking large Projects.

No worries, I didn't mean to overwhelm you. Maybe we can help you learn how later 😄

Can you do me a favor and check if the prediction runs without errors, if you would delete the PREDICT.transform_params?Or are they mandatory in the new Version of vak?

Ah ok I think I understand what's going on.

Yes, it is mandatory to include [PREDICT.transform_params] with a window_size option for TweetyNet, and any other frame classification model. You will want to use the same window size you used during training. This replaces the window_size option that uses to live in the [DATALOADER] table.

When you generate predictions for new data from a frame classification model, vak turns that data into batches of consecutive, non-overlapping windows. So you need to tell it what window size to use.

I'm sorry, I know the changes in config file format are a bit confusing. We will end up rewriting the config file format as we move towards a final release of 1.0 (as discussed in #345). The goal is to make it possible to specify different options for different model families. But for right now the config file format will be in flux with the alpha version. Apologies for the bumpy ride!

We did add an example config file to the docs that shows this table (although I just realized the link to it in the tutorial is broken 🤦 ) https://github.com/vocalpy/vak/blob/main/doc/toml/gy6or6_predict.toml

Please let me know if adding that required transform_params lets you run predict as expected.

wendtalexander commented 1 year ago

Yes, it is mandatory to include [PREDICT.transform_params] with a window_size option for TweetyNet

Thats what is whats missing in my toml file.

The goal is to make it possible to specify different options for different model families. But for right now the config file format will be in flux with the alpha version. Apologies for the bumpy ride!

all good! Thank you for the fast and helpful responses!

NickleDave commented 1 year ago

Sure thing, thanks @wendtalexander -- sorry again about the confusion. It is helpful to know what is and isn't clear to people that haven't been starting at the code for :sob: years :smirk_cat:

I'll go ahead and close this but please do feel free to raise issues if you think you run into more bugs. You could also ask questions in our forum here: https://forum.vocalpy.org/