Closed: henricombrink closed this issue 2 weeks ago
Hi @henricombrink I'm sorry I missed the notification for this last week.
Thank you for your detailed report.
I am able to reproduce this locally.
At first glance, it looks to me like I introduced a bug in version 1.0.2, but I haven't quite got to the bottom of it yet.
For now, can you please use version 1.0.1? You can do that by running `pip install vak==1.0.1` or `conda install vak==1.0.1 -c conda-forge` in your environment.
I tested that the tutorial works with 1.0.1, at least through the train step, and from other users working through it I am fairly sure the rest works as well.
The only changes in version 1.0.2 are related to features I am adding to work with a benchmark dataset, so using 1.0.1 instead of 1.0.2 will not impact how you work with your own data.
I have been working on related changes to vak and vocalpy, but I will figure out the cause of this bug today or tomorrow and put fixing it at the top of the to-do list.
Thank you for catching it, I am embarrassed I put out a version where the tutorial wasn't working :flushed: -- seems like I need to set up tests to catch that :thinking:
Thank you for the reply @NickleDave , much appreciated.
I uninstalled vak==1.0.2 and installed vak==1.0.1. The training scripts seem to be running now. If I encounter further problems I will report back here, but for now the version change seems to have solved it.
Thank you for the help.
Great, glad to hear it. I will link back to the issue describing the bug here and keep you updated.
Turns out gmail was sending my notifications for this repo to spam for some reason 😠 I will add it to the safelist
And @henricombrink please just let me know what else I can do to help.
Our software is mainly used by neuro labs, but the goal is for it to be more broadly useful.
Looks like you're doing PAM work? (Did some Google stalking, hope that's ok.) I added you to the forum as well--please feel free to introduce yourself there if you have a chance.
Decided not to make a separate issue for this; instead I just reworded the title to remind me what the source of the error is.
We throw the KeyError here: https://github.com/vocalpy/vak/blob/7f8754c4b858a687da436348d9cc6bdcc81d78cc/src/vak/train/frame_classification.py#L424
Looks like this is the offending commit that introduced this bug: f40a3d420f5598dba61779b585b46af09d854184
What's happening here is we are setting up to call `get_trainer`; specifically, we need to determine whether there are multiple target types, and if so, we need to more precisely specify the accuracy we are going to monitor for early stopping. But this currently only matters for the `BioSoundSegBench` dataset (soon to be named `CMACBench`); for user-prep'd datasets, we always use a single target, multi-class frame labels.
If I run the unit tests on `train.frame_classification`, then I trigger this bug with the very first unit test.
So this is really my fault for (1) not running tests locally before releasing, and (2) not having CI working to catch it either: I need to finish #736.
I think a quick fix is just to insert a check for the key before the logic that decides how many target types there are, like so:

```python
if "target_type" in dataset_config["params"]:
    target_type = dataset_config["params"]["target_type"]
    if isinstance(target_type, list) and all(
        isinstance(t, str) for t in target_type
    ):
        multiple_targets = True
    elif isinstance(target_type, str):
        multiple_targets = False
    else:
        raise ValueError(
            f'Invalid value for dataset_config["params"]["target_type"]: {target_type}'
        )
else:
    multiple_targets = False
```
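For context, here is a hedged sketch of why the `multiple_targets` flag matters downstream. The `choose_monitor_metric` helper and the `"val_..._acc"` metric names are invented for illustration; they are not vak's actual API:

```python
# Hypothetical sketch (not vak's real code): with multiple target types
# there can be one accuracy metric per target, so the early-stopping
# callback must be told exactly which metric name to monitor.
# The metric names below are made up for this illustration.
def choose_monitor_metric(multiple_targets: bool, target_type=None) -> str:
    if multiple_targets:
        # monitor accuracy for one specific target type, e.g. frame labels
        return f"val_{target_type[0]}_acc"
    # single-target case: one accuracy metric, one unambiguous name
    return "val_acc"
```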
I made this fix in a quick-and-dirty way and all tests pass, so I will go ahead and release a bugfix version to close this issue.
Long term, I need to think about how to organize all this; it feels very kludgy. The reason is that we are not really committed to providing the ability to specify different target types, so we don't have a designated way to declare them. E.g., we could have a default target type of "multi_frame_labels" and then more directly infer which value to monitor from there. I will raise a separate issue about that.
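A hedged sketch of that default-based idea, assuming `dataset_config` is a plain dict parsed from the TOML config; the helper names here are mine for illustration, not vak's:

```python
# Sketch only: supply a default target type via dict.get, so configs that
# omit "target_type" fall back to single-target, multi-class frame labels.
# The key names come from this thread; the helpers are hypothetical.
def get_target_type(dataset_config: dict):
    return dataset_config.get("params", {}).get(
        "target_type", "multi_frame_labels"
    )

def has_multiple_targets(dataset_config: dict) -> bool:
    # multiple targets are declared as a list of target-type names
    return isinstance(get_target_type(dataset_config), list)
```

With a guaranteed default like this, the `KeyError` path goes away entirely rather than being guarded against.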
@henricombrink I just published version 1.0.3 to PyPI that should fix this. A conda-forge package should follow shortly.
Sorry again for releasing with a trivial bug and for not getting the notification, and thank you for reporting the bug!
@all-contributors please add @henricombrink for bug
@NickleDave
I've put up a pull request to add @henricombrink! :tada:
@NickleDave Thanks for the quick fix, much appreciated. I will let you know if I encounter further problems.
I am working on red squirrel vocalizations. I have used BirdNet to annotate squirrel rattles from a large collection of sound recordings. I am hoping to use vak to annotate individual syllables within rattle sequences.
Sounds very cool, can't wait to hear what you all are learning about red squirrel vocalizations when you're ready to share.
Please just let me know whatever else I can do to help.
Description: I am currently trying to go through the vak tutorial and have completed the vak prep step successfully. I have changed the parameters in the .toml files, and the `[vak.train.dataset]` table gets updated in the .toml file after the prep step. However, I am getting the following error when I try to run `vak train gy6or6_train.toml`:
System: Ubuntu 22.04.5 LTS, vak 1.0.2