dscripka / openWakeWord

An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
Apache License 2.0

Can't generate negative samples/clips for test or train #192

Open w-marco opened 1 month ago

w-marco commented 1 month ago

I am using the automatic model training notebook, and it works up until the point where the negative clips should be generated. With positive samples it logs something like:

INFO:generate_samples:Successfully loaded the model
DEBUG:generate_samples:Batch 1/10 complete
DEBUG:generate_samples:Batch 2/10 complete
DEBUG:generate_samples:Batch 3/10 complete
DEBUG:generate_samples:Batch 4/10 complete
DEBUG:generate_samples:Batch 5/10 complete
DEBUG:generate_samples:Batch 6/10 complete
DEBUG:generate_samples:Batch 7/10 complete
DEBUG:generate_samples:Batch 8/10 complete
DEBUG:generate_samples:Batch 9/10 complete
DEBUG:generate_samples:Batch 10/10 complete
INFO:generate_samples:Done

with the negative clips it only logs:

INFO:generate_samples:Successfully loaded the model
INFO:generate_samples:Done

Of course, nothing is generated, and because the files are missing, the next step, where the clips should be augmented, naturally fails. I tried running the steps in Colab and locally on Linux, and the error remains the same.

Any ideas why it just refuses to generate negative clips?

w-marco commented 1 month ago

I think I've found the issue:

It comes from this line in train.py:

                         batch_size=config["tts_batch_size"]//7,

Removing the //7 fixes the issue. I'm not sure why it doesn't work with the division in place, but removing it fixed the bug I reported.
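One plausible explanation, assuming the surrounding training script uses this value as an ordinary batching loop bound: Python's // is floor (integer) division, so if the configured tts_batch_size is smaller than 7, config["tts_batch_size"]//7 evaluates to 0, and a batch size of 0 would make a typical batching loop run zero times, which matches the "loaded the model / Done" log with nothing in between. A minimal sketch (the variable names below are illustrative, not taken from train.py):

```python
# Floor division collapses small values to zero.
tts_batch_size = 5                 # e.g. a small configured batch size
batch_size = tts_batch_size // 7   # floor division -> 0
print(batch_size)                  # 0

# With batch_size == 0, a typical batch loop never executes:
total_samples = 50_000
n_batches = total_samples // batch_size if batch_size > 0 else 0
for i in range(n_batches):
    pass  # never reached when batch_size == 0
print(n_batches)                   # 0
```

If this is what is happening, raising tts_batch_size to at least 7 (or guarding the division with max(1, ...)) would be an alternative to removing the //7 entirely.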

dscripka commented 2 weeks ago

That is a little odd; are you generating a fairly small number of samples? The //7 is mostly there for GPUs so that too much VRAM is not used for negative clip generation, but you raise a good point that this isn't needed for CPU-only training.

w-marco commented 1 week ago

No, I tried generating around 50k samples, so I'm not sure. It seemed like some Python bug, as other files with the // divisor also had issues until I removed those.