TIPS to improve the model training?

janclemenslab / das

Deep Audio Segmenter

http://janclemenslab.org/das/

Apache License 2.0


TIPS to improve the model training? #16

Closed hkoda closed 2 years ago

hkoda commented 2 years ago

Hi, we installed DAS on our Ubuntu server and trained it on our data to detect marmoset calls recorded in a sound chamber. We got great results, but I wonder whether we could improve the predictions further by adding more data or tuning some parameters. Our training set currently comprises 1680 min of recordings with 5240 call annotations. The trained model predicted call regions correctly on new data (good hit rate), but it also misidentified "pulse-like" noises as call regions (frequent false alarms).

Our data were recorded in a sound chamber, so the S/N ratios were relatively high.

Do you have any suggestions or tips for refining the model?

Change parameter settings?
Adding more data?
Additional annotation (we currently provide only call-region annotations; should we also annotate noise regions as "noise"?)
Data augmentation techniques?

postpop commented 2 years ago

Hi, happy to hear that it works for you! The issue with the short pulsatile noises may be easy to solve if these noises are much shorter than any of your calls, because you can then filter them out by duration. In the predict dialog (see docs), under "Segment detection", enable "Delete segments shorter than" and set it to a value that is shorter than your shortest calls but longer than the noises. Detected segments below that duration will then be removed during prediction.
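To illustrate the idea behind that duration filter, here is a minimal NumPy sketch. It is not the DAS implementation; `drop_short_segments` and the example onset/offset times are hypothetical, and it only assumes you have segment onsets and offsets in seconds.

```python
import numpy as np

def drop_short_segments(onsets, offsets, min_duration):
    """Keep only segments that last at least `min_duration` seconds."""
    onsets = np.asarray(onsets, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    keep = (offsets - onsets) >= min_duration
    return onsets[keep], offsets[keep]

# Hypothetical detections (seconds): two calls and one brief noise burst.
onsets = [0.50, 1.20, 2.00]
offsets = [0.90, 1.22, 2.60]

# The 20 ms burst is shorter than the 50 ms threshold and is dropped.
kept_on, kept_off = drop_short_segments(onsets, offsets, min_duration=0.05)
print(kept_on, kept_off)
```

This only helps when the shortest real call is clearly longer than the longest noise burst; otherwise any threshold will either keep noise or delete calls.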

hkoda commented 2 years ago

Thanks for the quick reply. OK, I will try setting the "threshold" option first, but I suspect it may not work: the model may be predicting such noise as a marmoset "trill" call, which is itself a short, pulse-like sound. In that case (i.e., tuning the model to discriminate trills from the short noises), what would you suggest we try next?

postpop commented 2 years ago

I see - then it probably won't work.

Change parameter settings: If you use the parameters from the paper, then I do not think there's much to do. You could try doubling the chunk duration; if the model sees more of the signal at once, it may be able to learn to discriminate the noise.

Adding more data/additional annotation: 5240 calls sounds like a lot. But if you have recordings with lots of the pulsatile noises, then adding them may help. An alternative may be to annotate the pulsatile noise as a "noise call".

Data augmentation: We are working on adding this to DAS but that will take a while. One way to "augment" your data is to artificially enrich it with examples of the pulsatile noise. That is, create new versions of your recordings with noise calls added where there is no signal.
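A rough sketch of that enrichment step, in plain NumPy rather than any DAS function: it pastes a noise clip at random positions where no call is annotated. The function name `add_noise_to_silence`, the boolean `call_mask`, and the synthetic signals are all assumptions for illustration.

```python
import numpy as np

def add_noise_to_silence(recording, noise_clip, call_mask, n_copies, rng):
    """Return a copy of `recording` with `noise_clip` added in call-free stretches.

    `call_mask` is a boolean array, True at samples that belong to a call.
    Also returns the sorted start samples of the inserted clips.
    """
    n = len(noise_clip)
    if n > len(recording):
        raise ValueError("noise clip is longer than the recording")
    augmented = recording.copy()
    occupied = call_mask.copy()
    starts = []
    for _ in range(n_copies):
        # Count occupied samples in every window of length n.
        busy = np.convolve(occupied.astype(int), np.ones(n, dtype=int), mode="valid")
        free_starts = np.flatnonzero(busy == 0)
        if free_starts.size == 0:
            break  # no room left for another clip
        start = int(rng.choice(free_starts))
        augmented[start:start + n] += noise_clip
        occupied[start:start + n] = True  # keep inserted clips from overlapping
        starts.append(start)
    return augmented, sorted(starts)

# Synthetic example: 2 s at 1 kHz with one "call" between samples 500 and 900.
rng = np.random.default_rng(0)
sr = 1000
recording = np.zeros(2 * sr)
call_mask = np.zeros(recording.size, dtype=bool)
call_mask[500:900] = True
recording[500:900] = np.sin(2 * np.pi * 50 * np.arange(400) / sr)
noise_clip = 0.5 * rng.standard_normal(20)  # stand-in for a recorded noise pulse

augmented, starts = add_noise_to_silence(recording, noise_clip, call_mask, 5, rng)
print(starts)
```

The returned start positions can either be left unannotated, so the noise counts as background, or be written out as "noise call" annotations, depending on which of the two approaches you want to test.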

hkoda commented 2 years ago

Many thanks for the quick suggestions. OK, we will first try:

doubling the chunk duration
adding more data that mainly contains "noise" but no "calls", so that the model learns to ignore the noisy regions appropriately. In that case, do you think we should annotate the noise regions or not? We will first add the noise data without noise annotations (for convenience of preparation).
in parallel, preparing the augmentation technique you suggested! I will report back here once we have new results.

yHamazaki84 commented 2 years ago

Hi, I am trying out training and prediction in DAS, and I would like to ask a question. I was unable to get the result.h5 file when I ran training with the das.train function in Python. The test data were prepared in the dataset-creation step. Could you please tell me how I can get this file?