Audio-WestlakeU / ATST-SED

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
MIT License

training results not very good #15

Closed Kiri0824 closed 2 months ago

Kiri0824 commented 2 months ago

Hi, I think this is amazing work. I trained stage 1 with all the default parameters. For stage 2 I only changed the batch size:

training:
  #batch size: [synth, weak, unlabel]
  # batch_size: [24, 24, 48, 48]  # if you want to change the batch size, be careful of the total training steps.
  batch_size: [12, 12, 24, 24]
  batch_size_val: 64
  const_max: 70 # max weight used for self supervised loss
  n_epochs_warmup: 5 # num epochs used for exponential warmup
  num_workers: 6 # change according to your cpu
  n_epochs: 125 # max num epochs

However, I don't know why my PSDS1 score is so low: it is about 0.21 for both stage 1 and stage 2, while PSDS2 looks normal. Here is the stage 1 output after training:

           Test metric                       DataLoader 0
─────────────────────────────────
            hp_metric                     0.7555411331847474
   test/student/event_f1_macro            0.234224334359169
test/student/intersection_f1_macro        0.6208451326131922
     test/student/loss_strong             0.1191956102848053
   test/student/psds1_psds_eval          0.20791560076960455
test/student/psds1_sed_scores_eval       0.22486176405258554
   test/student/psds2_psds_eval           0.7555411331847474
test/student/psds2_sed_scores_eval        0.7806354678518701
   test/teacher/event_f1_macro           0.24510306119918823
test/teacher/intersection_f1_macro        0.6328589041610267
     test/teacher/loss_strong             0.1165553405880928
   test/teacher/psds1_psds_eval           0.2229318559439836
test/teacher/psds1_sed_scores_eval       0.24237753229233633
   test/teacher/psds2_psds_eval           0.7516268271041807
test/teacher/psds2_sed_scores_eval        0.779865513360209

Here is the stage 2 output after training:

           Test metric                       DataLoader 0
──────────────────────────────────
            hp_metric                     0.8067842239130404
   test/student/event_f1_macro           0.33283093571662903
test/student/intersection_f1_macro        0.6977278050929052
     test/student/loss_strong             0.1500978022813797
   test/student/psds1_psds_eval          0.21488627601697868
test/student/psds1_sed_scores_eval       0.22915346776891088
   test/student/psds2_psds_eval           0.8067842239130404
test/student/psds2_sed_scores_eval        0.833245027144476
   test/teacher/event_f1_macro            0.3343150317668915
test/teacher/intersection_f1_macro        0.6997189199832828
     test/teacher/loss_strong            0.15349356830120087
   test/teacher/psds1_psds_eval          0.21469927694665641
test/teacher/psds1_sed_scores_eval        0.2288913841216405
   test/teacher/psds2_psds_eval           0.806485435209259
test/teacher/psds2_sed_scores_eval        0.8365682650322246

I also tried starting from the stage1.ckpt provided in this GitHub repo and training stage 2 myself. PSDS1 was still only about 0.2, while PSDS2 was normal. Does anyone know why, or is this expected? I don't think it is normal.
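Editor's note on the batch-size change: the config comment warns that changing the batch size affects the total number of training steps. A minimal sketch of that arithmetic follows, assuming one epoch is a single pass over a fixed number of clips. The clip count and the helper names are hypothetical, and how this repo actually defines an epoch across its several data sources is not confirmed by the thread.

```python
import math


def steps_per_epoch(num_clips, batch_size):
    """Number of optimizer steps needed for one pass over num_clips."""
    return math.ceil(num_clips / batch_size)


def total_steps(num_clips, batch_size, n_epochs):
    """Total optimizer steps over the whole run."""
    return steps_per_epoch(num_clips, batch_size) * n_epochs


NUM_CLIPS = 10000  # hypothetical dataset size, for illustration only
N_EPOCHS = 125     # n_epochs from the config quoted above

for batch_size in (24, 12):  # first entry of the default and the modified batch_size
    steps = total_steps(NUM_CLIPS, batch_size, N_EPOCHS)
    print(f"batch_size={batch_size}: {steps} total steps")
```

Under these assumptions, halving the batch size roughly doubles the number of steps, so any schedule that is counted in steps (for example a warmup) would stretch accordingly.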

SaoYear commented 2 months ago

Hi, thanks for the interest.

According to the results you provided, it seems that your evaluation is abnormal.

Could you please use the code and checkpoint from the DCASE repo and check whether your test results are consistent with the ones provided there?

You might follow these commands: [image]

Kiri0824 commented 2 months ago

I figured out why the results were poor: I had not resampled the dataset from its 44000 Hz sample rate to 16000 Hz. Now the results seem normal, and I'm working on reproducing all of them. Thank you again for your reply! ;)
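Editor's note: a quick way to catch this kind of mismatch before training is to scan the dataset for files whose sample rate is not the expected 16000 Hz. The sketch below uses only the Python standard library. The function name and the directory layout are hypothetical and not part of this repo, and the `wave` module only reads PCM WAV files, so other formats are reported as unreadable.

```python
import wave
from pathlib import Path

EXPECTED_SR = 16000  # target sample rate mentioned in this thread


def find_wrong_sample_rate(root, expected_sr=EXPECTED_SR):
    """Return (path, sample_rate) for each .wav under root that is not at expected_sr.

    Files the wave module cannot parse are returned with a sample rate of None.
    """
    mismatched = []
    for path in sorted(Path(root).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav_file:
                sample_rate = wav_file.getframerate()
        except wave.Error:
            sample_rate = None  # not a PCM WAV file the stdlib can read
        if sample_rate != expected_sr:
            mismatched.append((path, sample_rate))
    return mismatched
```

Calling `find_wrong_sample_rate("path/to/dataset")` (a placeholder path) returns an empty list when every file is already at 16000 Hz. Any file it does report would need to be resampled with an audio tool of your choice before training or evaluation.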