FishMaster93 / U-FFIA

The audio-visual fusion method for FFIA

Reproducing the results #4

Open chandlerbing65nm opened 1 month ago

chandlerbing65nm commented 1 month ago

Hi again. I'm trying to reproduce the results in Table 1 of the paper. Which configs in the yml files should I use to reproduce the Pre-MobileNetV2, Pre-CNN10, and Pre-CNN6 results?

FishMaster93 commented 1 month ago

You can use config/audio/exp1.yaml to reproduce the Pre-MobileNetV2, Pre-CNN10, and Pre-CNN6 results from Table 1.

chandlerbing65nm commented 1 month ago

@FishMaster93 I'm a bit confused by the configs and code structure. For the Table 1 results of Pre-CNN10 and Pre-MobileNetV2, what should I set for:

Model: ?
Model_pretrained: ?

Also, which of the pretrained files should I choose for preloading the weights? I've tried multiple files here corresponding to the model and pre-model, but it doesn't work.

FishMaster93 commented 1 month ago

I have changed the code and added an example of how to use the pre-trained Cnn10 model. The config/audio/pre_exp.yam config is for the pre-trained Cnn10. In main.py, look at "model = AudioModel_pre_Cnn10(frontend=frontend, backbone=backbone)"; this is the pre-trained function, which you can find in models/Audio_model. You can run it now, but I cannot change all of the code for you, so please read the code and try to understand it.

chandlerbing65nm commented 1 month ago

@FishMaster93 Thanks for the help. I ran the training with the config you gave, but the result is far from the published Table 1 result for 'Pre-CNN10'.

I've got this test result after training: '2024-07-20 13:09:52,597 - INFO - test_dataset mAP: 0.7015687491004299, accuracy: 0.7271428571428571'

I've also attached the log file for the training for full reference: Cnn10-Pretrained-1721464765.log
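When comparing runs like this, it can help to pull the final test metrics out of each log automatically. The following is a minimal sketch that assumes the log line has exactly the format quoted above; `parse_test_metrics` is a hypothetical helper and is not part of the U-FFIA repository.

```python
import re

# Pattern for the test line quoted in this thread, e.g.
# "... - INFO - test_dataset mAP: 0.70, accuracy: 0.72"
# Assumption: other log files use the same wording for the final test line.
TEST_LINE_RE = re.compile(
    r"test_dataset mAP: (?P<map>[0-9.]+), accuracy: (?P<acc>[0-9.]+)"
)


def parse_test_metrics(line):
    """Return (mAP, accuracy) as floats, or None if the line does not match."""
    match = TEST_LINE_RE.search(line)
    if match is None:
        return None
    return float(match.group("map")), float(match.group("acc"))
```

Applying it to every line of a log and keeping the last non-None result gives the final test metrics for that run.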

FishMaster93 commented 1 month ago

This is because your dataset is incomplete: the training dataset should be 213000, not 3200. I don't know what happened. I have updated the code. Here is the log of my new training run of 80 epochs, which reaches an accuracy of 0.83: Pretrained-Cnn10-1721506338.log. It is not fully trained yet, and the result can still be improved. You should check whether the dataset you downloaded contains all 27067 clips.
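One way to run the suggested check is to count the audio files in the downloaded dataset directory and compare the total against the 27067 clips mentioned above. This is only a sketch: the `.wav` extension and the one-subfolder-per-class layout are assumptions about the download, and `count_clips` is a hypothetical helper, not a function from the repository.

```python
from collections import Counter
from pathlib import Path


def count_clips(root, extensions=(".wav",)):
    """Count files under `root` by parent folder name.

    Assumptions (not confirmed by the thread): clips are stored as .wav
    files, and each class lives in its own subfolder.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            counts[path.parent.name] += 1
    return counts
```

Calling `sum(count_clips("path/to/dataset").values())` gives the total number of clips found; if it is far below 27067, the download is probably incomplete.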