fschmid56 / EfficientAT

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
MIT License

Why is the mobile output so different from Python? #26

Open thelou1s opened 6 months ago

thelou1s commented 6 months ago

Hi, I converted dymn10_as to PyTorch Mobile, but the mobile output is very different from the Python output. I checked both the torch version and the model file. What might be the problem? Thanks :)

Expected Behavior: Same or similar outputs

Actual Behavior: Very different outputs

Reproduce:
1. Convert dymn10_as to PyTorch Mobile (Android).
2. Compare the Python and mobile (Android) outputs.

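import torch

# Imports assumed from the EfficientAT repository layout
from helpers.utils import NAME_TO_WIDTH
from models.dymn.model import get_model as get_dymn
from models.mn.model import get_model as get_mn

if __name__ == '__main__':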
    model_name = 'dymn10_as'
    model_input = torch.rand(1, 1, 128, 210)
    ptmobile_name = 'eat_' + model_name + '_ptmobile.ptl'

    if model_name.startswith('dymn'):
        model = get_dymn(width_mult=NAME_TO_WIDTH(model_name), pretrained_name=model_name, strides=[2, 2, 2, 2])
    else:
        model = get_mn(width_mult=NAME_TO_WIDTH(model_name), pretrained_name=model_name, strides=[2, 2, 2, 2])
    model.to(torch.device('cpu'))
    model.eval()
    model = torch.jit.trace(model, model_input)
    print(model.code)

    # https://github.com/pytorch/pytorch/issues/96639
    # model = mobile_optimizer.optimize_for_mobile(model,
    #                                                  {
    #                                                      MobileOptimizerType.CONV_BN_FUSION,
    #                                                      # I'm only disabling CONV_BN_FUSION
    #                                                      # MobileOptimizerType.FUSE_ADD_RELU,
    #                                                      # MobileOptimizerType.HOIST_CONV_PACKED_PARAMS,
    #                                                      # MobileOptimizerType.INSERT_FOLD_PREPACK_OPS,
    #                                                      # MobileOptimizerType.REMOVE_DROPOUT
    #                                                  })
    model._save_for_lite_interpreter(ptmobile_name)
    print('android model save success!')

[Screenshot: mel values and model outputs on PC and Android]

fschmid56 commented 6 months ago

Hi, is the problem specific to 'dymn10_as', or does it also hold for 'mn10_as'?

Do I understand correctly from the screenshot above that the values computed by the mel are slightly changed (but almost the same) and the values of the computed model outputs are very different on your PC and an Android device?

The model outputs on Android are very different, but not completely random, correct?

thelou1s commented 6 months ago

Hi, thank you for the quick reply and your great work :)

"is the problem specific to 'dymn10_as', or does it also hold for 'mn10_as'?" It affects both dymn10_as and mn10_as.

"very different on your PC and an Android device" Yes, very different. Note that on 'dd_01_16khz.wav', Python correctly infers 'Snoring' but Android infers 'Music'. I also tested 33 other snoring WAV files: Python classifies half of them correctly, Android none.

[Screenshot]

fschmid56 commented 6 months ago

I see. I haven't used the models on mobile phones before, so I will not be of much help here. Maybe I'll find the time in the near future to look into it.

However, I suggest closely examining the data type used on PC and Android. Maybe there is a mismatch in terms of precision?
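One rough way to test the precision hypothesis is to compare the two output vectors numerically rather than by eye. The sketch below uses plain Python only. The helper names `to_float32` and `compare_outputs`, the tolerances, and the sample scores are all assumptions made for illustration and are not part of EfficientAT or PyTorch.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def compare_outputs(reference, candidate, rel_tol=1e-4, abs_tol=1e-5):
    """Return (max_abs_diff, all_close, same_top1) for two score lists.

    The tolerances are illustrative assumptions, not recommended values.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    pairs = list(zip(reference, candidate))
    max_abs_diff = max(abs(r - c) for r, c in pairs)
    all_close = all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol) for r, c in pairs
    )

    def top1(values):
        # Index of the highest score, i.e. the predicted class.
        return max(range(len(values)), key=values.__getitem__)

    return max_abs_diff, all_close, top1(reference) == top1(candidate)


# Hypothetical class scores standing in for the PC output.
pc_scores = [0.1234567890123, -2.5, 3.75, 0.000123456789]
# Simulate what a float64 -> float32 round trip alone does to them.
rounded_scores = [to_float32(x) for x in pc_scores]

max_diff, close, same_label = compare_outputs(pc_scores, rounded_scores)
print(max_diff, close, same_label)
```

In this toy case the float32 round trip changes the scores only far below the tolerance and leaves the top-1 class unchanged. If the real PC and Android outputs fail such a check by a wide margin, a simple float32/float64 difference is probably not the whole story, and the input preprocessing or the exported weights would be worth examining next.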

Have you also checked whether the model weights stay the same?
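As a first sanity check on the weights, one option is to hash the exported file on the PC and the copy pulled back from the device. This is a minimal sketch using only the standard library. The function name `sha256_of_file` and the file names in the usage comment are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (hypothetical paths):
# pc_hash = sha256_of_file("eat_dymn10_as_ptmobile.ptl")
# device_hash = sha256_of_file("pulled_from_device.ptl")
# print(pc_hash == device_hash)
```

Matching digests only show that the file on the device is byte-identical to the exported one. They do not show that the export step preserved the weights of the original checkpoint, which would need a separate comparison on the PC.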