Amshaker / SwiftFormer

[ICCV'23] Official repository of paper SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

Error occurs when distillation-type is set to none #11

Closed ThomasCai closed 8 months ago

ThomasCai commented 8 months ago

Update: I have opened PR https://github.com/Amshaker/SwiftFormer/pull/12/files to fix this. Please take a look, thank you.


Hi~ Thank you for your code, but I ran into a problem. I run the training code on Ubuntu 18.04, and when I set --distillation-type to none, the following error occurs: [screenshot of the error]

Looking forward to your reply, thank you~

```shell
python -m torch.distributed.launch --nproc_per_node=$nGPUs --use_env main.py --model SwiftFormer_L3 --data-path "$IMAGENET_PATH" \
--output_dir SwiftFormer_L3_results --distillation-type none
```
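For context, the kind of failure described here typically happens in DeiT-style training code when the distillation loss wrapper calls a teacher model that was never built because distillation is disabled. Below is a minimal, self-contained sketch of the guard that a fix like PR #12 would add; the class and attribute names are illustrative assumptions, not copied from the actual diff:

```python
class DistillationLoss:
    """Simplified stand-in for a DeiT-style distillation loss (no torch,
    illustrative only). The key fix: short-circuit when the distillation
    type is 'none' so the (absent) teacher model is never called."""

    def __init__(self, base_criterion, teacher_model, distillation_type, alpha):
        assert distillation_type in ('none', 'soft', 'hard')
        self.base_criterion = base_criterion
        self.teacher_model = teacher_model  # may be None when type is 'none'
        self.distillation_type = distillation_type
        self.alpha = alpha

    def __call__(self, inputs, outputs, labels):
        base_loss = self.base_criterion(outputs, labels)
        if self.distillation_type == 'none':
            # Without this early return, the code below would call
            # self.teacher_model, which is None -> a crash like the one
            # reported in this issue.
            return base_loss
        teacher_outputs = self.teacher_model(inputs)
        # Placeholder distance in place of a real KL/cross-entropy term.
        distill_loss = abs(teacher_outputs - outputs)
        return base_loss * (1 - self.alpha) + distill_loss * self.alpha
```

With `distillation_type='none'`, the loss reduces to the base criterion alone and the teacher is never touched, which is the behavior the reporter expected from the flag.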
Amshaker commented 8 months ago

Hi @ThomasCai ,

Thanks for the pull request. Yes, it solves the issue when the distillation is set to none.

Best regards, Abdelrahman.

ThomasCai commented 8 months ago

> Hi @ThomasCai ,
>
> Thanks for the pull request. Yes, it solves the issue when the distillation is set to none.
>
> Best regards, Abdelrahman.

Hi @Amshaker ,

Thank you for the merge. It's my pleasure.

By the way, I would also like to ask a question: if I want to fine-tune SwiftFormer_L3 on a 20-class dataset, how should I adjust the parameters? Could you give me some suggestions, please? Thank you.

Best regards ThomasCai
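For readers with the same question, a typical DeiT-style fine-tuning recipe is to load the pretrained checkpoint, shrink the classification head to the new class count, and train briefly with a reduced learning rate. The command below is a hypothetical sketch only: the flag names (`--finetune`, `--nb-classes`, etc.) are assumed from DeiT-style training scripts and should be checked against the arguments actually defined in SwiftFormer's main.py:

```shell
# Hypothetical fine-tuning invocation (flag names are assumptions,
# verify against main.py before running):
python -m torch.distributed.launch --nproc_per_node=$nGPUs --use_env main.py \
    --model SwiftFormer_L3 \
    --data-path "$CUSTOM_DATA_PATH" \
    --finetune SwiftFormer_L3.pth \
    --nb-classes 20 \
    --lr 5e-5 --epochs 50 --warmup-epochs 5 \
    --distillation-type none \
    --output_dir SwiftFormer_L3_finetune
```

The general idea is a learning rate roughly an order of magnitude below the pretraining rate and far fewer epochs, since only the head and minor feature adjustments need to be learned on a 20-class dataset.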