slwang9353 / MobileFormer

MobileFormer in torch

Question about model training #4

Open Adr1anLove opened 2 years ago

Adr1anLove commented 2 years ago

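Hello, and thank you very much for sharing your code with us. While training the model on the Food-101 dataset, we found that it predicts class 64 for every input, the output tensors are all identical, and the loss does not change, so the model is not being optimized. We only plugged in the dataset and did not modify the network. We would like to know what went wrong.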

slwang9353 commented 2 years ago

Hi, thanks for your feedback. I think there are two possible causes of this issue: (Major) For Food-101, mobile_former_96(10) on line 157 of train.py should be mobile_former_96(101), since there are 101 classes; mobile_former_96(10) is only the setting for CIFAR-10 in the default example. (Minor) If the output stays the same for every input, the ReLUs may be dead, so you may need to modify the initialization in model_generator for Food-101. Please let me know if this problem persists.
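The class-count mismatch described above can be caught before training with a simple label range check. The following is a minimal sketch in plain Python, not code from the repository; `check_labels` is a hypothetical helper, and the label list stands in for whatever targets the dataset actually yields.

```python
def check_labels(labels, num_classes):
    """Return the number of distinct labels, or raise ValueError
    if any label falls outside [0, num_classes)."""
    labels = list(labels)
    bad = sorted({y for y in labels if not 0 <= y < num_classes})
    if bad:
        raise ValueError(
            f"{len(bad)} label value(s) outside [0, {num_classes}); "
            f"first few: {bad[:5]}"
        )
    return len(set(labels))


# Food-101 labels run from 0 to 100, so they fit a 101-way classifier head.
food101_labels = list(range(101))
print(check_labels(food101_labels, num_classes=101))  # 101

# The same labels do not fit the 10-way head used for the CIFAR-10 default.
try:
    check_labels(food101_labels, num_classes=10)
except ValueError as err:
    print("mismatch:", err)
```

Running a check like this on one pass over the targets makes the mismatch explicit instead of leaving it to surface as a confusing loss or an index error.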

Adr1anLove commented 2 years ago

Thank you for your reply. Regarding the two possibilities you mentioned: we had already changed class_num to 101 when we initialized the network. However, we had not considered the DyReLU issue before, and we have now replaced DyReLU with ReLU6. We will report back on the training result soon. In addition, we would like to know whether specific parameters need to be set in DyReLU for different datasets. We think the parameters of DyReLU should not depend on the dataset.

slwang9353 commented 2 years ago

I think there is no need to set specific parameters for DyReLU. What I meant in my last reply is that perhaps all the ReLUs are dead, so the output is 0 and therefore identical for every input. But this shouldn't happen. Have you added a transform to normalize your dataset?
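The two symptoms mentioned here (activations stuck at zero, and identical outputs for every input) can each be measured directly. Below is a framework-independent sketch in plain Python; `zero_fraction` and `outputs_collapsed` are hypothetical helpers, and the nested lists stand in for activations or logits that would first need to be collected from the model and converted to lists.

```python
def zero_fraction(activations):
    """Fraction of post-ReLU activation values that are exactly zero."""
    flat = [a for row in activations for a in row]
    return sum(1 for a in flat if a == 0.0) / len(flat)


def outputs_collapsed(logits, tol=1e-6):
    """True if every row of logits equals the first row within tol."""
    first = logits[0]
    return all(
        abs(a - b) <= tol
        for row in logits[1:]
        for a, b in zip(row, first)
    )


# Hypothetical values: a layer whose units are all dead, and logits that
# are the same for three different inputs.
dead_layer = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
same_logits = [[0.1, 2.3, -1.0]] * 3

print(zero_fraction(dead_layer))       # 1.0
print(outputs_collapsed(same_logits))  # True
```

A zero fraction near 1.0 in every layer would support the dead-ReLU explanation, whereas collapsed outputs with healthy activations would point elsewhere, for example at the data pipeline.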

Adr1anLove commented 2 years ago

We have added a transform to normalize the dataset after loading it. However, we found that in the check_accuracy function (the first function in train.py), the values in the y matrix increase from 0 from batch to batch, and all the elements within each y matrix are equal. So we wonder whether something is wrong with how the dataset is loaded.
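One possible reading of this observation, which the thread does not confirm, is that the loader walks a dataset stored in class order without shuffling. In that case every batch holds a single class and the label value rises from 0 as iteration proceeds. That is harmless for an evaluation loader but can stall training if the training loader behaves the same way. The sketch below uses a hypothetical toy label list and plain Python batching to show the difference.

```python
import random


def batches(labels, batch_size):
    """Split labels into consecutive batches, in storage order."""
    return [labels[i:i + batch_size] for i in range(0, len(labels), batch_size)]


def distinct_per_batch(labels, batch_size):
    """Number of distinct classes seen in each batch."""
    return [len(set(b)) for b in batches(labels, batch_size)]


# Hypothetical toy dataset: 4 classes, 8 samples each, stored sorted by class.
labels = [c for c in range(4) for _ in range(8)]
print(distinct_per_batch(labels, 8))  # [1, 1, 1, 1]

# Shuffling the order mixes classes within each batch.
shuffled = labels[:]
random.Random(0).shuffle(shuffled)
print(distinct_per_batch(shuffled, 8))
```

Printing the distinct label count for the first few training batches is a quick way to tell whether the training loader is shuffling.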

Adr1anLove commented 2 years ago

And that's what we got. It doesn't look like the network is being optimized.

Adr1anLove commented 2 years ago

To rule out the dataset as the cause, we trained the unmodified model on CIFAR-10 on the server, but the same problem occurred.
