noahzn / Lite-Mono

[CVPR2023] Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
MIT License

About the ImageNet pre-training model #31

Closed ZC0015wqj closed 1 year ago

ZC0015wqj commented 1 year ago

Hello, thank you for your earlier help. I would still like to ask how to pre-train on the ImageNet-1K dataset. Is it the same as training on kitti_data, that is, modify the file paths in eigen_zhou and then train with the given command? I have already downloaded the ImageNet-1K dataset. Thank you very much for your help.

noahzn commented 1 year ago

Hi,

No, this repo does not contain code for pre-training a model on ImageNet. Please prepare your own code to pre-train the encoder. PyTorch's official examples include a minimal training script; please feel free to have a look at it.

ZC0015wqj commented 1 year ago

Thank you very much for your help. May I take the liberty of asking whether you could share this part of the training code?

noahzn commented 1 year ago

Hi, the pre-training code needs some cleaning up, but I have no time to do it. I am not going to include it in this repo as it is not that relevant. I can give you the code in a few days if you send me an email.

ZC0015wqj commented 1 year ago

Thank you very much. Could you give me your email address?

ZC0015wqj commented 1 year ago

I sent an email, I'm not sure if you received it

noahzn commented 1 year ago

Yes, I received your email. I will send you the code in a few days.

ZC0015wqj commented 1 year ago

Thank you very much!!!!!!!!!!

ZC0015wqj commented 1 year ago

Hello, how can I set the parameters to train the optimal model without the ImageNet pre-training model?

noahzn commented 1 year ago

> Hello, how can I set the parameters to train the optimal model without the ImageNet pre-training model?

If you do not want to use ImageNet pre-training, simply do not set `--mypretrain`.
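
A minimal sketch of how an optional flag such as `--mypretrain` typically behaves, assuming it is an `argparse` option that takes a path and defaults to `None`. Only the flag name comes from this thread; `build_parser`, `describe_init`, and the help text are hypothetical and are not Lite-Mono's actual code.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the flag name --mypretrain comes from this thread.
    parser = argparse.ArgumentParser(description="training options (illustrative)")
    parser.add_argument("--mypretrain", type=str, default=None,
                        help="path to ImageNet pre-trained encoder weights (assumed)")
    return parser


def describe_init(argv):
    """Report how the encoder would be initialised for the given arguments."""
    opts = build_parser().parse_args(argv)
    if opts.mypretrain is None:
        # Flag omitted: nothing is loaded, the encoder starts from scratch.
        return "random initialisation"
    return f"load pre-trained weights from {opts.mypretrain}"


print(describe_init([]))
print(describe_init(["--mypretrain", "weights.pth"]))
```

Under this assumption, leaving the flag out changes nothing else in the command line; the other training options stay as they are.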

ZC0015wqj commented 1 year ago

Hello, how can I set the parameters to train the optimal model without the ImageNet pre-trained model? I also have another problem. When I use the command `tensorboard --logdir ./tmp/mytrain` to check the log, I see that at some point during training the model reached higher accuracy and lower error than the final training result. But when I evaluate all the saved weights, their accuracy is far worse than what I see in the log. Can you please tell me why?

ZC0015wqj commented 1 year ago

>> Hello, how can I set the parameters to train the optimal model without the ImageNet pre-training model?

> If you do not want to use ImageNet pre-training, simply do not set `--mypretrain`.

So I don't need to modify any other parameters; I just leave out this option, right?

noahzn commented 1 year ago

>>> Hello, how can I set the parameters to train the optimal model without the ImageNet pre-training model?

>> If you do not want to use ImageNet pre-training, simply do not set `--mypretrain`.

> So I don't need to modify any other parameters; I just leave out this option, right?

Yes.

noahzn commented 1 year ago

> Hello, how can I set the parameters to train the optimal model without the ImageNet pre-trained model? I also have another problem. When I use the command `tensorboard --logdir ./tmp/mytrain` to check the log, I see that at some point during training the model reached higher accuracy and lower error than the final training result. But when I evaluate all the saved weights, their accuracy is far worse than what I see in the log. Can you please tell me why?

Please see the comments here.

ZC0015wqj commented 1 year ago

>> Hello, how can I set the parameters to train the optimal model without the ImageNet pre-trained model? I also have another problem. When I use the command `tensorboard --logdir ./tmp/mytrain` to check the log, I see that at some point during training the model reached higher accuracy and lower error than the final training result. But when I evaluate all the saved weights, their accuracy is far worse than what I see in the log. Can you please tell me why?

> Please see the comments here.

I see. I found that there is a big difference between the two. I had thought I could get a good result, but my result is now far from ideal.

noahzn commented 1 year ago

By "the final training result", do you mean the last epoch? If so, please check the results of all the epochs; you might get the best result at an earlier epoch.
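
A small sketch of that check, assuming the evaluation results of every saved epoch have been collected into a dictionary keyed by epoch number. The helper `best_epoch`, the metric name `abs_rel`, and all numbers are hypothetical and only illustrate the selection step.

```python
def best_epoch(metrics_per_epoch, key="abs_rel"):
    """Return (epoch, metrics) for the epoch with the lowest error metric."""
    return min(metrics_per_epoch.items(), key=lambda item: item[1][key])


# Made-up numbers purely for illustration; fill in your own evaluation results.
results = {
    27: {"abs_rel": 0.112},
    28: {"abs_rel": 0.109},
    29: {"abs_rel": 0.111},
}

epoch, metrics = best_epoch(results)
print(epoch, metrics)
```

In this toy data the last epoch (29) is not the best one, which is the situation described above.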

ZC0015wqj commented 1 year ago

Yes. Strangely, when I checked all the epochs, I did not find a better value.

noahzn commented 1 year ago

> I sent an email, I'm not sure if you received it

I have sent you the code.

ZC0015wqj commented 1 year ago

>> I sent an email, I'm not sure if you received it

> I have sent you the code.

Thank you very much!

ZC0015wqj commented 1 year ago

Could you please provide your complete training code for reference? When I trained with the code you emailed me, I ran into a problem like the one in the picture. I don't know how to solve it, so may I take the liberty of asking for a copy of your complete training code? (image: 1687337684521)

noahzn commented 1 year ago

Hi, the output of the encoder should be a single tensor (the classification result). I already gave you a hint in the code: `return self.norm(x.mean([-2, -1]))  # Global average pooling, (N, C, H, W) -> (N, C)`. Please check whether the dimension of your output is NxC.
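
As a rough illustration of that hint, the sketch below pools a random 4D tensor down to `(N, C)`. The sizes are arbitrary, and using `nn.LayerNorm` for `self.norm` is an assumption, not something confirmed in this thread.

```python
import torch
import torch.nn as nn

N, C, H, W = 4, 8, 6, 10          # illustrative sizes, not Lite-Mono's
x = torch.randn(N, C, H, W)       # stand-in for the last-stage feature map

norm = nn.LayerNorm(C)            # assumed type of self.norm (normalises over C)
pooled = norm(x.mean([-2, -1]))   # global average pooling: (N, C, H, W) -> (N, C)

print(pooled.shape)               # torch.Size([4, 8])
```

If the tensor reaching this line is not 4D, or is a list, the mean over the last two dimensions will not produce an `(N, C)` result.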

ZC0015wqj commented 1 year ago

Can you help me solve this problem?

Traceback (most recent call last):
  File "main.py", line 502, in <module>
    main(args)
  File "main.py", line 412, in main
    model_flops = flops.total()
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/fvcore/nn/jit_analysis.py", line 248, in total
    stats = self._analyze()
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/fvcore/nn/jit_analysis.py", line 551, in _analyze
    graph = _get_scoped_trace_graph(self._model, self._inputs, self._aliases)
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/fvcore/nn/jit_analysis.py", line 176, in _get_scoped_trace_graph
    graph, _ = _get_trace_graph(module, inputs)
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/torch/jit/_trace.py", line 1175, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1148, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/pc405/anaconda3/envs/nd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/pc405/deeplearning/zc/lite123/lite-mono-pretrain-code (1)/models/depth_encoder.py", line 481, in forward
    x = self.forward_features(x)
  File "/home/pc405/deeplearning/zc/lite123/lite-mono-pretrain-code (1)/models/depth_encoder.py", line 467, in forward_features
    x = torch.cat(tmpx, dim=1)
RuntimeError: Tensors must have same number of dimensions: got 4 and 2

noahzn commented 1 year ago

Hi, I'm not able to help you because you modified the code and I don't know what forward_featiresx is. It seems to be a simple feature channel misalignment problem.

ZC0015wqj commented 1 year ago

> Hi, I'm not able to help you because you modified the code and I don't know what forward_featiresx is. It seems to be a simple feature channel misalignment problem.

When I added depth_encoder to the lite_mono_pretrain code for training, I got an error when computing the loss: the depth_encoder output is a list with three entries, but the given target is a tensor of size 128. May I ask how you match the output of depth_encoder to the target?

noahzn commented 1 year ago

Can you print the shapes of the output and of the tensor before `self.norm(x.mean([-2, -1]))` in your depth_encoder?

ZC0015wqj commented 1 year ago

> (N, C, H, W) -> (N, C)

I don't really understand. The output of the encoder should be a list, as shown in the figure. It seems we cannot change its dimensions directly with the mean, (N, C, H, W) -> (N, C). Do I need to convert the list to a tensor?

noahzn commented 1 year ago

You should keep the output as a tensor, not a list. If the output of your network is x = (N,C,H,W), here the C is 1000 because you have 1000 classes. Then you use self.norm(x.mean([-2, -1])), and you can get a tensor of dimension (N, C). It's not a list!

ZC0015wqj commented 1 year ago

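Isn't the network's output the feature maps from three different stages? (image: 1688385184142)

> You should keep the output as a tensor, not a list. If the output of your network is x = (N,C,H,W), here the C is 1000 because you have 1000 classes. Then you use self.norm(x.mean([-2, -1])), and you can get a tensor of dimension (N, C). It's not a list!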

noahzn commented 1 year ago

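No. The intermediate-stage features are used later, when they are passed on to the decoder. Here, however, you are pre-training a classification encoder, so you should output only the last result.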
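
A minimal sketch of that difference, assuming the depth encoder collects one feature map per stage in a Python list. The helper name `classification_feature` and the channel and spatial sizes are illustrative assumptions, not taken from the shared pre-training code.

```python
import torch


def classification_feature(stage_features):
    """Keep only the last stage and pool it to a single (N, C) tensor."""
    last = stage_features[-1]      # (N, C, H, W) from the final stage
    return last.mean([-2, -1])     # (N, C), ready for a classification head


# Illustrative multi-scale features such as a depth decoder would consume.
features = [
    torch.randn(2, 48, 64, 64),
    torch.randn(2, 80, 32, 32),
    torch.randn(2, 128, 16, 16),
]

out = classification_feature(features)
print(out.shape)                   # torch.Size([2, 128])
```

For depth estimation the whole list is kept; for classification pre-training only the single pooled tensor goes into the loss.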

ZC0015wqj commented 1 year ago

(images: 1688387289866, 1688387274229) Can the loss be computed directly between these two structures? When I do it directly, I run into problems. Is this correct?

noahzn commented 1 year ago

No. You cannot compute the loss with nan.
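
One way to find where the NaN first appears is to stop training as soon as a value stops being finite. The helper below is a hypothetical, framework-independent sketch; in a PyTorch loop the value passed in would typically be `loss.item()`.

```python
import math


def check_finite(loss_value, step):
    """Raise as soon as a loss value is NaN or infinite."""
    if not math.isfinite(loss_value):
        raise FloatingPointError(f"non-finite loss {loss_value} at step {step}")
    return loss_value


check_finite(0.73, step=0)             # a normal value passes through unchanged
try:
    check_finite(float("nan"), step=1)
except FloatingPointError as err:
    print(err)                         # non-finite loss nan at step 1
```

The same check can be applied to the model output before the loss, which helps tell a broken network apart from a broken loss computation.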

ZC0015wqj commented 1 year ago

After a sample passes through the model, its output becomes this.

noahzn commented 1 year ago

There must be some errors in your network. My pre-training code has been shared with several people and they have no such problems.

ZC0015wqj commented 1 year ago

(image: 1688388560434) This is the shape of the encoder output, but there is a problem during pre-training.

ZC0015wqj commented 1 year ago

> There must be some errors in your network. My pre-training code has been shared with several people and they have no such problems.

Is this related to the input H and W?

noahzn commented 1 year ago

No. `main.py` defines the image size for training: `parser.add_argument('--input_size', default=256, type=int, help='image input size')`

ZC0015wqj commented 1 year ago

> No. `main.py` defines the image size for training: `parser.add_argument('--input_size', default=256, type=int, help='image input size')`

When calculating the loss, NaN values appeared again, although the output of the encoder is now normal. (image: 1688392215956)

noahzn commented 1 year ago

Do you have this layer after getting your `x`?
`self.head = nn.Linear(dims[-2], num_classes)` followed by `x = self.head(x)`
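
A hedged sketch of what such a head does to the shapes, assuming a pooled `(N, C)` feature and the usual `nn.CrossEntropyLoss` for classification. `feat_dim` stands in for `dims[-2]` and its value here is hypothetical.

```python
import torch
import torch.nn as nn

num_classes = 1000                  # ImageNet-1K
feat_dim = 128                      # hypothetical stand-in for dims[-2]
N = 4                               # batch size

pooled = torch.randn(N, feat_dim)   # stand-in for the (N, C) pooled encoder output
head = nn.Linear(feat_dim, num_classes)
logits = head(pooled)               # (N, num_classes)

target = torch.randint(0, num_classes, (N,))   # one class index per image
loss = nn.CrossEntropyLoss()(logits, target)

print(logits.shape, target.shape, torch.isfinite(loss).item())
```

With this layout the target is a 1D tensor of length `N` and the model output is a single 2D tensor, so the loss can be computed directly.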

ZC0015wqj commented 1 year ago

> Do you have this layer after getting your `x`?
> `self.head = nn.Linear(dims[-2], num_classes)` followed by `x = self.head(x)`

Yes, this structure exists.

noahzn commented 1 year ago

Have you solved this problem?

noahzn commented 1 year ago

I'm now closing this issue because I haven't heard back from you. Please reopen it or create a new one if you have further questions.