Closed kazuki-can closed 3 years ago
Hi, Thanks for your interest in our work. I might have encountered a similar issue before, but I cannot remember the details :-( However, everything should work as long as you follow the guide in the Environment section. Are you using Python 3.7 with TensorFlow 2.2.1 on a Linux machine? All the best. Xingyang Ni
Thanks for the quick reply. I followed the installation section except for cuDNN, since my computer does not have it. And I am using Python 3.7 with TensorFlow 2.2.1 on a Windows machine.
Could you provide the error log? This issue is probably related to Windows. Xingyang
Traceback (most recent call last):
File "
It runs till "Freeze layers in the backbone model for 20 epochs. ~~ I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll"
But after this, it fails with the error above.
And when I run the command with python3, it just says Python. So I run it with python -u solution.py --dataset_name "Market1501" --backbone_model_name "ResNet50"
This issue is actually the same as https://github.com/nixingyang/AdaptiveL2Regularization/issues/4. Try appending --workers 1 to the command. If it still does not work, use a Linux machine instead.
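Some background on the "Can't pickle local object" error may help here. Python's pickle module cannot serialize a function defined inside another function, and on Windows, multiprocessing starts worker processes by spawning, which requires pickling whatever is handed to them. Whether solution.py hits exactly this path is an assumption, since the thread does not confirm it, but it would explain why limiting the run to one worker avoids the error. A minimal sketch, where make_initializer is a hypothetical stand-in and not a function from the repository:

```python
import math
import pickle


def make_initializer():
    # Hypothetical stand-in for a function defined inside another function.
    def initializer():
        return "initialized"

    return initializer


def try_pickle(obj):
    """Return None if obj can be pickled, otherwise the error message."""
    try:
        pickle.dumps(obj)
        return None
    except (AttributeError, pickle.PicklingError) as error:
        return str(error)


# A nested (local) function cannot be pickled.
local_error = try_pickle(make_initializer())

# An importable module-level function is pickled by reference and works.
importable_error = try_pickle(math.sqrt)

print("local function:", local_error)
print("module-level function:", importable_error)
```

The exact wording of the message differs between Python versions, so the sketch only reports it rather than matching on it.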
Thank you so much. It's running correctly now. I hope it will finish training.
Fortunately, it finished the first 20 epochs. But after that, it started running again for some reason. What will I see once it has been trained? In output_2020_12_22\Market1501_384x128\ResNet50_16_4, I can see training_A, with a pkl file and many png files in it.
This is expected. Wait until the process completes.
2020-12-22 22:28:19.472202: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at strided_slice_op.cc:138 : Resource exhausted: OOM when allocating tensor with shape[64,12,8,2048] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "solution.py", line 1070, in
Function call stack: train_function
2020-12-22 22:28:20.011380: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated. [[{{node PyFunc}}]]
This is the error I am facing after re-running. I'm sorry to take up so much of your time, and thank you so much for all your help.
No problem.
This is an out-of-memory issue. You could use smaller images by specifying the image_width and image_height flags, or use a GPU with more memory.
How should I specify the image_width and image_height flags?
Use something like --image_width 64 --image_height 192.
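To put rough numbers on the out-of-memory error, the sketch below computes the size of the tensor named in the log (shape [64, 12, 8, 2048], type float, which is 4 bytes per element) and the reduction in pixels per image from the suggested flags. The default size of 384 (height) by 128 (width) is an assumption read off the output folder name Market1501_384x128, and tensor_bytes is a helper written for this illustration only.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; float32 uses 4 bytes per element."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Tensor from the error log: shape [64, 12, 8, 2048], type float.
failed_allocation = tensor_bytes([64, 12, 8, 2048])
print(failed_allocation / 2**20, "MiB")  # 48.0 MiB

# Assumed default input size (height 384, width 128) versus the
# suggested --image_height 192 --image_width 64.
default_pixels = 384 * 128
reduced_pixels = 192 * 64
area_ratio = default_pixels / reduced_pixels
print("pixels per image reduced by a factor of", area_ratio)  # 4.0
```

The 48 MiB tensor is only the allocation that happened to fail; the error means the GPU memory as a whole was already full. Since convolutional feature maps scale with the input area, halving both dimensions should cut activation memory to roughly a quarter.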
Thank you very much. I'm so glad that such a considerate person made this model. I downloaded the pre-trained model and tried evaluation. It says 'All done', so I think it finished successfully, but I want to train it myself, so I will try that. And thank you for understanding my poor English.
You are welcome. Feel free to ask if you have any other questions.
How does this model extract the features of each person? From body parts, the whole appearance, or something like height?
Hi, we are using one global branch and two regional branches. Features from each branch are concatenated in the inference procedure. You may find more details in the "B. Baseline" section.
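As a rough illustration of the idea of one global branch plus two regional branches, the sketch below pools a small feature map over the whole image and over two regions, then concatenates the three vectors. It is not the repository's implementation: the split into upper and lower halves, the use of average pooling, and the function names are all assumptions made for the example.

```python
def average_pool(feature_map):
    """Average an H x W x C nested list over all spatial positions."""
    positions = [pixel for row in feature_map for pixel in row]
    return [sum(channel) / len(channel) for channel in zip(*positions)]


def extract_features(feature_map):
    """Concatenate one global vector and two regional vectors."""
    half = len(feature_map) // 2
    global_features = average_pool(feature_map)        # whole appearance
    upper_features = average_pool(feature_map[:half])  # assumed upper region
    lower_features = average_pool(feature_map[half:])  # assumed lower region
    return global_features + upper_features + lower_features


# Toy 4 x 2 feature map with 3 channels: upper rows are 1.0, lower rows 3.0.
toy_map = [[[1.0] * 3] * 2] * 2 + [[[3.0] * 3] * 2] * 2
features = extract_features(toy_map)
print(features)  # [2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
```

The concatenated vector is three times as long as a single branch's output, so it carries both the whole-image summary and the per-region detail.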
Thank you so much for explaining.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Closing as stale. Please reopen if you'd like to work on this further.
Hi, I'm new to this model. It looks super interesting, so I want to see how it works. But I am facing an error saying "AttributeError: Can't pickle local object 'init_resnet..'".
How can I sort it out?