Closed xiaonvxia closed 4 years ago
Hi, I cannot reproduce this issue.
Can you please paste the full log shown on the screen?
Can you please try to use my original code, without any personal changes?
Thank you for your reply! I did run your source code without any changes. The final result of the last run is as follows:

Extract 76 batch videos
Mean AP: 8.2%
CMC Scores:
top-1 20.2%
top-5 32.2%
top-10 38.8%
top-20 46.2%
create dataloader for Test with batch_size 256
100%|██████████| 48/48 [01:06<00:00, 1.39s/it]
Extract 48 batch videos
create dataloader for Test with batch_size 256
100%|██████████| 3/3 [00:04<00:00, 1.40s/it]
Extract 3 batch videos
u_features (12185, 2048) l_features (751, 2048)
Label predictions on all the unlabeled data: 2579 of 12185 is correct, accuracy = 0.212
selected pseudo-labeled data: 2513 of 10966 is correct, accuracy: 0.2292
new train data: 11717
Unselected Data:1219
Could you upload the full log.txt file?
The output was so long that the terminal kept only the end of the log, so I copied everything that was still displayed. log.txt
Hi, my code saves the log in the logs directory. Please find it and upload it here.
Since the framework trains progressively, I need to see the full log history to find the cause.
Are the files in the logs folder? I'll upload them here.
This line in your run file: sys.stdout = Logger(osp.join(args.logsdir, 'log'+ str(args.EF)+ time.strftime(".%m%d_%H:%M:%S") + '.txt')) Is this the line that saves a log of the run's output? It raised an error when I ran the code, so I disabled it. Now there is only a .ckpt file in my logs folder.
Yes, you're correct. This is exactly the line that writes the log information to a file.
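The thread does not say what error that line raised. One possible cause, offered only as an assumption, is that the `%H:%M:%S` timestamp puts colons in the filename, which Windows does not allow. Below is a minimal sketch of a tee-style logger with a colon-free timestamp. `TeeLogger` and `make_log_path` are hypothetical names for illustration, not the repository's `Logger` class.

```python
import os
import os.path as osp
import sys
import time


class TeeLogger:
    """Write everything to the console and to a log file (hypothetical helper)."""

    def __init__(self, path):
        # Create the log directory if it does not exist yet.
        os.makedirs(osp.dirname(path) or ".", exist_ok=True)
        self.console = sys.stdout
        self.file = open(path, "w", encoding="utf-8")

    def write(self, msg):
        self.console.write(msg)
        self.file.write(msg)

    def flush(self):
        self.console.flush()
        self.file.flush()

    def close(self):
        self.file.close()


def make_log_path(logs_dir, ef):
    # Use '-' instead of ':' so the name is also valid on Windows.
    stamp = time.strftime("%m%d_%H-%M-%S")
    return osp.join(logs_dir, "log" + str(ef) + "." + stamp + ".txt")
```

Under these assumptions, the run file would use it as `sys.stdout = TeeLogger(make_log_path(args.logsdir, args.EF))`.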
I cannot find the cause from so little log information. Each trained model builds on the previous one.
I'll run the program again with this line of code restored and reply to you. There is another problem: my code runs on the CPU, even though my computer has a single GPU. Why doesn't the code run on the GPU?
Please check the CUDA environment on your machine.
I noticed this line in your code: model = nn.DataParallel(model).cuda() My computer has only a single GPU. Does that affect GPU use?
model = nn.DataParallel(model).cuda() should work even on a single GPU.
Could you please try the basic PyTorch examples and test whether it runs on GPU?
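A quick check along those lines might look like the sketch below. It is not part of the repository; `describe_device` is a hypothetical helper, and the import is guarded so that a missing PyTorch install is reported instead of raising.

```python
def describe_device():
    """Return a short string saying where PyTorch would run (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return "CUDA not available: PyTorch will run on the CPU"
    # Run a small matrix product on the GPU to confirm that kernels execute;
    # .item() copies the result back and forces the computation to finish.
    x = torch.randn(256, 256, device="cuda")
    (x @ x).sum().item()
    name = torch.cuda.get_device_name(0)
    count = torch.cuda.device_count()
    return "CUDA OK on " + name + " (" + str(count) + " device(s))"


print(describe_device())
```

If this reports that CUDA is not available, the usual suspects are a CPU-only PyTorch build or a driver mismatch, rather than the `nn.DataParallel` wrapper.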
All right, I will give it a try. Thank you for your patient explanation.
When I ran the code today, I found that I had set the batch size to 4. After I changed it to 16, my results matched those in your paper.
Good to hear that you could reproduce our results.
The batch size is very important here: a batch size of 4 cannot give stable stochastic gradient descent training.
Indeed, I didn't expect the batch size to have such an impact on the experimental results. So the bigger the batch size, the better the result?
Maybe not. The batch size is usually kept at 16 or 32 for the person re-id task. Too large a batch size will definitely degrade performance, especially in one-example training (which has only hundreds of samples).
I used the following line to check whether PyTorch can use the GPU: torch.cuda.is_available() It returns True, but my GPU utilization is only 3%. Is that because most of the code runs on the CPU and the GPU handles only a small share of the computation?
In the training stage, GPU utilization should be very high (maybe close to 90%). The test and pseudo-labeling stages involve a lot of distance calculation, so at those stages most of the computation is done on the CPU.
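To illustrate the kind of distance calculation meant here, the following is a rough NumPy sketch that assigns each unlabeled feature the label of its nearest labeled feature. It is not the repository's implementation: the function names are hypothetical, and the use of squared Euclidean distance is an assumption.

```python
import numpy as np


def pairwise_sq_dists(u, l):
    """Squared Euclidean distances between rows of u (n, d) and l (m, d)."""
    u_sq = (u ** 2).sum(axis=1)[:, None]
    l_sq = (l ** 2).sum(axis=1)[None, :]
    d = u_sq + l_sq - 2.0 * (u @ l.T)
    return np.maximum(d, 0.0)  # clip tiny negatives caused by rounding


def nearest_labels(u_features, l_features, l_labels):
    """Return, per unlabeled row, the nearest labeled row's label and distance."""
    d = pairwise_sq_dists(u_features, l_features)
    idx = d.argmin(axis=1)
    return l_labels[idx], d[np.arange(len(idx)), idx]
```

With the shapes printed in the log earlier in this thread, `u_features (12185, 2048)` and `l_features (751, 2048)`, the distance matrix has 12185 × 751 entries (roughly 9.15 million), all computed on the CPU in this sketch.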
Thank you very much for your answer. I'll find out the reason for the low GPU utilization.
I'm sorry to bother you. I have run your code several times, but mAP only reaches 8% and top-1 only reaches 20% at most. Also, the results differ each time I run the code. How can I achieve the results in your paper and keep them the same across runs?
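On the run-to-run variation: the thread does not identify its cause (the low accuracy itself was traced elsewhere in this thread to a batch size of 4). A common first step, sketched here under the assumption that the script draws random numbers from Python, NumPy, and PyTorch, is to fix all seeds at startup. `seed_everything` is a hypothetical helper, not something the repository provides.

```python
import random


def seed_everything(seed=0):
    """Seed the common RNGs; NumPy and PyTorch are seeded only if installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        # Ask cuDNN for deterministic algorithms and disable autotuning.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
```

Calling `seed_everything(0)` before building the data loaders and the model narrows the variation, but some GPU operations remain nondeterministic, so small differences between runs can persist.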