cardwing / Codes-for-Lane-Detection

Learning Lightweight Lane Detection CNNs by Self Attention Distillation (ICCV 2019)
MIT License
1.04k stars 333 forks

About retrain result #23

Closed Endless-Hao closed 5 years ago

Endless-Hao commented 5 years ago

@cardwing hello, it's me again. I have retrained your pre-trained model on the CULane data. During training everything looks normal: accuracy is 70% to 80%, background accuracy is 90% to 98%, and I trained for 90000 iterations. But when I test the model, the results are still poor. Evaluating them gives the output below:

Evaluating the results...
tp: 6783  fp: 96950  fn: 98103
finished process file
precision: 0.065389  recall: 0.0646702  F-measure: 0.0650276

The result in real pictures are like below:

(result images: 00900.results.png, 02430.results.png, 03240.results.png)

Is there anything wrong with the test code? The retraining details look normal, but the test results are not good.
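For reference, the reported precision, recall, and F-measure follow directly from the tp/fp/fn counts in the log above; a quick sanity check in plain Python (not part of the repo's evaluation scripts):

```python
# Recompute the evaluation metrics from the raw counts in the log.
tp, fp, fn = 6783, 96950, 98103

precision = tp / (tp + fp)  # fraction of predicted lane markings that are correct
recall = tp / (tp + fn)     # fraction of ground-truth lane markings that are found
f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"precision: {precision:.6f}")  # ~0.065389
print(f"recall: {recall:.6f}")        # ~0.064670
print(f"F-measure: {f_measure:.6f}")  # ~0.065028
```

The numbers reproduce the log, so the metric arithmetic itself is consistent; the low score comes from the tiny true-positive count relative to the false positives and false negatives.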

Endless-Hao commented 5 years ago

@cardwing During retraining, the loss curve looks completely normal.

Endless-Hao commented 5 years ago

@cardwing Also, different test runs give different results with the same model. I really don't know what's wrong with it; the retraining looks normal.

cardwing commented 5 years ago

@Endless-Hao, the position mismatch problem still exists. Can you upload the full code you use to train and test? I will have a look at it and check which part is wrong.

Endless-Hao commented 5 years ago

@cardwing I have uploaded the full code. Thank you for your kindness. Fullcode.zip

cardwing commented 5 years ago

@Endless-Hao, can you upload your trained model here so that I can test it on my local server? I think the problem may be in the evaluation process. However, according to your description, the training accuracy is only 80%, while on my local server it can reach 90%.

Endless-Hao commented 5 years ago

@cardwing Here is my trained model. By the end, the accuracy also reaches 90% or more; I was just giving the interval earlier. I have uploaded it to Google Drive. https://drive.google.com/open?id=1fLxMfgEpLXQqCl2SENsuB3nub4JW_IVz

Endless-Hao commented 5 years ago

@cardwing I also get accuracy above 90% and background accuracy above 97% by the end. The training process is normal and nearly the same as the training curve you uploaded in another issue.

cardwing commented 5 years ago

@Endless-Hao, it is clear that the problem is in your evaluation process. The following figure was obtained with your uploaded model. It looks fine, and the overall F1-measure is also normal. The log below was likewise produced with your model. Please carefully check your evaluation code (the scripts provided by SCNN). Besides, you should train the model from the provided vgg.npy instead of from the final testing model.

test_img

cardwing commented 5 years ago

The following two log files are obtained via your model.

vgg_SCNN_DULR_w9_iou0.5.txt

vgg_SCNN_DULR_w9_iou0.5_split.txt

Endless-Hao commented 5 years ago

@cardwing hello, I think my evaluation process is fine, because when I run it on your probability maps the evaluation comes out correct. The problem is my own test output; the results are very strange. Could you look at my test code and see whether it has any problems? I have uploaded the probability maps produced by my test. https://drive.google.com/open?id=1n-zfSHwLOAsh9EGDKrLDvZ7dTL1n2aEB

Endless-Hao commented 5 years ago

@cardwing Given that repeated test runs give different results, all I can think of is that some of the parameters are loaded from the model while others are randomly initialized; otherwise this should be impossible.

cardwing commented 5 years ago

@Endless-Hao, your provided probability maps are indeed unsatisfactory. The testing code looks fine, as it is exactly the same as mine. The mismatch problem cannot be caused by random initialization of some model parameters. Which version of TensorFlow are you using?

cardwing commented 5 years ago

It is also weird that the code cannot be debugged in your PyCharm.

Endless-Hao commented 5 years ago

@cardwing My TensorFlow version is 1.10.1.

cardwing commented 5 years ago

@Endless-Hao, my TensorFlow version is 1.3.0. Maybe some functions have changed behavior as TensorFlow was updated. You will need to investigate this yourself, since I have no idea what to try next.

cardwing commented 5 years ago

Another possible cause of the differing outputs is that the BN parameters may not be fixed in the testing phase.
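To illustrate why unfixed BN produces non-deterministic test results: batch normalization must use the stored moving statistics at test time; if it keeps using per-batch statistics, the same input normalizes differently depending on which batch it lands in. A minimal NumPy sketch of the effect (a hypothetical toy `batch_norm` helper, not the repo's code):

```python
import numpy as np

def batch_norm(x, mean=None, var=None, eps=1e-5):
    """Normalize x. With mean/var unset, use batch statistics
    ("training" behaviour); otherwise use the fixed moving
    statistics ("inference" behaviour)."""
    if mean is None:
        mean, var = x.mean(), x.var()
    return (x - mean) / np.sqrt(var + eps)

sample = 1.0  # the same input, appearing in two different test batches

# Training-mode BN: the output for `sample` depends on its batch mates.
a = batch_norm(np.array([sample, 5.0]))[0]
b = batch_norm(np.array([sample, 2.0, 9.0]))[0]

# Inference-mode BN with fixed moving stats: identical output every time.
c = batch_norm(np.array([sample, 5.0]), mean=0.0, var=1.0)[0]
d = batch_norm(np.array([sample, 2.0, 9.0]), mean=0.0, var=1.0)[0]

print(abs(a - b) > 1e-2)   # True  -> output varies across batches
print(abs(c - d) < 1e-9)   # True  -> output is deterministic
```

In TF 1.x this typically means passing `training=False` (`tf.layers.batch_normalization`) or `is_training=False` (slim) when building the test graph, so the moving mean and variance are used instead of batch statistics.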

Endless-Hao commented 5 years ago

@cardwing I switched to another computer, where the TensorFlow version is 1.7, and now I get good results. (result images: 05130, 05250) Maybe it is a problem with some function in TensorFlow. Thank you very much for your help.

cardwing commented 5 years ago

^_^

ktr-hubrt commented 5 years ago

> ^_^

First of all, thanks for your amazing work. When I try to reproduce your training, the training accuracy goes up to 90%, so I want to know: when is the model good enough?

cardwing commented 5 years ago

The training accuracy is fine. Just try the evaluation process and see if you can achieve similar performance.

ktr-hubrt commented 5 years ago

> The training accuracy is fine. Just try the evaluation process and see if you can achieve similar performance.

Thanks so much.

ktr-hubrt commented 5 years ago

> The training accuracy is fine. Just try the evaluation process and see if you can achieve similar performance.

The model I trained gets an F1 score of 0.68, which is 3 points lower than yours. I wonder how you trained your model; did you use the same parameters as in global_config?