wentaozhu / DeepLung

WACV18 paper "DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification"
Apache License 2.0

Training Trend #137

Closed: MHansy closed this issue 3 years ago

MHansy commented 3 years ago

Kindly help me: judging from this log of tpr and tnr, is the training going well? Can I determine the best epoch by looking for high tpr and tnr?

Epoch 001 (lr 0.01000) Train: tpr 13.84, tnr 51.40, total pos 1770, total neg 5056, time 746.86 loss 0.7987, classify loss 0.7111, regress loss 0.0114, 0.0258, 0.0227, 0.0276

Validation: tpr 34.83, tnr 69.01298144, total pos 178, total neg 7307431, time 47.87 loss 0.7513, classify loss 0.6926, regress loss 0.0153, 0.0127, 0.0121, 0.0186

Epoch 002 (lr 0.01000) Train: tpr 5.93, tnr 84.91, total pos 1770, total neg 5056, time 719.31 loss 0.7430, classify loss 0.6901, regress loss 0.0121, 0.0111, 0.0116, 0.0180

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308679, time 9.87 loss 0.7497, classify loss 0.6921, regress loss 0.0138, 0.0141, 0.0128, 0.0168

Epoch 003 (lr 0.01000) Train: tpr 4.86, tnr 86.75, total pos 1770, total neg 5056, time 692.07 loss 0.7461, classify loss 0.6920, regress loss 0.0119, 0.0123, 0.0115, 0.0184

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308263, time 15.65 loss 0.7512, classify loss 0.6932, regress loss 0.0100, 0.0152, 0.0147, 0.0181

Epoch 004 (lr 0.01000) Train: tpr 2.66, tnr 93.61, total pos 1770, total neg 5056, time 698.29 loss 0.7443, classify loss 0.6911, regress loss 0.0114, 0.0113, 0.0121, 0.0184

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7309098, time 13.20 loss 0.7533, classify loss 0.6930, regress loss 0.0130, 0.0151, 0.0129, 0.0194

Epoch 005 (lr 0.01000) Train: tpr 7.34, tnr 79.35, total pos 1770, total neg 5056, time 684.75 loss 0.7462, classify loss 0.6927, regress loss 0.0117, 0.0119, 0.0114, 0.0186

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308221, time 15.91 loss 0.7546, classify loss 0.6930, regress loss 0.0159, 0.0122, 0.0139, 0.0195

Epoch 006 (lr 0.01000) Train: tpr 5.31, tnr 84.12, total pos 1770, total neg 5056, time 697.27 loss 0.7434, classify loss 0.6911, regress loss 0.0117, 0.0110, 0.0111, 0.0185

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7307858, time 13.49 loss 0.7525, classify loss 0.6933, regress loss 0.0137, 0.0135, 0.0135, 0.0185

Epoch 007 (lr 0.01000) Train: tpr 10.51, tnr 70.65, total pos 1770, total neg 5056, time 706.30 loss 0.7474, classify loss 0.6933, regress loss 0.0122, 0.0114, 0.0120, 0.0185

Validation: tpr 78.65, tnr 33.22544921, total pos 178, total neg 7307841, time 16.10 loss 0.7459, classify loss 0.6929, regress loss 0.0130, 0.0105, 0.0114, 0.0181

Epoch 008 (lr 0.01000) Train: tpr 8.08, tnr 77.83, total pos 1770, total neg 5056, time 698.31 loss 0.7458, classify loss 0.6919, regress loss 0.0111, 0.0120, 0.0120, 0.0189

Validation: tpr 34.83, tnr 66.39621470, total pos 178, total neg 7307644, time 13.23 loss 0.7489, classify loss 0.6932, regress loss 0.0135, 0.0120, 0.0127, 0.0176

Epoch 009 (lr 0.01000) Train: tpr 2.88, tnr 91.73, total pos 1770, total neg 5056, time 699.58 loss 0.7447, classify loss 0.6910, regress loss 0.0115, 0.0112, 0.0117, 0.0193

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7307948, time 16.12 loss 0.7519, classify loss 0.6930, regress loss 0.0117, 0.0158, 0.0138, 0.0177

Epoch 010 (lr 0.01000) Train: tpr 6.05, tnr 82.89, total pos 1770, total neg 5056, time 690.41 loss 0.7444, classify loss 0.6909, regress loss 0.0113, 0.0118, 0.0116, 0.0187

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308969, time 13.39 loss 0.7534, classify loss 0.6933, regress loss 0.0128, 0.0136, 0.0152, 0.0184

Epoch 011 (lr 0.01000) Train: tpr 3.05, tnr 90.98, total pos 1770, total neg 5056, time 685.80 loss 0.7437, classify loss 0.6908, regress loss 0.0117, 0.0115, 0.0109, 0.0188

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7307502, time 15.73 loss 0.7509, classify loss 0.6931, regress loss 0.0137, 0.0143, 0.0123, 0.0174

Epoch 012 (lr 0.01000) Train: tpr 2.82, tnr 90.51, total pos 1770, total neg 5056, time 691.66 loss 0.7444, classify loss 0.6908, regress loss 0.0109, 0.0119, 0.0119, 0.0190

Validation: tpr 21.35, tnr 67.03341454, total pos 178, total neg 7307927, time 13.23 loss 0.7555, classify loss 0.6937, regress loss 0.0161, 0.0112, 0.0158, 0.0186

Epoch 013 (lr 0.01000) Train: tpr 2.15, tnr 94.15, total pos 1770, total neg 5056, time 694.93 loss 0.7449, classify loss 0.6900, regress loss 0.0119, 0.0117, 0.0128, 0.0185

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7309196, time 16.12 loss 0.7499, classify loss 0.6935, regress loss 0.0160, 0.0114, 0.0113, 0.0176

Epoch 014 (lr 0.01000) Train: tpr 4.18, tnr 88.84, total pos 1770, total neg 5056, time 689.21 loss 0.7461, classify loss 0.6921, regress loss 0.0122, 0.0113, 0.0114, 0.0190

Validation: tpr 43.82, tnr 66.59337643, total pos 178, total neg 7307021, time 16.90 loss 0.7492, classify loss 0.6929, regress loss 0.0129, 0.0121, 0.0129, 0.0185

Epoch 015 (lr 0.01000) Train: tpr 8.47, tnr 78.42, total pos 1770, total neg 5056, time 697.79 loss 0.7456, classify loss 0.6915, regress loss 0.0121, 0.0116, 0.0122, 0.0183

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308198, time 15.78 loss 0.7498, classify loss 0.6931, regress loss 0.0136, 0.0130, 0.0118, 0.0183

Epoch 016 (lr 0.01000) Train: tpr 4.69, tnr 87.48, total pos 1770, total neg 5056, time 695.25 loss 0.7462, classify loss 0.6920, regress loss 0.0118, 0.0116, 0.0117, 0.0190

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7307644, time 16.61 loss 0.7508, classify loss 0.6932, regress loss 0.0163, 0.0127, 0.0110, 0.0176

Epoch 017 (lr 0.01000) Train: tpr 12.60, tnr 69.13, total pos 1770, total neg 5056, time 688.74 loss 0.7450, classify loss 0.6923, regress loss 0.0117, 0.0112, 0.0113, 0.0185

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7307985, time 16.13 loss 0.7540, classify loss 0.6933, regress loss 0.0136, 0.0122, 0.0141, 0.0208

Epoch 018 (lr 0.01000) Train: tpr 2.71, tnr 92.70, total pos 1770, total neg 5056, time 689.43 loss 0.7435, classify loss 0.6905, regress loss 0.0117, 0.0127, 0.0118, 0.0167

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308279, time 16.82 loss 0.7536, classify loss 0.6932, regress loss 0.0131, 0.0126, 0.0136, 0.0210

Epoch 019 (lr 0.01000) Train: tpr 11.69, tnr 66.08, total pos 1770, total neg 5056, time 680.30 loss 0.7457, classify loss 0.6927, regress loss 0.0116, 0.0118, 0.0111, 0.0186

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308733, time 15.69 loss 0.7529, classify loss 0.6933, regress loss 0.0132, 0.0140, 0.0140, 0.0185

Epoch 020 (lr 0.01000) Train: tpr 10.90, tnr 70.04, total pos 1770, total neg 5056, time 678.82 loss 0.7478, classify loss 0.6932, regress loss 0.0118, 0.0118, 0.0119, 0.0191

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308027, time 16.33 loss 0.7477, classify loss 0.6930, regress loss 0.0120, 0.0145, 0.0099, 0.0183

Epoch 021 (lr 0.01000) Train: tpr 20.79, tnr 49.19, total pos 1770, total neg 5056, time 679.92 loss 0.7473, classify loss 0.6936, regress loss 0.0116, 0.0114, 0.0119, 0.0188

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7307866, time 15.57 loss 0.7519, classify loss 0.6930, regress loss 0.0140, 0.0140, 0.0129, 0.0179

Epoch 022 (lr 0.01000) Train: tpr 1.36, tnr 95.87, total pos 1770, total neg 5056, time 681.69 loss 0.7424, classify loss 0.6895, regress loss 0.0118, 0.0113, 0.0114, 0.0184

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308549, time 16.37 loss 0.7493, classify loss 0.6932, regress loss 0.0119, 0.0122, 0.0133, 0.0187

Epoch 023 (lr 0.01000) Train: tpr 3.45, tnr 91.77, total pos 1770, total neg 5056, time 681.57 loss 0.7447, classify loss 0.6906, regress loss 0.0118, 0.0122, 0.0115, 0.0186

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308711, time 15.82 loss 0.7517, classify loss 0.6929, regress loss 0.0134, 0.0116, 0.0138, 0.0200

Epoch 024 (lr 0.01000) Train: tpr 5.25, tnr 86.83, total pos 1770, total neg 5056, time 685.38 loss 0.7464, classify loss 0.6918, regress loss 0.0119, 0.0117, 0.0120, 0.0191

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308701, time 16.39 loss 0.7527, classify loss 0.6927, regress loss 0.0135, 0.0145, 0.0143, 0.0177

Epoch 025 (lr 0.01000) Train: tpr 2.09, tnr 92.56, total pos 1770, total neg 5056, time 682.79 loss 0.7428, classify loss 0.6911, regress loss 0.0113, 0.0115, 0.0112, 0.0177

Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308169, time 15.81 loss 0.7534, classify loss 0.6932, regress loss 0.0147, 0.0133, 0.0117, 0.0206
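
To compare epochs, it can help to pull the validation tpr and tnr out of the log programmatically. The following is a minimal sketch, assuming the log keeps the line format shown above; the names `parse_validation` and `balanced_accuracy` are hypothetical helpers, and the mean of tpr and tnr is only an illustrative summary, not a criterion used by the repository.

```python
import re

# Metric fields as they appear in each Train/Validation line of the log.
METRICS = re.compile(
    r"tpr (?P<tpr>[\d.]+), tnr (?P<tnr>[\d.]+), "
    r"total pos (?P<pos>\d+), total neg (?P<neg>\d+)"
)
EPOCH = re.compile(r"Epoch (\d+)")


def parse_validation(log_text):
    """Return a list of (epoch, tpr, tnr) tuples, one per Validation line."""
    rows, epoch = [], None
    for line in log_text.splitlines():
        line = line.strip()
        m = EPOCH.search(line)
        if m:
            epoch = int(m.group(1))  # remember the epoch of the preceding Train line
        if line.startswith("Validation:"):
            m = METRICS.search(line)
            if m:
                rows.append((epoch, float(m.group("tpr")), float(m.group("tnr"))))
    return rows


def balanced_accuracy(tpr, tnr):
    """Mean of tpr and tnr (both in percent); an illustrative summary only."""
    return (tpr + tnr) / 2.0


# Three epochs from the log above, abridged to the fields the parser reads.
SAMPLE = """\
Epoch 001 (lr 0.01000) Train: tpr 13.84, tnr 51.40, total pos 1770, total neg 5056
Validation: tpr 34.83, tnr 69.01298144, total pos 178, total neg 7307431
Epoch 002 (lr 0.01000) Train: tpr 5.93, tnr 84.91, total pos 1770, total neg 5056
Validation: tpr 0.00, tnr 100.00000000, total pos 178, total neg 7308679
Epoch 007 (lr 0.01000) Train: tpr 10.51, tnr 70.65, total pos 1770, total neg 5056
Validation: tpr 78.65, tnr 33.22544921, total pos 178, total neg 7307841
"""

rows = parse_validation(SAMPLE)
best = max(rows, key=lambda r: balanced_accuracy(r[1], r[2]))
print("epoch with highest mean of tpr and tnr:", best[0])
```

A validation row with tpr 0.00 and tnr 100.00 means no positive was detected at the threshold used, so a high tnr alone does not indicate a good epoch.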

wentaozhu commented 3 years ago

Please continue training, following the instructions. Finally, you can use FROC to evaluate the model from each epoch and select the best one. Thank you!
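
For readers unfamiliar with FROC, the idea can be sketched as follows: rank all detection candidates by score, lower the threshold step by step, and record sensitivity against the average number of false positives per scan. This is a simplified sketch, not the repository's evaluation code. It assumes each candidate is already labelled as hitting a nodule or not, that every hit corresponds to a distinct nodule, and that scores are not tied. The function names and the default operating points in `mean_sensitivity` are assumptions chosen for illustration.

```python
def froc_points(scores, is_nodule, n_scans, n_nodules):
    """Return (false positives per scan, sensitivity) pairs, one per candidate.

    scores    : detection confidence of each candidate
    is_nodule : True if the candidate matches an annotated nodule
    n_scans   : number of scans in the evaluation set
    n_nodules : number of annotated nodules in the evaluation set
    """
    ranked = sorted(zip(scores, is_nodule), key=lambda p: p[0], reverse=True)
    tp = fp = 0
    points = []
    for _, hit in ranked:  # lowering the threshold admits one more candidate
        if hit:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_scans, tp / n_nodules))
    return points


def sensitivity_at(points, target_fp):
    """Best sensitivity reachable with at most target_fp false positives per scan."""
    reachable = [sens for fps, sens in points if fps <= target_fp]
    return max(reachable, default=0.0)


def mean_sensitivity(points, targets=(0.125, 0.25, 0.5, 1, 2, 4, 8)):
    """Average sensitivity over several operating points (assumed defaults)."""
    return sum(sensitivity_at(points, t) for t in targets) / len(targets)


# Toy example: 5 candidates over 2 scans that contain 2 nodules in total.
points = froc_points(
    scores=[0.9, 0.8, 0.7, 0.6, 0.5],
    is_nodule=[True, False, True, False, False],
    n_scans=2,
    n_nodules=2,
)
print(sensitivity_at(points, 0.25))  # sensitivity at 0.25 FPs per scan
print(mean_sensitivity(points))
```

Computing such a summary for the checkpoint saved at each epoch gives a single number per epoch to compare, which is one way to select the best model.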

MHansy commented 3 years ago

Hello WentaoZhu

Kindly assist me! When I visualize the results predicted by your model (the one in the detector folder), it responds very quickly and shows images with bounding boxes. But when I try the same with my custom-trained model, it takes a long time and gives no feedback.

wentaozhu commented 3 years ago

Yes. DeepLungDetectionDemo is the demo that shows the detection results, and the models in the detector folder are the trained models. You may want to debug first and check whether you get any results for one image with positive nodules. If there are no results at all, the problem might be the model.

MHansy commented 3 years ago

I am training the model on a Google Colab GPU with a training batch size of 4. The model can make predictions, but visualizing the predicted results still takes a very long time, sometimes up to 6 hours, until I disconnect the runtime.

MHansy commented 3 years ago

1. This is the training trend of the classification model. Am I on the right track?
2. How do I test the classification model? Is it possible to make predictions, get the outputs as probabilities for classes 0 and 1, and then pass them through a ROC curve?

image
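
On the ROC question, a minimal sketch of the idea: keep the predicted probability of class 1 for each test sample (not the thresholded 0/1 label), sweep the threshold from high to low, and record the false positive rate against the true positive rate. The helper names `roc_points` and `auc` are hypothetical, the labels and probabilities are made-up toy values, and tied probabilities are not handled specially here.

```python
def roc_points(labels, probs):
    """Return (false positive rate, true positive rate) pairs for a binary classifier.

    labels : ground-truth class of each sample, 0 or 1
    probs  : predicted probability of class 1 for each sample
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(probs, labels), key=lambda p: p[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for _, y in ranked:    # lowering the threshold admits one more sample
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example with three positive and three negative samples.
labels = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
curve = roc_points(labels, probs)
print(auc(curve))
```

If scikit-learn is available, `sklearn.metrics.roc_curve` and `sklearn.metrics.roc_auc_score` compute the same quantities and also handle ties.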

wentaozhu commented 3 years ago

I am sorry, but I do not have the log now. You can continue training and check the results when it finishes. I think a batch size of 4 is small; you may want to use the original settings.