sstary / SSRS

Apache License 2.0

FTransUNet: about loss #15

Closed LPZliu closed 7 months ago

LPZliu commented 7 months ago

Your code is so poorly written and full of errors that I can't understand how you do experiments

sstary commented 7 months ago

Our framework follows vFuseNet. We opened the code for reference and communication in the remote sensing community. We are very sorry we couldn't help you.

LPZliu commented 7 months ago

I experimented with your code and found that the loss does not converge.

sstary commented 7 months ago

What dataset and method did you use? Did you modify any parameters?

LPZliu commented 7 months ago

FTransUNet

sstary commented 7 months ago

What dataset did you use? Did you modify any parameters? You can also post your log so I can get more details.

LPZliu commented 7 months ago

I didn't change anything.

sstary commented 7 months ago

Could you post your log, please?

LPZliu commented 7 months ago

Train (epoch 2/100) [500/1000 (50%)] Loss: 0.058644 Accuracy: 12.17041015625
Train (epoch 2/100) [600/1000 (60%)] Loss: 0.004680 Accuracy: 13.7115478515625
Train (epoch 2/100) [700/1000 (70%)] Loss: -0.006557 Accuracy: 10.174560546875
Train (epoch 2/100) [800/1000 (80%)] Loss: 0.026279 Accuracy: 26.76544189453125
Train (epoch 2/100) [900/1000 (90%)] Loss: 0.004766 Accuracy: 20.7855224609375

sstary commented 7 months ago

Full log?

LPZliu commented 7 months ago

Can you tell me first, is this normal?

sstary commented 7 months ago

This loss is not normal: the loss should never be negative. There may be a problem with how you call the loss function. Here is part of our log:

Train (epoch 1/50) [0/1000 (0%)] Loss: 1.895723 Accuracy: 9.1033935546875
Train (epoch 1/50) [100/1000 (10%)] Loss: 0.673334 Accuracy: 67.755126953125
Train (epoch 1/50) [200/1000 (20%)] Loss: 0.677673 Accuracy: 81.34307861328125
Train (epoch 1/50) [300/1000 (30%)] Loss: 0.391595 Accuracy: 98.43597412109375
Train (epoch 1/50) [400/1000 (40%)] Loss: 0.517425 Accuracy: 82.8216552734375
Train (epoch 1/50) [500/1000 (50%)] Loss: 0.452046 Accuracy: 80.8502197265625
Train (epoch 1/50) [600/1000 (60%)] Loss: 0.422012 Accuracy: 87.46185302734375
Train (epoch 1/50) [700/1000 (70%)] Loss: 0.372922 Accuracy: 99.94049072265625
Train (epoch 1/50) [800/1000 (80%)] Loss: 0.381555 Accuracy: 94.7784423828125
Train (epoch 1/50) [900/1000 (90%)] Loss: 0.316533 Accuracy: 99.99847412109375
...
Confusion matrix :
[[4512260  146186   87941   40494   17484       0]
 [  93421 5283845   37229    8452       0       0]
 [ 302859  104371 2273844  221561      75       0]
 [  78559   13792  536412 3763997       0       0]
 [  81411    3832       5     437   62992       0]
 [   4096     835    1285       0       0       0]]
17677675 pixels processed
Total accuracy : 89.93
roads: 0.9392 buildings: 0.9743 low veg.: 0.7834 trees: 0.8569 cars: 0.4237 clutter: 0.0000

F1Score :
roads: 0.9137 buildings: 0.9628 low veg.: 0.7788 trees: 0.8932 cars: 0.5496 clutter: 0.0000
mean F1Score: 0.8196

Kappa: 0.8642
[0.84110029 0.92829925 0.63772029 0.80708317 0.37893116 0. ]
mean MIoU: 0.7186

Train (epoch 2/50) [0/1000 (0%)] Loss: 0.459356 Accuracy: 91.77703857421875
Train (epoch 2/50) [100/1000 (10%)] Loss: 0.336380 Accuracy: 95.5841064453125
Train (epoch 2/50) [200/1000 (20%)] Loss: 0.366683 Accuracy: 84.0423583984375
Train (epoch 2/50) [300/1000 (30%)] Loss: 0.315925 Accuracy: 70.83587646484375
Train (epoch 2/50) [400/1000 (40%)] Loss: 0.424859 Accuracy: 87.4542236328125
Train (epoch 2/50) [500/1000 (50%)] Loss: 0.428774 Accuracy: 92.19512939453125
Train (epoch 2/50) [600/1000 (60%)] Loss: 0.272982 Accuracy: 87.36114501953125
Train (epoch 2/50) [700/1000 (70%)] Loss: 0.313880 Accuracy: 83.5968017578125
Train (epoch 2/50) [800/1000 (80%)] Loss: 0.346948 Accuracy: 85.5926513671875
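
As a side note on why a cross-entropy loss cannot be negative, here is a minimal sketch in plain Python. It is not the repository's code; the function names and the example logits are made up for illustration. It shows that softmax cross-entropy is bounded below by zero, that six equally likely classes give ln 6 (about 1.79, close to the first loss values in the logs in this thread), and that one possible way to get a negative value is to skip the log-softmax step. In PyTorch, for instance, `torch.nn.functional.nll_loss` expects log-probabilities, while `torch.nn.functional.cross_entropy` expects raw logits. Whether that was the actual cause here is not stated in the thread.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one pixel: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    # log_sum_exp >= max(logits) >= logits[target], so the result is >= 0
    return log_sum_exp - logits[target]


def nll_on_raw_logits(logits, target):
    """What a negative-log-likelihood loss returns if log-softmax is skipped."""
    return -logits[target]


num_classes = 6  # roads, buildings, low veg., trees, cars, clutter

# An untrained model that predicts all classes equally: loss = ln(6)
chance_loss = cross_entropy([0.0] * num_classes, 0)

# Hypothetical logits for a confident, correct prediction of class 0
logits = [4.0, -1.0, 0.5, 0.0, -2.0, 1.0]
ce = cross_entropy(logits, 0)        # small, but never below zero
bad = nll_on_raw_logits(logits, 0)   # negative: log-softmax was skipped

print(f"chance-level loss: {chance_loss:.4f}")
print(f"cross-entropy:     {ce:.4f}")
print(f"NLL on raw logits: {bad:.4f}")
```

A loss that starts near ln 6 and stays there suggests the model is not learning, while a negative value suggests the loss is not a proper cross-entropy at all.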

LPZliu commented 7 months ago

I cannot solve this problem. Could you help me?

LPZliu commented 7 months ago

Train (epoch 1/100) [0/1000 (0%)] Loss: 1.876903 Accuracy: 15.3594970703125
Train (epoch 1/100) [100/1000 (10%)] Loss: 1.853394 Accuracy: 9.1552734375
Train (epoch 1/100) [200/1000 (20%)] Loss: 1.839945 Accuracy: 25.08392333984375
Train (epoch 1/100) [300/1000 (30%)] Loss: 1.867182 Accuracy: 7.23419189453125

sstary commented 7 months ago

Sure. Can you zip your files and send them to my email? I will check your code.

LPZliu commented 7 months ago

I'm sorry. There's something new in this code that I can't give you. I've replaced the loss function, and I'm not getting negative values, but it's still not converging.

sstary commented 7 months ago

One option is to remove your new code and send me the zipped files. Another is to remove the new code and try to repeat our experiments with the Vaihingen dataset. If your inputs are right and you use the CE loss function, we think you should get a log similar to ours.

LPZliu commented 7 months ago

I didn't modify anything; I only added some small changes that should not prevent convergence. I even saw the loss become NaN.

sstary commented 7 months ago

NaN and negative loss values are usually caused by an incorrect loss function call: for example, the number of prediction categories you define, or the order of the predictions and the real labels.

Can you print part of the model output and the real labels, as well as the code for the loss function call?
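
One way to check the "number of prediction categories" point is to confirm that every label value is a valid class index. The helper below is a hypothetical sketch in plain Python, not part of the repository. `check_labels` and its `ignore_index` parameter are names chosen for illustration. With PyTorch tensors, the label values could be obtained with something like `target.unique().tolist()`. As for argument order, `torch.nn.functional.cross_entropy(input, target)` takes the predictions first and the labels second.

```python
def check_labels(label_values, num_classes, ignore_index=None):
    """Return the label values that are not valid class indices.

    A valid label is an integer in [0, num_classes). A value equal to
    ignore_index (if given) is also accepted.
    """
    invalid = []
    for value in sorted(set(label_values)):
        if ignore_index is not None and value == ignore_index:
            continue
        if not (isinstance(value, int) and 0 <= value < num_classes):
            invalid.append(value)
    return invalid


# Six classes, as in the logs in this thread: valid labels are 0..5
print(check_labels([0, 1, 2, 5, 5, 3], num_classes=6))       # []
print(check_labels([0, 3, 6, 255], num_classes=6))           # [6, 255]
print(check_labels([0, 3, 6, 255], 6, ignore_index=255))     # [6]
```

An empty result means the labels are at least in range. It does not prove that the labels are mapped to the right classes.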

sstary commented 7 months ago

You can replace our method with a classical ResNet18; I think the loss would still be NaN or negative. That is a way to check whether the model is at fault.

LPZliu commented 7 months ago

Can you tell me what your final output dimension is? Mine is B\*6\*256\*256.

sstary commented 7 months ago

Yes, mine is also B\*C\*256\*256.

LPZliu commented 7 months ago

But I checked, and the dimension of the target is B\*256\*256.

sstary commented 7 months ago

Yes, the target is B\*256\*256.
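
The shape agreement described above (an output with a class dimension and a target without one) can be written as a small check. This is only a sketch under the assumption of a channels-first layout. `shapes_match` is a hypothetical helper, and with PyTorch tensors the arguments could be `tuple(output.shape)` and `tuple(target.shape)`.

```python
def shapes_match(output_shape, target_shape, num_classes):
    """True if output is (B, C, H, W) with C == num_classes
    and target is (B, H, W) with the same B, H and W."""
    if len(output_shape) != 4 or len(target_shape) != 3:
        return False
    batch, channels, height, width = output_shape
    return channels == num_classes and tuple(target_shape) == (batch, height, width)


# Shapes discussed in this thread, with a batch size of 10 picked arbitrarily
print(shapes_match((10, 6, 256, 256), (10, 256, 256), num_classes=6))     # True
# Channels-last output: the class dimension is in the wrong place
print(shapes_match((10, 256, 256, 6), (10, 256, 256), num_classes=6))     # False
# Target with an extra singleton channel dimension
print(shapes_match((10, 6, 256, 256), (10, 1, 256, 256), num_classes=6))  # False
```

Matching shapes rule out one class of mistakes only; the label values and the loss call itself still need to be checked separately.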

LPZliu commented 7 months ago

Then there's no problem. There's nothing wrong with the output of my code. It's not the model.

sstary commented 7 months ago

OK, maybe you can check the dataset.

LPZliu commented 7 months ago

Can you send me your dataset?

sstary commented 7 months ago

Please provide your email. I can send you some of the Vaihingen images to test your environment.

On the other hand, I think there are still some problems in your loss call.

LPZliu commented 7 months ago

Maybe it's not the dataset that's the problem, it's the gradient that's exploding.

LPZliu commented 7 months ago

Train (epoch 1/100) [0/1000 (0%)] Loss: 1.902527 Accuracy: 28.118896484375
Train (epoch 1/100) [100/1000 (10%)] Loss: nan Accuracy: 3.22265625
Train (epoch 1/100) [200/1000 (20%)] Loss: nan Accuracy: 10.308837890625
Train (epoch 1/100) [300/1000 (30%)] Loss: nan Accuracy: 30.03082275390625
Train (epoch 1/100) [400/1000 (40%)] Loss: nan Accuracy: 40.362548828125
0%| | 0/4 [00:00<?, ?it/s]
0%| | 0/590 [00:00<?, ?it/s]
0%| | 0/600 [00:00<?, ?it/s]
0%| | 0/607 [00:00<?, ?it/s]
0%| | 0/617 [00:00<?, ?it/s]
Confusion matrix :
[[4804365       0       0       0       0       0]
 [5422947       0       0       0       0       0]
 [2902710       0       0       0       0       0]
 [4392760       0       0       0       0       0]
 [ 148677       0       0       0       0       0]
 [   6216       0       0       0       0       0]]
17677675 pixels processed
Total accuracy : 27.18
roads: 1.0000 buildings: 0.0000 low veg.: 0.0000 trees: 0.0000 cars: 0.0000 clutter: 0.0000

F1Score :
roads: 0.4274 buildings: 0.0000 low veg.: 0.0000 trees: 0.0000 cars: 0.0000 clutter: 0.0000
mean F1Score: 0.0855

Kappa: 0.0000
[0.27177584 0. 0. 0. 0. 0. ]
mean MIoU: 0.0544

Train (epoch 1/100) [500/1000 (50%)] Loss: nan Accuracy: 67.6605224609375
Train (epoch 1/100) [600/1000 (60%)] Loss: nan Accuracy: 1.66168212890625

LPZliu commented 7 months ago

Can you send me your dataset? kevin_ailover@163.com Thank you

sstary commented 7 months ago

Yes, the gradient is exploding, and I think it is related to the loss function. I'll send you some data; you can remove your new code and try it to see if you still have this problem.
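
For readers who hit the same symptom, a common mitigation for exploding gradients is to clip the gradients by their global norm and to skip any step whose norm is not finite. The sketch below shows the idea in plain Python on nested lists. `clip_by_global_norm` is a hypothetical name, and this is not what the repository does. In PyTorch the built-in equivalent is `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)`, called between `loss.backward()` and `optimizer.step()`. The thread does not say whether clipping was what finally resolved this issue, and clipping will not repair an incorrect loss call.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so that their joint L2 norm is at most max_norm.

    grads: list of lists of floats, one inner list per parameter tensor.
    Returns (clipped_grads, original_norm).
    Raises ValueError if the norm is NaN or infinite.
    """
    total_norm = math.sqrt(sum(g * g for param in grads for g in param))
    if not math.isfinite(total_norm):
        raise ValueError("non-finite gradient norm; skip this update step")
    # Scale factor is 1.0 when the norm is already within the limit
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [[g * scale for g in param] for param in grads], total_norm


# Two "parameters" whose joint norm is sqrt(3^2 + 4^2 + 12^2) = 13
grads = [[3.0, 4.0], [12.0]]
clipped, norm = clip_by_global_norm(grads, max_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for param in clipped for g in param))

print(f"norm before clipping: {norm:.4f}")
print(f"norm after clipping:  {clipped_norm:.4f}")
```

Logging the gradient norm every few iterations also makes it easier to see at which step the values start to blow up.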

LPZliu commented 7 months ago

I chose to reproduce your code without any problems.

sstary commented 7 months ago

So do you need the Vaihingen dataset?

LPZliu commented 7 months ago

Through my tireless efforts I've finally solved the problem. Can you share the code for the other comparison methods in your paper?

sstary commented 7 months ago

I'm glad you solved your problem. The rest of the comparison methods are mostly open source online.

LPZliu commented 7 months ago

I didn't change anything, but my results are not as good as those in your paper. How can I use your data in my own article?

sstary commented 7 months ago

Did you load the pre-trained weights?

LPZliu commented 7 months ago

I don't think that's fair, using other methods doesn't have pre-trained models for large languages

sstary commented 7 months ago

All methods in our experiments are used with pre-trained weights.

LPZliu commented 7 months ago

When I cite your paper, can I report the results from my own runs instead of quoting the numbers from your article? None of my models use pre-training.

sstary commented 7 months ago

Of course. This work mainly presents an approach for introducing the VSS module into the remote sensing field, and it has imperfections. We look forward to your research in your environment.