Reproducing paper results on fine tuning

robmarkcole commented 11 months ago

Prithvi-100M-burn-scar reports the results:

IoU of 0.73 on the burn scar class and 0.96 overall accuracy

However, when I run the config here, the burn scar IoU is lower (70.26 rather than the reported 73), although the overall accuracy matches:

+--------------+-------+-------+
|    Class     |  IoU  |  Acc  |
+--------------+-------+-------+
| Unburnt land | 96.31 | 97.62 |
|  Burn scar   | 70.26 | 86.64 |
+--------------+-------+-------+

+------+-------+-------+
| aAcc |  mIoU |  mAcc |
+------+-------+-------+
| 96.6 | 83.28 | 92.13 |
+------+-------+-------+

Training finished successfully.
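To make the comparison concrete, here is a small sketch that recomputes the summary row from the per-class rows and measures the gap to the reported burn scar IoU. It assumes mIoU and mAcc are unweighted means over the two classes, which is consistent with the numbers in the tables above; the variable and function names are only illustrative.

```python
# Per-class values copied from the evaluation table above (in percent).
per_class = {
    "Unburnt land": {"IoU": 96.31, "Acc": 97.62},
    "Burn scar": {"IoU": 70.26, "Acc": 86.64},
}


def mean_metric(results, key):
    """Unweighted mean of one metric over all classes (assumed definition)."""
    values = [metrics[key] for metrics in results.values()]
    return sum(values) / len(values)


miou = mean_metric(per_class, "IoU")
macc = mean_metric(per_class, "Acc")

# Reported burn scar IoU (0.73), expressed in percent for comparison.
reported_burn_scar_iou = 73.0
gap = reported_burn_scar_iou - per_class["Burn scar"]["IoU"]

print(f"mIoU: {miou:.3f}, mAcc: {macc:.3f}")
print(f"Burn scar IoU gap to reported value: {gap:.2f} points")
```

The gap is under three IoU points on a single class, so it may be within run-to-run variation, but that would need several runs to confirm.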

Can this be attributed to these runs being non-deterministic, or was another config used for the official results? Alternatively, could this be due to a difference in the codebase?
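One way to test the non-determinism hypothesis is to fix the random seeds and repeat the run a few times. The helper below is a minimal sketch, not part of this repository: `set_seed` is a hypothetical name, and it only seeds NumPy and PyTorch if they are installed. Even with fixed seeds, some GPU operations can remain non-deterministic, so small differences may persist.

```python
import random


def set_seed(seed: int) -> None:
    """Seed the common random number generators (hypothetical helper)."""
    random.seed(seed)

    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.

    try:
        import torch

        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # Safe to call without a GPU.
        # Trade some speed for more repeatable cuDNN kernels.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


set_seed(42)
```

If repeated runs with different seeds spread over a range that includes the reported value, the discrepancy is probably noise rather than a config or code difference.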

Thanks

robmarkcole commented 10 months ago

Similarly, for crop classification with the supplied config, I get slightly lower metrics than those reported, although I did have to reduce the batch size.

2024-01-02 16:56:40,546 - mmseg - INFO - Saving checkpoint at 80 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 772/771, 27.4 task/s, elapsed: 28s, ETA:     0s

2024-01-02 16:57:16,366 - mmseg - INFO - per class results:
2024-01-02 16:57:16,369 - mmseg - INFO - 
+----------------------+-------+-------+
|        Class         |  IoU  |  Acc  |
+----------------------+-------+-------+
|  Natural Vegetation  |  39.3 | 45.26 |
|        Forest        | 47.29 | 65.95 |
|         Corn         | 54.28 | 65.21 |
|       Soybeans       | 52.13 | 67.92 |
|       Wetlands       | 40.53 | 58.75 |
|   Developed/Barren   | 35.15 | 55.91 |
|      Open Water      | 67.75 | 90.23 |
|     Winter Wheat     | 48.56 | 67.34 |
|       Alfalfa        | 30.09 | 64.33 |
| Fallow/Idle Cropland | 33.12 | 60.09 |
|        Cotton        | 32.08 | 62.63 |
|       Sorghum        | 33.03 | 72.34 |
|        Other         | 33.87 | 47.45 |
+----------------------+-------+-------+
2024-01-02 16:57:16,369 - mmseg - INFO - Summary:
2024-01-02 16:57:16,369 - mmseg - INFO - 
+-------+-------+-------+
|  aAcc |  mIoU |  mAcc |
+-------+-------+-------+
| 60.07 | 42.09 | 63.34 |
+-------+-------+-------+

Vs

HamedAlemo commented 10 months ago

Hi @robmarkcole, we have observed a similar pattern in the crop classification case: reducing the batch size gives slightly lower performance. cc @hanxLi
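If GPU memory forces a smaller batch, two common heuristics for approximating the original setup are gradient accumulation and linear learning-rate scaling. The sketch below only does the arithmetic; the batch sizes and learning rate are hypothetical placeholders rather than values from the supplied config, and neither heuristic has been confirmed here to close the gap.

```python
import math


def accumulation_steps(reference_batch: int, actual_batch: int) -> int:
    """Number of small batches to accumulate to reach at least the reference size."""
    if reference_batch <= 0 or actual_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return math.ceil(reference_batch / actual_batch)


def linearly_scaled_lr(reference_lr: float, reference_batch: int, actual_batch: int) -> float:
    """Scale the learning rate in proportion to the batch size (linear scaling heuristic)."""
    if reference_batch <= 0 or actual_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return reference_lr * actual_batch / reference_batch


# Hypothetical example values, not taken from the repository's config.
reference_batch, actual_batch, reference_lr = 16, 4, 1e-4

print("accumulate over", accumulation_steps(reference_batch, actual_batch), "iterations")
print("or use learning rate", linearly_scaled_lr(reference_lr, reference_batch, actual_batch))
```

In an mmcv 1.x style config, accumulation can usually be requested with `optimizer_config = dict(type="GradientCumulativeOptimizerHook", cumulative_iters=...)`, assuming the installed mmcv version provides that hook.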

NASA-IMPACT / hls-foundation-os

Reproducing paper results on fine tuning #45