NASA-IMPACT / hls-foundation-os

This repository contains examples of fine-tuning the Harmonized Landsat and Sentinel-2 (HLS) Prithvi foundation model.
Apache License 2.0

Blocky segmentation masks #55

Open aleksmirosh opened 9 months ago

aleksmirosh commented 9 months ago

Hello, thank you for your incredible work. This model shows great IoU on my data, much better than anything I tested before. But even after fine-tuning, the masks look blocky, as if they were made of small squares, like here:

mask

Do you have any suggestions for how I can deal with this? I fine-tuned with the sen1floods11 config and checkpoint.

aleksmirosh commented 9 months ago

any suggestions?

HamedAlemo commented 9 months ago

@aleksmirosh we had another user try this and see a similar pattern in their predictions. The issue for them turned out to be that their input data was in the range [0, 1], but the values need to be multiplied by 10000 so that they are integers. cc @paolofraccaro
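A minimal sketch of the rescaling described above, assuming the input bands are held in a NumPy array. The helper name `to_hls_scale` and the "maximum is at most 1" heuristic for spotting [0, 1] data are assumptions made for illustration; they are not part of this repository.

```python
import numpy as np

HLS_SCALE = 10000  # factor mentioned above for data stored in [0, 1]


def to_hls_scale(bands: np.ndarray) -> np.ndarray:
    """Rescale [0, 1] reflectance to integer values in 0-10000.

    Hypothetical helper: an array whose maximum exceeds 1 is assumed to be
    scaled already and is returned unchanged.
    """
    if np.nanmax(bands) <= 1.0:
        return np.rint(bands * HLS_SCALE).astype(np.int32)
    return bands


reflectance = np.array([[0.0, 0.1234], [0.5, 1.0]], dtype=np.float32)
scaled = to_hls_scale(reflectance)  # integer array on the 0-10000 scale
```

Checking `bands.min()` and `bands.max()` before inference is a cheap way to confirm which scale the data is on.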

aleksmirosh commented 9 months ago

Thank you @HamedAlemo for your response, I really appreciate it. I tried the model on satellite data that is not Sentinel data, and it is already between 0 and 60000. I found that each block is 16x16 pixels. Is this a problem with the encoder/decoder? Should I fine-tune it? I really appreciate your help with this issue.

HamedAlemo commented 9 months ago

I think @thesujitroy or @paolofraccaro should be able to help with this.

CarlosGomes98 commented 8 months ago

Hi @aleksmirosh. This is somewhat confusing to me, as the predictions should be either 0 or 1 for each pixel, so completely dark or completely white. So I suspect at least some of this is due to distortion introduced by compression in your image (are you saving it as JPEG, perhaps?).

It's possible the output is also being affected by the 16x16 patch size, though, as you observed. You could try using a smaller patch size on your data, although it may take some more training epochs.
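One way to check whether the blocks really follow the 16x16 patch grid is to compare how much the prediction jumps across patch boundaries with how much it changes inside patches. This is only a diagnostic sketch: the function name `patch_boundary_ratio` and the synthetic test arrays are assumptions, and it looks at horizontal neighbours only.

```python
import numpy as np


def patch_boundary_ratio(prob: np.ndarray, patch: int = 16) -> float:
    """Mean horizontal jump at patch boundaries divided by the mean jump elsewhere.

    A value near 1 means no grid-aligned structure; a large value means the
    prediction changes mostly at multiples of `patch`.
    """
    diffs = np.abs(np.diff(prob, axis=1))  # diffs[:, j] is the jump between columns j and j + 1
    cols = np.arange(diffs.shape[1])
    on_boundary = (cols + 1) % patch == 0
    boundary = diffs[:, on_boundary].mean()
    interior = diffs[:, ~on_boundary].mean()
    return float(boundary / (interior + 1e-12))


rng = np.random.default_rng(0)
# Synthetic blocky map: one random value per 16x16 patch.
blocky = np.repeat(np.repeat(rng.random((4, 4)), 16, axis=0), 16, axis=1)
# Synthetic smooth map: a left-to-right gradient.
smooth = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))

blocky_ratio = patch_boundary_ratio(blocky)
smooth_ratio = patch_boundary_ratio(smooth)
```

Run on a real probability map, a ratio far above 1 would suggest the artifact follows the patch grid rather than image compression.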

aleksmirosh commented 8 months ago

Hi @CarlosGomes98, thank you for the response. I really appreciate your time. My predictions are probabilities. I tested the original fine-tuning script on the provided Sentinel flood data, and the predictions were probabilities between 0 and 1 too. I have attached the mask I got after the original fine-tuning process on the flood data.

mask_flood
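If the output is a per-pixel probability map rather than a 0/1 mask, it can be converted to a binary mask by thresholding. The sketch below assumes a single foreground-probability array and a threshold of 0.5; both the helper name `to_binary_mask` and the threshold value are illustrative assumptions.

```python
import numpy as np


def to_binary_mask(prob, threshold: float = 0.5) -> np.ndarray:
    """Return 1 where the probability reaches the threshold, else 0."""
    return (np.asarray(prob) >= threshold).astype(np.uint8)


prob = np.array([[0.05, 0.49], [0.50, 0.97]])
mask = to_binary_mask(prob)  # values are exactly 0 or 1
```

Thresholding makes every pixel completely dark or completely white, but blocks that are already present in the probabilities will still show up in the binary mask.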

For my custom fine-tuning, the input is TIFF and I save the result as TIFF. I tried different constants and normalizations, but the results are the same.

I will try experimenting with the patch size. Does it make sense to try a bigger patch size? In that case, I would have to use batch size = 1 because of the memory limit.
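For a rough sense of how patch size interacts with memory, the sketch below counts tokens for a generic vision transformer, where a square image of side `image_size` split into `patch_size` patches yields `(image_size / patch_size)^2` tokens and self-attention compares every token pair. The 224-pixel input and the function name `num_tokens` are assumptions for illustration, and the count ignores every other contribution to memory use.

```python
def num_tokens(image_size: int, patch_size: int, frames: int = 1) -> int:
    """Number of patch tokens for a square image in a generic ViT."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return frames * per_side * per_side


for patch in (8, 16, 32):
    tokens = num_tokens(224, patch)
    # Self-attention builds a tokens x tokens matrix per head.
    print(f"patch={patch:2d} tokens={tokens:4d} attention_pairs={tokens * tokens}")
```

Under these assumptions a smaller patch gives finer output blocks but many more tokens, while a larger patch gives fewer tokens and coarser blocks.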