ubarsc / python-fmask

A set of command line utilities and Python modules that implement the ‘fmask’ algorithm
https://www.pythonfmask.org
GNU General Public License v3.0

100% clouds in mostly clear S2 images under "dark" illumination conditions #11

Closed gillins closed 7 years ago

gillins commented 7 years ago

Original report by Anonymous.


Dear developers,

I greatly appreciate your work and hope that you won't disapprove of this way of communicating with you.

I was very happy to find python-fmask and would like to use it operationally in a Landsat & Sentinel-2 processing chain. For that purpose I processed some S2A scenes in Northern & Southern Europe as well as India to get a feeling for the results. I also compared the python-fmask (0.4.3) results to Sen2Cor on S2A imagery. I was very satisfied with the python-fmask results over various summer images. In most cases it seemed more robust than the Sen2Cor cloud mask.

But I discovered that python-fmask seems to classify all winter scenes in Northern Europe as 100% cloudy even when they are mostly clear. Do you have any idea where this comes from or how to solve it?

It seems that other people had the same problem (http://forum.step.esa.int/t/sentinel-2-cloud-mask-with-fmask/4152/19).

Another question: Could you please tell me where to find the "cloud probability threshold" mentioned on the "original" Fmask website (https://github.com/prs021/fmask), if it exists? I can't find it in python-fmask. Maybe this could help me...?

At the moment I use python-fmask in anaconda on a Mac. I tried configuring several parameters (cloudBufferSize, shadowBufferSize, strictFmask, minCloudSize_pixels) which improved things, but not substantially.

Thanks a lot for your time!

I hope you don't feel offended by my way of contacting you.

Kind regards

Florian Schlenz (fschlenz@geocledian.com)

gillins commented 7 years ago

Original comment by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


Hi Florian,

We are not at all offended by the contact; this is a perfect use of the issues page here. Thank you for your compliments about python-fmask. It is nice to know that it is useful to someone else.

I have not seen anything like the problem you describe. It sounds like it comes from the sun angle, but the highest latitude areas I have ever run it for are in Tasmania (Australia). It seems OK there, but Northern Europe is much higher latitude than Tasmania. Could you give an example of the location and date of a scene which gives trouble? Perhaps an MGRS tile name, and the date, and I will get the same data and test it.

The thresholds used internally in python-fmask are coded up in the same way as in the original papers by Zhu et al. (2012 and 2015). I am not sure which threshold you are referring to, but most of them are not configurable. You may be correct that varying one of these thresholds could help with this problem, but I would prefer to understand the cause of the problem first, before recommending changes of this sort.

The main code for this is in the file fmask.py, and the comments in each part identify which section of the papers it refers to, with the equation numbers used there. This may help you to read through the code, and match up our code with their description.

Let me know a good example of this problem, and I will try to investigate.

Neil

gillins commented 7 years ago

Original comment by Florian Schlenz (Bitbucket: Florian_Schlenz, ).


Hi Neil,

Thanks for your quick response, and thanks for offering support! So here are the S2 tiles that I processed with python-fmask that resulted in nearly complete cloud cover even though they are partly clear:

This tile is at 56° latitude in Denmark. It sounds reasonable to me that it might have to do with the sun angles. All summer scenes and all scenes from lower latitudes worked fine (Denmark, Greece, India...).

I first used your fmask version as is. Later on I tried configuring it by changing several parameters, which changed the result but didn't improve it substantially...

I could send images of my results later if you are interested...

Kind regards

Florian

gillins commented 7 years ago

Original comment by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


Hi Florian,

Thanks for that. I have downloaded the 2016-12-14 image and done a little testing. From my initial look, the problem does seem to be due to the sun angle. The sun zenith for this image is 80 degrees, which means that the sun is very low in the sky, close to the horizon. Because the reflectance has been corrected for cos(sunZenith), as part of ESA's normal radiometric processing, the reflectance values are rather larger than they would be if the sun were higher in the sky. Reflectance blows up completely as sunZenith approaches 90 degrees, so these are near the limit of what is reasonable.
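To give a feeling for how quickly this grows, here is a minimal sketch of the 1/cos(sunZenith) scaling on its own. The function name is made up for illustration and is not part of python-fmask or of ESA's processing; the real reflectance also depends on the measured radiance, so this shows only how the divisor behaves as the sun gets lower.

```python
import math


def cos_correction_factor(sun_zenith_deg):
    """Return 1 / cos(sun zenith), the scaling applied by a cosine correction.

    Hypothetical helper for illustration only; not a python-fmask function.
    """
    return 1.0 / math.cos(math.radians(sun_zenith_deg))


# Compare a high summer sun with the low winter sun discussed above.
for zenith in (30, 60, 80, 89):
    factor = cos_correction_factor(zenith)
    print(f"sun zenith {zenith:2d} deg -> scaling factor {factor:.2f}")
```

At a zenith of 60 degrees the factor is 2, while at 80 degrees it is already well above 5, and it keeps climbing steeply towards 90 degrees.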

What this means in practical terms is that the Fmask thresholds are going to have to be adjusted a bit. Since the reflectances are larger than normal, much of the image appears bright enough to qualify as cloud.

Give me a few days to play with this, and work out which of the various thresholds are the relevant ones. Then I will make these available on the command line, so they can easily be overridden when required.

cheers Neil

gillins commented 7 years ago

Original comment by Florian Schlenz (Bitbucket: Florian_Schlenz, ).


Hi Neil,

great, that sounds good! Thanks for your effort!

So, for Australia you don't have these problems and python-fmask is working reliably? Have you published any documentation of the performance of python-fmask, such as a comparison to other cloud masking algorithms on some Landsat or Sentinel scenes?

Do you know if this problem is the same for the Matlab version of fmask?

Cheers

Florian

gillins commented 7 years ago

Original comment by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


Hi Florian,

OK, I have made some changes to the code for our version of Fmask, to allow the user to play with thresholds.

When I originally implemented this code, it was directly from the published papers about Fmask, and I had never actually read the MATLAB code. I don't have a MATLAB license, so that was never really going to be much use. However, it means that I was not aware that they had used the "cloud probability threshold" as a parameter to adjust the behaviour. In the paper, it does not have a name, and is simply written in as 0.2.

I tracked through the MATLAB code to understand it, and I have now given it a similar name, and made it accessible from the command line. Increasing this to 50% makes the example you gave me work out much better. The problem does arise because of the low sun angle, which results in the reflectance values being much larger than they really should be.
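As a rough illustration of why raising this number helps, the sketch below flags a pixel as cloud when its cloud probability exceeds a cutoff. The probabilities are invented, and the cutoff is treated as a single fixed number, which is a simplification of what the algorithm does (there the 0.2 is only a constant term in the cutoff), so this is a toy example rather than the Fmask code.

```python
def cloud_fraction(probabilities, threshold):
    """Fraction of pixels whose cloud probability exceeds the threshold."""
    flagged = [p > threshold for p in probabilities]
    return sum(flagged) / len(flagged)


# Invented per-pixel cloud probabilities for a bright, low-sun scene:
# everything looks somewhat cloud-like, but only a few pixels strongly so.
probs = [0.25, 0.30, 0.35, 0.40, 0.45, 0.60, 0.80, 0.90]

default_fraction = cloud_fraction(probs, 0.2)  # the 0.2 written in the paper
raised_fraction = cloud_fraction(probs, 0.5)   # the 50% tried above

print(f"threshold 0.2 -> {default_fraction:.1%} flagged as cloud")
print(f"threshold 0.5 -> {raised_fraction:.1%} flagged as cloud")
```

With the invented numbers, every pixel passes the 0.2 cutoff, while only the three most cloud-like pixels pass 0.5.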

The same problem also affects the snow detection part of the algorithm, and so it maps lots of snow across this image. So, to help with this, I have also made a couple of the reflectance thresholds used for snow accessible on the command line, too. They seem to be hard-coded in the MATLAB code, but increasing one or both of them seems to remove the false snow. I found that increasing both of these to 0.3 was enough.

You will need to play with all these numbers, I guess, to work out the best values to use. I would expect that you only need these when the sun is quite low, such as for these winter images from northern Europe, and you probably won't need to change from the defaults at other times.
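In a processing chain, that advice could be sketched roughly as follows. The 75 degree cutoff, the function name and the dictionary keys are all assumptions made up for this example and are not python-fmask options; the override values are simply the ones that worked for the one test image discussed here, so treat them as a starting point.

```python
# Hypothetical cutoff: the sun zenith above which the sun counts as "quite low".
# This value is an assumption for illustration and should be tuned.
LOW_SUN_CUTOFF_DEG = 75.0


def suggest_overrides(sun_zenith_deg, cutoff_deg=LOW_SUN_CUTOFF_DEG):
    """Return threshold overrides for low-sun scenes, or {} to keep defaults."""
    if sun_zenith_deg < cutoff_deg:
        # Sun is high enough: leave every threshold at its default.
        return {}
    return {
        # Values that worked for the Danish winter test image.
        "cloud_prob_threshold_percent": 50.0,
        "snow_reflectance_thresholds": (0.3, 0.3),
    }


print(suggest_overrides(40.0))  # summer-like scene, defaults kept
print(suggest_overrides(80.0))  # low winter sun, overrides suggested
```

The returned values would then be passed to whichever command line options or configuration calls your installed version provides for these thresholds.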

I have created new .tar.gz and .zip files for a version 0.4.4 release of the source code. These will also make their way into the conda-forge pre-built binaries (eventually).

I would be grateful if you could do a test of these changes, and let me know if that helps, or if it creates any new problems.

Thanks for letting me know about this problem. I hope the fix will help others, too.

cheers Neil

P.S. No, I have not published any comparisons. This code was intended to be a direct implementation of the published algorithm, although I have made a few minor changes to make it a bit more robust. In general, it should perform the same as their code.

P.P.S. Yes, it works fine in Australia. The algorithm does not work as well for Sentinel-2 as it does for Landsat, mostly because of not having a thermal band, so bright surfaces are sometimes mistaken for cloud. This will require new work of some sort, but I am not sure what.

gillins commented 7 years ago

Original comment by Florian Schlenz (Bitbucket: Florian_Schlenz, ).


Hi Neil,

Thanks a lot for your work! I appreciate your effort a lot! I will try out the new code and let you know what happens. But this will take some time, as I am very busy with other things at the moment...

Also great that you had a look at the cloud probability threshold! I didn't find it in the paper either, so I was wondering what it is...

Cheers

Florian

gillins commented 7 years ago

Original comment by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


Marking as resolved. I did not hear any more, so I assume that the changes made were sufficient to satisfy the original reporter.