ubarsc / python-fmask

A set of command line utilities and Python modules that implement the ‘fmask’ algorithm
https://www.pythonfmask.org
GNU General Public License v3.0

Default Landsat cloud and shadow dilations used by `python-fmask`? #56

Closed robbibt closed 1 year ago

robbibt commented 2 years ago

I'm trying to work out what dilations are used by default by python-fmask when it's applied to Landsat data. In the documentation I can see several different options where dilations are specified or mentioned:

- https://www.pythonfmask.org/en/latest/fmask_fmask.html#fmask.fmask.maskAndBuffer
- https://www.pythonfmask.org/en/latest/fmask_fmask.html#fmask.fmask.matchShadows
- https://www.pythonfmask.org/en/latest/fmask_config.html#fmask.config.FmaskConfig.setCloudBufferSize
- https://www.pythonfmask.org/en/latest/fmask_config.html#fmask.config.FmaskConfig.setShadowBufferSize

My reading of this is that a default 3 pixel buffer is applied to shadows and perhaps clouds, plus an additional optional (but enabled by default) 5 pixel cloud buffer and 10 pixel shadow buffer, giving a total buffer of 8 pixels for clouds and 13 pixels for cloud shadow.

Does this sound correct?

neilflood commented 2 years ago

Hi @robbibt

No, that is close, but not quite right.

There are three buffering-type operations applied. The first one is directly as prescribed in the original paper (Zhu et al., 2012, section 3.1.2, final paragraph, on page 87). This is implemented in the code in cloudFinalPass() (https://github.com/ubarsc/python-fmask/blob/cd0d5b694f143018d4cae83878159ae1b318f547/fmask/fmask.py#L769-L772). It is not exactly a dilation operation, but similar in effect. I have not made this configurable; it is exactly as Zhu et al. defined, with a square 3x3 window. It fills in little cloud holes and joins together cloud objects which are very close together and are probably the same cloud.
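As a rough illustration of how a 3x3 window pass can fill small holes, here is a minimal pure-Python sketch. It assumes a majority rule (a pixel becomes cloud when at least 5 of the 9 pixels in its window are cloud), which is one reading of the paper and is not stated in this thread; the function name `majority_fill`, the `threshold` parameter, and the treatment of out-of-image pixels as clear are all hypothetical. The linked `cloudFinalPass()` code is the authority on what python-fmask actually does.

```python
def majority_fill(mask, threshold=5):
    """Return a new mask where a pixel is cloud if at least `threshold`
    pixels in its 3x3 window (the pixel itself included) are cloud.

    The threshold of 5 is an assumption, not taken from python-fmask.
    Pixels outside the image are treated as clear (also an assumption).
    """
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            count = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]:
                        count += 1
            out[r][c] = count >= threshold
    return out


# A small cloud with a one-pixel hole in its centre.
cloud = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
filled = majority_fill(cloud)
```

Here the centre pixel has 8 cloud neighbours, so it is filled. Note that a strict majority rule can also clear sparse pixels at object corners, which is part of why this is "not exactly a dilation".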

After this point, the resulting cloud objects are then used to generate shadow shapes, and match against dark objects in the image. So, it is relevant that this has been done already.

After clouds and shadows are matched, both are optionally buffered using a simple dilation with a circular kernel. The original paper specifies that only the shadows are buffered (Zhu et al., 2012, page 88, second-last paragraph of section 3.2). I have implemented this as a simple circular buffer, with a configurable radius, which defaults to 10 pixels (considerably larger than their 3 pixel, 8-connected distance). In addition, I have also added a circular buffer on the cloud objects, with a default radius of 5 pixels. Both of these buffers compensate for the fact that cloud and shadow detection is very poor at object edges, and often misses cloud or shadow contaminated pixels there.
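To make the circular buffer concrete, the following is a minimal pure-Python sketch of a binary dilation with a disc-shaped kernel, using the default radii mentioned above (5 pixels for cloud, 10 for shadow). The helper names `circular_offsets` and `dilate` are hypothetical, and the choice to include pixels at exactly `radius` distance is an assumption; python-fmask's own kernel construction may differ in detail.

```python
def circular_offsets(radius):
    """Offsets (dr, dc) lying within `radius` pixels of the centre,
    i.e. a disc-shaped kernel. Including the boundary is an assumption."""
    return [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]


def dilate(mask, radius):
    """Binary dilation: every set pixel switches on all pixels inside
    a disc of the given radius around it."""
    rows, cols = len(mask), len(mask[0])
    offsets = circular_offsets(radius)
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = True
    return out


# A single detected pixel in the middle of a 25x25 image.
size = 25
mask = [[False] * size for _ in range(size)]
mask[12][12] = True

cloud_buffered = dilate(mask, 5)    # default cloud radius from the thread
shadow_buffered = dilate(mask, 10)  # default shadow radius from the thread
```

With a disc kernel the buffer reaches 5 pixels along the axes but not to the diagonal corner at offset (4, 4), which is what distinguishes it from a square window of the same half-width.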

So, in summary, there is just one buffer on the shadows, defaulting to 10 pixels on a circular kernel, and two separate buffer operations on the clouds: the fixed 3x3 window pass, followed by a circular buffer defaulting to 5 pixels. The 3x3 one is not configurable, but the 5 pixel one is.

The Landsat and Sentinel-2 command line scripts give these two configurable buffer sizes in metres, and convert based on the pixel sizes. This allows for consistency between the Landsat and Sentinel-2 scripts, which work in different pixel sizes, but use the same real-world distances for these buffers.
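The metres-to-pixels conversion can be sketched as below. The function name `buffer_pixels` and the use of rounding are assumptions, as are the pixel sizes (30 m for Landsat, 20 m as an example Sentinel-2 working resolution); the distances of 150 m and 300 m are simply what 5 and 10 pixels correspond to at 30 m, not values quoted in this thread.

```python
def buffer_pixels(distance_m, pixel_size_m):
    """Convert a real-world buffer distance to a whole number of pixels.

    Rounding to the nearest pixel is an assumption; the command line
    scripts may truncate instead.
    """
    return int(round(distance_m / pixel_size_m))


# Landsat at an assumed 30 m pixel size
landsat_cloud = buffer_pixels(150, 30)
landsat_shadow = buffer_pixels(300, 30)

# The same shadow distance on an assumed 20 m grid
sentinel2_shadow = buffer_pixels(300, 20)
```

The same real-world distance therefore maps to a larger pixel radius on the finer grid, which is how the two scripts stay consistent with each other.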

As we have discussed before (#45), the buffering on the shadow objects serves some particularly important purposes, and is worth having in some form.

I might leave this Issue open for a while, as it may remind me to try and make this a bit clearer in the various docstrings and comments. It should have been easier for you to see what is happening - apologies.

robbibt commented 2 years ago

Hi @neilflood - thanks so much for this, it's been really helpful! The documentation for python-fmask as a whole is top-notch, it was just the several similar looking params that I needed help clarifying. I'll have a further think and see if I have any further questions, but this is great for now!