GeoscienceAustralia / wagl

Python library for standardising satellite imagery into an Analysis Ready Data (ARD) form
Apache License 2.0
30 stars 7 forks source link

Water-Atcor #310

Open sixy6e opened 4 years ago

sixy6e commented 4 years ago

Atmospheric correction over water is desired by many users undertaking analysis such as shallow water bathymetry, water quality and sediment transport.

The current implementation of surface reflectance doesn't discriminate between land and water. As such, water bodies will contain a BRDF correction which technically should not be occurring.

The proposed method of generating surface reflectance over water requires an atmospheric adjacency correction which is resolved via a Point Spread Function (PSF) of the atmosphere. The PSF itself is calculated within MODTRAN.

Application of the PSF can be applied via standard convolution. However, as the kernel size of the PSF can be large, standard convolution can take a long time. Convolution via Fourier could provide an alternate solution but will require testing and comparison against standard convolution. As the PSF is designed to be symmetric and circular, ringing effects associated with convolution via Fourier should be minimised and potentially non-existent. But as mentioned, this will need to undergo a thorough comparative and follow-on application impact against the standard convolution method.

The following steps are (at a basic level) describe how the atmospheric adjacency correction is to be applied:

The implementation will be based on the work outlined in the following paper:

Required work (update as work is completed, or more is required):

sixy6e commented 4 years ago

Bands covering the smaller wavelengths (Blues to Reds) have higher diffuse radiation than bands covering the SWIR range, any SWIR bands will not be pushed through the aquatic atmospheric correction process. Bands covering the NIR spectrum will still (for the time being) be pushed through the aquatic atmospheric correction process.

sixy6e commented 4 years ago

New branch created for this work, until we're comfortable that it can be generically incorporated into the main workflow and merged into develop.

https://github.com/GeoscienceAustralia/wagl/tree/water-atcor

sixy6e commented 4 years ago

I think the calculation of lambertian reflectance can be moved out of the main FORTRAN routine. It is simple enough to do in Python, and better yet, make use of numexpr, and simplify the FORTRAN routine ever so slightly.

passangd commented 4 years ago

We need to map Point Spread Function derived by MODTRAN into acquisition's band id. As I understand, higher bands begins from the bottom of tp7 file, but we need clarity from either Fuqin or David that MODTRAN consistently outputs results as we expected. Currently, bands 1-8 for landsat and bands 1-12 have its PSF data derived from MODTRAN. But I am not clear on how each row from tp7 maps to the sensor's actual bands.

passangd commented 4 years ago

I have completed writing methods to derive the kernel based on PSF data and currently writing tests, I think I will have it pushed to remote branch by today evening. I guess either you or jez can take a look at it.

sixy6e commented 4 years ago

Yep send it through. @jeremyh is busy with Level-1 processing at the moment.

sixy6e commented 4 years ago

For the lambertian reflectance output, we'll only apply atmospheric adjacency correction, as well as sky glint correction.

Full glint correction is: gc = fs * sun_g + (1 - fs)*sky_g

As such we'll only incorporate the latter half of the equation: gc = (1 - fs)*sky_g

So lambertian reflectance with sky glint correction (lmb_rfl_sky_g): lmb_rfl_sky_g = lambertian_adjacency_correction - gc

passangd commented 4 years ago

If last equation is the final product, we could supply sky_glint as a separate data set? more flexible if user want to apply the full glint later

sixy6e commented 4 years ago

Yeah, you're right. That was the issue niggling at the back of my head when I was implementing the sky glint correction. A different sun glint model might work, but as you point out, it is probably simpler to simply provide the required data.

sixy6e commented 4 years ago

Another issue; the pseudo design of implementation is assuming a tight box around the satellite acquisition. i.e. there is a data value located on every row. Unfortunately, this is not the case. As such, doing a run-length average on rows with no data, will return a nan. For such rows, we could return a duplicate of the closest row that contains data?

sixy6e commented 4 years ago

I don't think that simply using numpy.isfinite and only submitting those rows that are True to the convolution function is a solution either. We would be assuming that there aren't any rows within the middle of an image containing NaN's. While this case is very unlikely, if it were to occur, the result would be very undesirable. But we can always test for this, and if it does occur, then we apply a different method. If not, then simply submit the subset through to the convolution function.

passangd commented 4 years ago

Any NAN would result in undesirable results ( half the length of filter size in x and y-direction with NAN values). Like you propose, we could try checking for any NaN values before we apply the filter function, but that will likely results in masking the NaN values without identifying why NaN exists in first place. Down the line, we should track why NaN occurs when we are handling in reflectance step.

passangd commented 4 years ago

".tp7" output from MODTRAN 5.4 outputs PSF data without higher band number from the bottom of the text file. This assumptions needs to be ascertained with Fuqin and David, in that case we can assume the acquisition's band_name with higher band number can be mapped to PSF data at the bottom of ".tp7" txt file.

sixy6e commented 4 years ago

I've updated the code so that the PSF output contained within the tp7 file also outputs the channel name. This will allow us to avoid any assumptions about the data being output.

sixy6e commented 4 years ago

In padding the data array as part of the preparation for applying the Fourier transform, I found that the symmetric method was more consistent with the standard image convolution than the reflect method for Sentinel-2. Landsat there was no difference. As such, the code has been updated to use the symmetric method for padding an array.