OLCI reflectance values don't match values from SNAP

system123 commented 1 year ago

Describe the bug I have two OLCI L2 WFR products

S3B_OL_2_WFR____20230327T075434_20230327T075734_20230328T175956_0179_077_320_3600_MAR_O_NT_003.SEN3 S3B_OL_2_WFR____20230327T075134_20230327T075434_20230328T175940_0179_077_320_3420_MAR_O_NT_003.SEN3

In my old pipeline I used SNAP GPT to mosaic these products together (there is limited overlapping between them) and then I write the mosaic out to a NetCDF file. In my new pipeline, I load the same two products with Satpy, however, the reflectances do not match with those produced by SNAP.

scn = Scene(reader='olci_l2', filenames=filelist)
data = scn.load('Oa03')

When computing the max value I get the following (I chose a small area at random where all pixels are valid):

Raw OLCI files: 0.26 Satpy Loaded: 0.1912 SNAP mosaic: 0.0839

The second issue is that the SNAP mosaic has more valid pixels while that returned by SatPy appears to mask out many more pixels which are deemed invalid.

Screenshots Satpy loaded band Oa03

Band from the SNAP mosaic Oa03

Environment Info:

OS: OSX
Satpy Version: 0.43
Python: 3.9.12

simonrp84 commented 1 year ago

I just tried to reproduce this: Processing the first of the two granules you list, the values produced by Satpy exactly match those in the raw data file. I also see the same invalid pixels between satpy and the raw data.

That said, I have never used SNAP - is it possible that SNAP is doing some kind of additional post-processing?

mraspaud commented 1 year ago

from what I see SNAP seems to be filtering noise out. We have this capability in satpy now also I think, but can you get a hold of exactly what SNAP is doing to the data?

system123 commented 1 year ago

I recreated my conda environment and now I can confirm that satpy values do match the raw OLCI .nc files (not sure what the issue was but it seems fixed by reinstalling everything). However, there is still some difference between the SNAP and Satpy values although it is now much less extreme and seems to be that SNAP clip values and mask the data differently.

I have used SNAP to "mosaic" a single file (the first one listed above) and then plotted the distribution of the reflectance values in each pixel.

Code for the first plot:

SYM_MASK = False
X = data_snap['Oa03'].values.copy()
Y = data_satpy['Oa03'].values.copy()
Y[data_satpy['mask'].values] = np.nan

if SYM_MASK:
    X[data_satpy['mask'].values] = np.nan
    X[Y <= 0] = np.nan
    Y[X <= 0] = np.nan

Y[Y <= 0] = np.nan
X[X <= 0] = np.nan

plt.figure(figsize=(10, 10))
_=plt.hist(np.clip(X.ravel(),0,0.05), bins=100, alpha=1, label='SNAP')
_=plt.hist(np.clip(Y.ravel(),0,0.05), bins=100, alpha=0.5, label='SatPy')
plt.legend()

Distribution of all valid pixels (> 0 and not masked) in SNAP mosaic and SatPy loaded image (reflectances clipped to [0,0.05])

Distribution of reflectances after symmetric masking (mask all invalid pixels in both images and clipping references to [0,0.05])

Scatter plot of the symmetrically masked pixels (x-axis = SNAP, y-axis=SatPy):

I have also asked on the STEP forum if someone could share some more insights on the SNAP processing chain for OLCI, once I have an answer I'll revert back here.

Note [Edit]: If I don't clip the reflectance values then there is a difference between the SNAP and SatPy products even after symmetrically masking them.

mraspaud commented 1 year ago

@system123 interesting, thanks a lot for digging into this! Do I understand correctly then that the mask in satpy and Snap are different then?

system123 commented 1 year ago

In summary, these are the main differences:

SNAP appears to have many more pixels <= 0 (See figure below where I set all <0 pixels to 0 instead of np.nan as above)
SatPy loaded data has a longer right and left tail in the loaded data (most likely these pixels are masked by SNAP).
In low reflectance values there seems to be a disagreement between SNAP and SatPy reflectance values, the MAPE (assume SNAP to be the ground truth) is around 11%.

MAPE = np.nanmean(np.abs((X.ravel() - Y.ravel()) / np.clip(X.ravel(), 1e-6, X.ravel().max()))) * 100

system123 commented 1 year ago

I have made some more progress on this – there is definitely a difference in the reflectance values between SNAP and SatPy, but I have figured out the masking differences. Using the following set of flags I get more comparable results compared to the DEFAULT_MASK_ITEMS specified in olci_nc.py reader (Added TURBID_ATM, and removed OCNN_FAIL):

flags =["INVALID", "SNOW_ICE", "INLAND_WATER", "SUSPECT",
                      "AC_FAIL", "CLOUD", "HISOLZEN", 
                      "CLOUD_MARGIN", "CLOUD_AMBIGUOUS", "LOWRW", "LAND","TURBID_ATM"]

To do this I needed to make some modifications to the reader, as the TURBID_ATM flag is not in the default flag_list of the BitFlags class. I instead loaded the flag list directly from the image metadata. I'll clean up my code an submit a PR for this so that at least all flags can be used with reader_kwargs. Apart from that I don't think we need to change the default operation.

SNAP Image

SatPy image (there are some minor differences in some areas, but I cannot find the cause of those)

However, even though the distribution of the data looks better now with the improved masking, there is still around a 11% error in the reflectance values between SNAP and SatPy

mraspaud commented 1 year ago

Thanks a lot for sorting out the flag issues! Regarding the reflectance difference, snap must be doing something to the data. I the satpy reader, we are not touching level 2 reflectances values at all. Do you know if the documentation of the format mentions something to be done on the reflectances before consumption?

pytroll / satpy

OLCI reflectance values don't match values from SNAP #2546