bopen / c3s-eqc-toolbox-template

CADS Toolbox template application
Apache License 2.0
5 stars 4 forks source link

WG5 Deliverable: fAPAR_LAI #35

Closed Cri-dei closed 1 year ago

Cri-dei commented 1 year ago

Notebook description

Spatial coverage of daily satellite fAPAR and LAI data considering the year 2014.

Use questions: Considering the 2014, fAPAR and LAI satellite data are completely available for all the Europe or there exists some region with less data coverage?

Methods:

     - Data download cutting the world map on Europe
     - Calculate the percentage of missing values for each grid cell
     - Mask the dataframe considering only inland data
     - Plot the map of percentage of missing values for the selected period
     - Plot the map and histogram of missing values for Europe

Notebook link or upload

fAPAR_ LAI_Completeness.zip

Anything else we need to know?

Hi, Here I want to show the map of missing values with the histogram besides, considering just Europe. I am using the same code and shapefile we used for soil moisture. I have a problem with the coordinate system and I am not able to plot it with the code we already wrote. Moreover, here I am using the same function for missing data that we used for soil moisture, I think can be optimized, I did not managed.

Environment

Cri-dei commented 1 year ago

Hi @malmans2 ,

Can you help me with this code? It is the only one i still need for the deadline that is not optimized. Thank you.

malmans2 commented 1 year ago

Hey, I'll work on this tomorrow. I've been working on other notebooks these days.

malmans2 commented 1 year ago

Hi @Cri-dei,

I'm not sure I understood the problem with this notebook. You say that you want to plot the histogram of missing values, but then you try to plot ds["fAPAR"] (that doesn't work because it's a 3D variable time, lat, lon - you'd have to reduce the time dimension somehow, like an average).

Are you trying to do this?

missing_values_count.rio.write_crs("epsg:4326", inplace=True)
imshow_and_hist(missing_values_count, world_shape[world_shape.CONTINENT == 'Europe'])

image

malmans2 commented 1 year ago

It's all working OK for me, tested on the VM. This is the full notebook I've been running. It should be the same code you used in the soil moisture example, so I might add a generalised plotting function in the future, so you can access imshow_and_hist directly from c3s_eqc_automatic_quality_control.

# ## Import packages

# In[1]:

import warnings

import geopandas
import matplotlib.pyplot as plt
import shapely.geometry
from c3s_eqc_automatic_quality_control import download, utils

warnings.filterwarnings("ignore")
plt.style.use("seaborn-v0_8-notebook")

# ## Define variables

# In[2]:

# Request parameters
year_start = 2014
year_stop = 2014
nominal_days = [3, 13, 21, 23, 24]
variables = ["fapar"]

# Region
lon_slice = slice(-13, 35)
lat_slice = slice(72, 30)

shapefile_url = "https://figshare.com/ndownloader/files/23392280"

# ## Set the data request

# In[3]:

# Define request
collection_id = "satellite-lai-fapar"
request = {
    "satellite": "proba",
    "sensor": "vgt",
    "horizontal_resolution": "1km",
    "product_version": "V2",
    "year": [str(year) for year in range(year_start, year_stop + 1)],
    "month": [f"{month:02d}" for month in range(1, 12 + 1)],
    "nominal_day": [f"{day:02d}" for day in nominal_days],
    "format": "zip",
    "variable": variables,
}

# ## Download and preprocess data

# In[4]:

# Download and cache
ds = download.download_and_transform(
    collection_id,
    request,
    transform_func=utils.regionalise,
    transform_func_kwargs={
        "lon_slice": lon_slice,
        "lat_slice": lat_slice,
    },
    chunks={"year": 1, "nominal_day": 1, "variable": 1},
    combine="nested",
    concat_dim="time",
)

# Shapefile
world_shape = geopandas.read_file(shapefile_url)

# ## Define plotting function

# In[5]:

def imshow_and_hist(da, shape):
    """Plot map and histogram side-by-side.

    Parameters
    ----------
    da: DataArray
        DataArray to plot
    shape: GeoDataFrame
        Geopandas object with polygons

    Returns
    -------
    figure, axes
    """
    fig, (ax_imshow, ax_hist) = plt.subplots(
        1, 2, figsize=[10, 5], gridspec_kw={"width_ratios": [3, 2]}
    )

    da = da.rio.clip(
        shape.geometry.apply(shapely.geometry.mapping),
        shape.crs,
        drop=True,
    )
    da.plot.imshow(ax=ax_imshow)
    ax_imshow.set_title("Map")

    da.plot.hist(bins=50, ax=ax_hist)
    ax_hist.set_ylabel("Frequency")
    ax_hist.yaxis.set_label_position("right")
    ax_hist.yaxis.tick_right()

    # Compute and show no data percentage
    missing_data_perc = (da == 100).sum() / da.notnull().sum() * 100
    ax_hist.set_title(
        f"Percentage of area with missing data: {float(missing_data_perc):f} %"
    )

    fig.suptitle(", ".join(list(shape.CONTINENT)))
    return fig, (ax_imshow, ax_hist)

# In[6]:

da_mvc = ds["fAPAR"].isnull().sum("time") / ds.sizes["time"] * 100
da_mvc.attrs["long_name"] = "Missing values"
da_mvc.attrs["units"] = "%"

da_mvc.rio.write_crs("epsg:4326", inplace=True)
imshow_and_hist(da_mvc, world_shape[world_shape.CONTINENT == "Europe"])
Cri-dei commented 1 year ago

Yes, i had problem exactly there with the write_crs command. Now it is working, thanks. We can close it.