FabriceSamonte opened 3 years ago
@adrian-g-fisher maybe comment here, so we don't lose your comment.
- @mitchest, I'm currently missing the standard deviation data of the Landsat surface reflectance values within a given 3x3 grid. Does Google EE have built-in functionality for this, or would I have to manually calculate those values (i.e. create a function for standard deviation with a list of numbers as an input)?
As usual, there's a number of ways to do it...
- The `.reduce()` method with `ee.Reducer.stdDev()` as the reducer. It can be applied to a number of types, including feature collections, lists and images.
- `.set()` the values to the feature collection, then aggregate over it (e.g. `.aggregate_mean()` and `.aggregate_total_sd()`).
- There's also `.reduceNeighborhood()`, which is extremely efficient: you specify the reducer (e.g. `ee.Reducer.stdDev()`) and a kernel (e.g. `ee.Kernel.square()`). That method applies to each pixel in an image, but thanks to GEE's lazy eval, it will only calculate the neighbourhoods being requested by the feature collection or geom you use to extract values.
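For what it's worth, a minimal Code Editor sketch of that last option might look like this (the dataset ID, date range and band names are illustrative placeholders, not taken from the repo):

```javascript
// Sketch only: per-pixel standard deviation over a 3x3 (30 m) window.
// Dataset/bands/dates are illustrative, not from extract.js.
var image = ee.ImageCollection('LANDSAT/LT05/C01/T1_SR')
    .filterDate('2010-01-01', '2010-03-01')
    .median()
    .select(['B1', 'B2', 'B3', 'B4']);

// 3x3 square kernel: radius of 1 pixel around the centre pixel.
// Output bands get a '_stdDev' suffix (e.g. 'B1_stdDev').
var stdDev = image.reduceNeighborhood({
  reducer: ee.Reducer.stdDev(),
  kernel: ee.Kernel.square(1, 'pixels')
});

// Lazy evaluation means only the neighbourhoods intersecting the
// features you sample (e.g. via sampleRegions) are actually computed.
```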
- What are the most appropriate date ranges to select for each dataset?
Probably a good question for @adrian-g-fisher, since he'll have dealt with this before. A month each side is reasonable, but we might need to go a little more to get enough good cloud free images at certain times of year.
- Is extracting the date of the image used important? If so, I'm assuming I would have to extract data from a single raw image based on cloudiness etc. instead of using a median summary.
If we go with the "geomedian" approach, the individual dates are not so important - it would be nice to get the actual number of image pixels the median values were derived from, but I'm not exactly sure what we'd do with it yet (perhaps to check for bias?). Via `.reduce()` on the image collection, you'd just use `ee.Reducer.count()` as well as `ee.Reducer.mean()`.
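As a hedged sketch (collection ID, dates and bands are placeholders), those two reducers can be combined so both run in a single pass over the collection:

```javascript
// Sketch: per-pixel mean plus the count of unmasked pixels that fed it.
// Collection ID, dates and bands are placeholders.
var collection = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')
    .filterDate('2015-01-01', '2015-03-01')
    .select(['B2', 'B3', 'B4', 'B5']);

// combine() merges the reducers; output bands get '_mean' and
// '_count' suffixes (e.g. 'B2_mean', 'B2_count').
var summary = collection.reduce(
    ee.Reducer.mean().combine({
      reducer2: ee.Reducer.count(),
      sharedInputs: true
    }));
```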
Thanks @mitchest, awesome info! I definitely missed `aggregate_total_sd()`, but it would be super useful with what I currently have now, since I'm already applying `aggregate_mean()` for the gridded FeatureCollection. I'll also have a look into `reduceNeighborhood()`, since my compute times are quite lengthy.
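For reference, a tiny sketch of those two aggregations on a FeatureCollection (the property name and values here are placeholders, not from the actual csv):

```javascript
// Sketch: aggregating a numeric property over a FeatureCollection.
// 'b1_mean' and its values are placeholders.
var gridCells = ee.FeatureCollection([
  ee.Feature(null, {b1_mean: 0.12}),
  ee.Feature(null, {b1_mean: 0.15}),
  ee.Feature(null, {b1_mean: 0.09})
]);

print(gridCells.aggregate_mean('b1_mean'));      // mean of the property
print(gridCells.aggregate_total_sd('b1_mean'));  // sample standard deviation
```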
A couple of quick comments. I'm away with the family this week, so typing badly on my phone.
I wouldn't worry about the MODIS imagery.
Peter extracted values from the nearest cloud-free Landsat image, rather than using a composite. The date of the image is in the image name in the calval file, in yyyymmdd format. I think he may have used the time difference between the field and satellite dates as a weighting.
Pretty sure Peter would have made zeros 0.01 and ones 0.99, although it is possible for the field measurements to be 100% one cover type.
@adrian-g-fisher the date is stored in `fractional_calval_filename`, correct? I'll do some work reformatting the date and extracting from the exact image they used.
Also, which Sentinel-2 dataset were you after in particular? EE provides:
Yes, the date in the calval filename.
Sentinel-2 MSI Level-2A should be a surface reflectance product similar to the Landsat ones.
Hi all,
Just thought I'd post an update on my progress. I've managed to extract Google EE's Landsat (5, 7 and 8), MODIS 16-day Nadir BRDF-Adjusted Reflectance (MCD43A4), and MODIS 8-day surface reflectance (MOD09A1) values using the given locations of each field observation from star transects (data: https://field.jrsrp.com/). A version of the csv can be found under `data/star_transects_google_ee.csv`. The new columns are as follows:

- `l5_b1_sf_mean` - Landsat 5 band 1 surface reflectance average for the surrounding 3x3 grid. I went with a brute force approach, so there are columns per Landsat sensor (5, 7 and 8) per band.
- `MOD09A1_sur_refl_b01` - MOD09A1 band 1 surface reflectance value as extracted from the 500m resolution pixel; there are respective columns for bands 1-7.
- `Nadir_Reflectance_Band1` and `BRDF_Albedo_Band_Mandatory_Quality_Band1` - data extracted from MCD43A4 (500m resolution), also with respective columns for bands 1-7.

METHOD
There is a lot of repeated code in `earth_engine/extract.js`, so I'll highlight the important bits here.

Extracting MCD43A4 and MOD09A1
For each location and its observation date, I extract the pixel value at 500m resolution from MCD43A4 and MOD09A1.
For MOD09A1, I filtered the image collection to a 16-day window (I wasn't too sure what window to use) around the observation time, and applied a mask for cloudy pixels, cloud shadows, cirrus clouds and pixels that have low or no aerosol loading:
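The snippet itself isn't reproduced in this extract. As a hedged sketch of that kind of masking: the bit layout below follows the MOD09A1 product's `StateQA` band, but exactly which aerosol classes to keep is my assumption, not necessarily what `extract.js` does:

```javascript
// Sketch of MOD09A1 StateQA bit masking; which classes are kept is an
// assumption, not necessarily the logic used in extract.js.
function maskMod09a1(image) {
  var qa = image.select('StateQA');
  var cloud = qa.bitwiseAnd(3);                  // bits 0-1: cloud state (0 = clear)
  var shadow = qa.bitwiseAnd(1 << 2);            // bit 2: cloud shadow
  var aerosol = qa.rightShift(6).bitwiseAnd(3);  // bits 6-7: aerosol quantity
  var cirrus = qa.rightShift(8).bitwiseAnd(3);   // bits 8-9: cirrus
  var mask = cloud.eq(0)
      .and(shadow.eq(0))
      .and(cirrus.eq(0))
      .and(aerosol.neq(3));                      // drop high aerosol (assumption)
  return image.updateMask(mask);
}

// Placeholder dates illustrating a ~16-day window.
var masked = ee.ImageCollection('MODIS/006/MOD09A1')
    .filterDate('2012-01-01', '2012-01-17')
    .map(maskMod09a1);
```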
Given the new image collection, I created a median summary of all images (the median value per band for each pixel, across multiple images), and extracted the raw value of each band from the resultant image. This is the point where the methodology differs from the paper https://doi.org/10.1016/j.rse.2015.01.021, where they select a single raw image from a collection of images within an 8-day window, based on high observation coverage, being cloud free, aerosol loading etc., as opposed to using a median summary. I wasn't too sure how to go about comparing these values between MODIS images within the collection.
I extracted MCD43A4 using the same method as above, except without any masking.
Extracting Landsat
For Landsat 5, 7, and 8, I filtered the image collection based on the location of the observation and a 60-day window from the time of the observation. I used Google EE's built-in cloud mask function to mask out any cloudy pixels. As with MOD09A1, I created a median summary of the filtered image collection to extract values from.
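A hedged sketch of that filtering and masking, using Landsat 8 Collection 1 SR and its `pixel_qa` band (the point, dates and exact mask bits are my assumptions, not necessarily what `extract.js` uses):

```javascript
// Sketch: 60-day window around an observation, cloud/shadow masked via
// the Collection 1 SR 'pixel_qa' band, then a median composite.
// The point and observation date are placeholders.
var obsDate = ee.Date('2015-06-01');
var point = ee.Geometry.Point([150.0, -30.0]);

function maskL8sr(image) {
  var qa = image.select('pixel_qa');
  var cloudShadow = 1 << 3;  // bit 3: cloud shadow
  var cloud = 1 << 5;        // bit 5: cloud
  var mask = qa.bitwiseAnd(cloudShadow).eq(0)
      .and(qa.bitwiseAnd(cloud).eq(0));
  return image.updateMask(mask);
}

var median = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')
    .filterBounds(point)
    .filterDate(obsDate.advance(-30, 'day'), obsDate.advance(30, 'day'))
    .map(maskL8sr)
    .median();
```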
For each observation, I also created a 3x3 grid of 30m cells to extract values from, like so:
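(The actual grid-construction snippet isn't reproduced in this extract; the following is a hypothetical illustration of how a `getLonLatOffset`-style helper could build the grid, assuming a simple metres-to-degrees conversion rather than the real implementation in `extract.js`.)

```javascript
// Hypothetical sketch of a getLonLatOffset-style helper (the real one
// lives in earth_engine/extract.js). It shifts a [lon, lat] pair by a
// given number of metres east and north, using the approximation
// 1 degree of latitude ≈ 111320 m, scaling longitude by cos(latitude).
function getLonLatOffset(lon, lat, metresEast, metresNorth) {
  var metresPerDegLat = 111320;
  var metresPerDegLon = metresPerDegLat * Math.cos(lat * Math.PI / 180);
  return [lon + metresEast / metresPerDegLon,
          lat + metresNorth / metresPerDegLat];
}

// Build the 3x3 grid of points spaced 30 m apart around a centre location.
function makeGrid(lon, lat) {
  var points = [];
  for (var dy = -1; dy <= 1; dy++) {
    for (var dx = -1; dx <= 1; dx++) {
      points.push(getLonLatOffset(lon, lat, dx * 30, dy * 30));
    }
  }
  return points;
}
```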
where the `getLonLatOffset` function returns a longitude and latitude offset 30 meters vertically or horizontally from a given location. By extracting Landsat values for each cell of the 3x3 grid, I was able to aggregate the effective surface reflectance mean of each band. I repeated this for the Landsat 5, 7 and 8 collections.

The final dataset was created like so:
A couple of questions: