AMF Parsing: How to deal with sites missing variables?

nocollier / PEcAnILAMB

Assets for collaborative work linking PEcAn and ILAMB

AMF Parsing: How to deal with sites missing variables? #3

Open serbinsh opened 4 years ago

serbinsh commented 4 years ago

After running the synthesis for AGU 2019, it became clear that the raw AMF data available from Berkeley has many sites that are missing key variables (e.g. LH, SH, NEE, GPP, Reco). We would like to use the longest time series possible with the newest data and as many northern temperate/boreal locations as possible for the first manuscript. How best to deal with missing data/variables? Calculate them on the fly in the AMF parsing code? If so, how best to do this, and what variables will be required? Do we leverage existing code to do this, such as EddyProc?
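
As a starting point for the "calculate on-the-fly" question, here is a minimal Python sketch of a per-site variable inventory. It is not the existing parsing code. It assumes an AmeriFlux BASE-style CSV with `#` comment lines before the header, `-9999` as the missing-value sentinel, and column names `LE`, `H`, `NEE`, `GPP`, `RECO` (LH and SH above), optionally followed by qualifiers such as `_PI` or `_1_1_1`. The site ID `US-XXX`, the sample rows, and all function names are hypothetical. The `derivable` helper assumes the sign convention NEE = RECO - GPP and only flags which variable could be recovered algebraically; it does not perform flux partitioning.

```python
import csv

# Assumed column names for the variables the thread calls LH, SH, NEE, GPP, Reco.
REQUIRED = ("LE", "H", "NEE", "GPP", "RECO")
MISSING = -9999.0  # assumed missing-value sentinel


def is_valid(value):
    """True if the cell parses as a number and is not the missing sentinel."""
    try:
        return float(value) != MISSING
    except (TypeError, ValueError):
        return False


def find_columns(header, required=REQUIRED):
    """Map each required variable to the header columns that carry it.

    A column matches if it equals the variable name or starts with the name
    plus an underscore, so H_1_1_1 matches H but H2O does not.
    """
    return {
        var: [c for c in header if c == var or c.startswith(var + "_")]
        for var in required
    }


def usable_variables(text, required=REQUIRED):
    """Return the required variables that have at least one non-missing value."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(lines)
    columns = find_columns(reader.fieldnames, required)
    usable = set()
    for row in reader:
        for var, cols in columns.items():
            if var not in usable and any(is_valid(row[c]) for c in cols):
                usable.add(var)
    return usable


def derivable(usable):
    """Variable recoverable from NEE = RECO - GPP when exactly two of the three exist."""
    trio = {"NEE", "GPP", "RECO"}
    have = trio & usable
    return trio - have if len(have) == 2 else set()


# Hypothetical two-row sample: H is present but entirely missing, GPP is absent.
SAMPLE = """# Site: US-XXX
# Version: 0-0
TIMESTAMP_START,TIMESTAMP_END,LE,H_1_1_1,NEE_PI,RECO_PI,H2O
201801010000,201801010030,12.5,-9999,-1.2,2.0,5.1
201801010030,201801010100,13.0,-9999,-0.8,1.9,5.0
"""

usable = usable_variables(SAMPLE)
missing = set(REQUIRED) - usable
print("usable:   ", sorted(usable))
print("missing:  ", sorted(missing))
print("derivable:", sorted(derivable(usable)))
```

Running an inventory like this over every downloaded site would show how many sites are complete, how many could be completed from the identity alone, and how many would need a real partitioning step (for example via EddyProc) or a different data source.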

serbinsh commented 4 years ago

Also, are there uncertainties we can attach to the AMF data and use in the comparisons/scoring?

nocollier commented 4 years ago

It is discouraging that relatively few sites provide all the variables, but I do not understand the issues well enough to offer any ideas. Do we just do what we can with what is there? Or is that really insufficient to get any meaningful science out? Tagging Forrest @climate-dude so he sees what is going on.

serbinsh commented 4 years ago

Right. Another approach could be to use a dataset like the La Thuile dataset https://fluxnet.fluxdata.org/data/la-thuile-dataset/ . The downside is that we cannot illustrate the capacity to scrape data from AMF and re-run the analysis in a more automated way. If AMF data were more consistent, it would be easier to develop a more automated benchmarking system that can be continually updated. But we could do a blend of this, where we show the ability to automate but perhaps focus the analysis more on the standardized dataset and highlight the limitations. I would have to remind myself where and what is available from that standard data.

nocollier commented 4 years ago

I like your thinking about trying to use AMF and show how observations could be more frequently integrated, but if other data gives us more sites with the variables we need, so much the better.

serbinsh commented 4 years ago

Also other options: OneFlux - https://ameriflux.lbl.gov/data/download-data-oneflux-beta/

La Thuile - https://fluxnet.fluxdata.org/data/la-thuile-dataset/

serbinsh commented 4 years ago

FYI, from the OneFlux site:

ONEFlux processing produces gap-filled, partitioned flux/met data that are fully consistent with the FLUXNET2015 data product. FLUXNET included data through 2014.

We are making available a beta ONEFlux product with data through 2018 for a limited number of AmeriFlux sites in advance of a larger release in 2020. The purpose of this early release is to allow users to play with the data and develop analysis capabilities. As a Beta product, we expect there will be data revisions before the final data product is released. Please use the non-Beta data release (Spring 2020) for publication research.

serbinsh commented 4 years ago

Some sites via NEON: https://www.neonscience.org/data-collection/flux-tower-measurements

NEON data portal: https://data.neonscience.org/home

serbinsh commented 4 years ago

And there is EddyProc for flux partitioning: http://www.bgc-jena.mpg.de/~MDIwork/eddyproc/index.php