forc-db / ForC

Global Forest Carbon Database
https://forc-db.github.io/
Creative Commons Attribution 4.0 International

Create framework to handle spatially nested and potentially conflicted plots #105

Open teixeirak opened 5 years ago

teixeirak commented 5 years ago

We need a structure to deal with two related issues: (1) Some sites or plots are nested within others. For example, the Wind River Canopy Crane site lies within the Wind River CTFS-ForestGEO plot. (2) In some cases, it may be time-consuming or impossible to discern whether two sites/plots are nested or duplicated.

Currently we handle potential site conflicts by treating area as a random effect in mixed effects models, but the ideal would be to treat these the same way we treat duplicate measurement records.

This issue is mostly a reminder to myself to come up with a framework to handle these.

ValentineHerr commented 2 years ago

Here is a thought. Resolving site duplicates, supersite issues, etc. is a huge undertaking, and I don't know if we can come up with an overall satisfactory plan. Maybe the best approach is, instead of (or in addition to?) using area as a random effect in mixed effects models, to use GLS models with a correlation function that accounts for spatial autocorrelation in the data.
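To make the idea concrete, here is a minimal sketch of how a spatial correlation function downweights co-located (possibly duplicated or nested) sites in a GLS estimate. It is not ForC code and not a fitted model: the exponential correlation form, the `range_km` and `nugget` values, the function names, and the coordinates and values in the usage note are all assumptions chosen for illustration. It only covers the intercept-only case, where the GLS estimate is a weighted mean with weights proportional to the row sums of the inverse correlation matrix.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def exp_correlation(coords, range_km, nugget=0.05):
    """Exponential correlation matrix (assumed form).

    The nugget keeps sites with identical coordinates from making
    the matrix singular.
    """
    n = len(coords)
    return [
        [1.0 if i == j
         else (1 - nugget) * math.exp(-haversine_km(coords[i], coords[j]) / range_km)
         for j in range(n)]
        for i in range(n)
    ]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def gls_weights(coords, range_km=50.0):
    """Weights of the intercept-only GLS estimator: w = S^-1 1 / (1' S^-1 1)."""
    corr = exp_correlation(coords, range_km)
    raw = solve(corr, [1.0] * len(coords))
    total = sum(raw)
    return [w / total for w in raw]

def gls_mean(values, coords, range_km=50.0):
    """GLS estimate of the overall mean under the assumed spatial correlation."""
    return sum(w * v for w, v in zip(gls_weights(coords, range_km), values))
```

With made-up inputs such as `coords = [(45.0, -122.0), (45.0, -122.0), (10.0, -80.0)]`, the two co-located sites each receive a weight below one third, while the distant site receives more than one third, so a pair of suspected duplicates no longer counts as two independent observations. A real analysis would estimate the correlation range from the data rather than fixing it.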

teixeirak commented 2 years ago

I think this comment is broader than just this specific issue (relating to supersites, which are by definition nested, overlapping, or unresolvable based on publications).

Without completely understanding how the GLS models would work, I think this would be a good idea for our analyses, but it doesn't solve the problem of sending data to the IPCC, or of making the database good for other users. It also neglects the fact that coordinates are not always very accurate.

The duplicates issue is obviously not something we can resolve overnight, or as part of the current project. I think the most helpful thing we could do is adjust the structure of ForC and how we enter data so that it would be much easier to resolve duplicates. Defining supersites was one thing that I feel helped this, as it allows easy grouping of duplicate sites by editing one file.

For duplicate sites that can be resolved based on the publications, it's often quite easy to tell when a site is a duplicate, but resolving it requires editing many files. It would be nice if we could come up with a system where a single edit in the sites file could effectively merge two sites. Perhaps if we move away from using the site name to link across tables, and rather use site.ID? And then allow site.ID to link to more than one row in the sites table, where one record would be given precedence over the other?
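One possible reading of this proposal, sketched in Python with toy data. Everything here is an assumption for illustration, not the ForC schema: the column names `row.key`, `site.row`, and `precedence`, the helper functions, and the example rows are all hypothetical. In particular, the sketch assumes other tables point at a stable per-row key, so that changing a row's `site.ID` in the sites table is the only edit needed to merge two sites.

```python
# Hypothetical sites table. "row.key" is an assumed stable key that other
# tables point at; "site.ID" may be shared by rows describing the same site.
sites = [
    {"row.key": "s1", "site.ID": 1, "sitename": "Site A", "precedence": 1},
    {"row.key": "s2", "site.ID": 2, "sitename": "Site A (second publication)", "precedence": 1},
    {"row.key": "s3", "site.ID": 3, "sitename": "Site B", "precedence": 1},
]

# Hypothetical measurements table, linked to sites rows via "site.row".
measurements = [
    {"measurement.ID": 101, "site.row": "s1", "mean": 100.0},
    {"measurement.ID": 102, "site.row": "s2", "mean": 110.0},
    {"measurement.ID": 103, "site.row": "s3", "mean": 200.0},
]

def canonical_sites(sites):
    """Map each site.ID to its preferred row (lowest precedence number wins)."""
    best = {}
    for row in sites:
        current = best.get(row["site.ID"])
        if current is None or row["precedence"] < current["precedence"]:
            best[row["site.ID"]] = row
    return best

def resolve(measurements, sites):
    """Attach the shared site.ID and the preferred site name to each measurement."""
    by_key = {row["row.key"]: row for row in sites}
    canonical = canonical_sites(sites)
    resolved = []
    for m in measurements:
        site_id = by_key[m["site.row"]]["site.ID"]
        resolved.append({**m, "site.ID": site_id,
                         "sitename": canonical[site_id]["sitename"]})
    return resolved

def merge_into(sites, row_key, target_site_id):
    """The single edit: repoint one sites row at another site's ID.

    The repointed row gets a lower precedence than every row already
    attached to the target site, so the existing record stays preferred.
    """
    last = max(r["precedence"] for r in sites if r["site.ID"] == target_site_id)
    return [
        {**r, "site.ID": target_site_id, "precedence": last + 1}
        if r["row.key"] == row_key else r
        for r in sites
    ]
```

Calling `resolve(measurements, merge_into(sites, "s2", 1))` reports measurements 101 and 102 under the same `site.ID`, with the site name taken from the preferred row, while the measurements table itself is left untouched. Undoing the merge is likewise a single edit in the sites table.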