forc-db / GROA

This repository houses data and code for the Global Reforestation Opportunity Assessment (GROA) led by Susan Cook-Patton of the Nature Conservancy.
Creative Commons Attribution 4.0 International

compare ForC and GROA sites #14

Closed ValentineHerr closed 4 years ago

ValentineHerr commented 6 years ago

@teixeirak, the file with data for both GROA and ForC sites whose site names are identical is here.

ValentineHerr commented 5 years ago

@CookPatton, is it possible that sitesf.csv contains duplicated sites with different site.id values? For example:

Thanks for your help!

CookPatton commented 5 years ago

@ValentineHerr - you raise an interesting question.

Site.ids 293/2417 and 2414/100/3817 have the same site name and a very similar - but importantly not the same! - geolocation. My rule was that a unique geolocation received a unique site.id number. I did not set a threshold such that two sites within a specific distance from each other should be lumped.
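One way to see how close such same-name sites are is to compute the great-circle distance between them and flag pairs that fall under a chosen threshold. The sketch below is only illustrative: the coordinate column names `lat` and `lon`, the 5 km threshold, and the sample rows are all assumptions, not values from sitesf.csv.

```python
import math
from collections import defaultdict
from itertools import combinations


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def near_duplicate_sites(rows, threshold_km):
    """Return (name, id_a, id_b, distance_km) for same-name sites that have
    different site.id values but lie within threshold_km of each other."""
    by_name = defaultdict(list)
    for row in rows:
        by_name[row["site.sitename"]].append(row)

    flagged = []
    for name, group in by_name.items():
        for a, b in combinations(group, 2):
            if a["site.id"] == b["site.id"]:
                continue
            d = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"])
            if d <= threshold_km:
                flagged.append((name, a["site.id"], b["site.id"], d))
    return flagged


# Made-up rows for illustration only; "lat"/"lon" are assumed column names.
rows = [
    {"site.id": 1, "site.sitename": "Site A", "lat": 18.30, "lon": -65.80},
    {"site.id": 2, "site.sitename": "Site A", "lat": 18.31, "lon": -65.80},
    {"site.id": 3, "site.sitename": "Site B", "lat": -2.00, "lon": -48.00},
]
flagged = near_duplicate_sites(rows, threshold_km=5.0)
for name, id_a, id_b, dist in flagged:
    print(f"{name}: site.id {id_a} and {id_b} are {dist:.2f} km apart")
```

Whether flagged pairs should actually be lumped is a judgment call; the check only lists candidates.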

@teixeirak too - what did you do for ForC? Do we need some sort of adjustment here?

teixeirak commented 5 years ago

In ForC they would each have a separate plot name, would fall within the same geographic.area, and would be lumped in any of the analyses we've run so far.

ValentineHerr commented 5 years ago

@CookPatton, sorry if you already answered this somewhere, but can you remind me why some sites have the same site.id but different study.id? Most of them, but not all, have the same coordinates, and some of them have a different site.name.

CookPatton commented 5 years ago

@ValentineHerr I think I emailed you a response separately because I was traveling, so no worries about asking again via GitHub. Sometimes multiple papers had data from the same site (same geolocation), so to avoid pseudoreplication I gave them the same site identifier.

They should all have the same coordinates.
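The expectation that every row sharing a site.id also shares coordinates can be checked mechanically. This is a minimal sketch, assuming the rows have already been read into dictionaries and that the coordinate columns are called `lat` and `lon` (hypothetical names); the sample rows are made up.

```python
from collections import defaultdict


def ids_with_multiple_coordinates(rows):
    """Map each site.id that appears with more than one (lat, lon) pair
    to the sorted list of distinct pairs found for it."""
    coords = defaultdict(set)
    for row in rows:
        coords[row["site.id"]].add((row["lat"], row["lon"]))
    return {sid: sorted(pairs) for sid, pairs in coords.items() if len(pairs) > 1}


# Made-up rows for illustration only; "lat"/"lon" are assumed column names.
rows = [
    {"site.id": 10, "study.id": 1, "lat": 1.0, "lon": 2.0},
    {"site.id": 10, "study.id": 2, "lat": 1.0, "lon": 2.5},  # same id, different lon
    {"site.id": 11, "study.id": 3, "lat": 5.0, "lon": 6.0},
    {"site.id": 11, "study.id": 4, "lat": 5.0, "lon": 6.0},  # same id, same coordinates
]
conflicts = ids_with_multiple_coordinates(rows)
for sid, pairs in conflicts.items():
    print(f"site.id {sid} has {len(pairs)} distinct coordinate pairs: {pairs}")
```

An empty result would mean each site.id maps to exactly one geolocation, as intended.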

ValentineHerr commented 5 years ago

Hmm... @CookPatton, I have the following site.id with different coordinates:

Also, is there a reason why, for example, site.id 100 has site.sitename "Luquillo Experimental Forest" in sitesf.csv but "COMPARISON OF TROPICAL TREE PLANTATIONS WITH SECONDARY FORESTS OF SIMILAR AGE" in nonsoil_litter_CWD.csv ?

ValentineHerr commented 5 years ago

I wrote a more complete issue about this here: #17

CookPatton commented 5 years ago

@ValentineHerr I just got your issue about site.sitename being different in sitesf.csv and nonsoil_litter_CWD.csv. That's another error on my end. nonsoil_litter_CWD.csv pulled the title of the paper rather than the site name. I'll fix that on my end too.

ValentineHerr commented 5 years ago

Ok, thanks @CookPatton, let me know when you push the updated version. And just because I am looking at it just now, can you explain this:

in sitesf.csv:

site.id study.id site.sitename
46 9020 Eastern Para 1
47 9020 Eastern Para 2
48 9020 Eastern Para 3

But in nonsoil_litter_CWD.csv:

site.id study.id sites.sitename
46 9020 Eastern Pará 1
46 9020 Eastern Pará 2
46 9020 Eastern Pará 3
47 9020 Eastern Pará 3
47 9020 Eastern Pará 1
47 9020 Eastern Pará 2
48 9020 Eastern Pará 3
48 9020 Eastern Pará 2
48 9020 Eastern Pará 1

CookPatton commented 5 years ago

The site.id column in nonsoil_litter_CWD.csv was jumbled. You caught an error, and I have fixed it on my end.
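A jumbled site.id of this kind can be detected by comparing each (site.id, site name) pair in the measurement file against the site table. The sketch below uses the rows quoted earlier in this thread; the function names are hypothetical, the name column is assumed to be called `site.sitename` in both files, and accents are stripped before comparing because sitesf.csv has "Para" where the other file has "Pará".

```python
import unicodedata


def normalize_name(name):
    """Strip accents, collapse whitespace, and casefold so that
    'Eastern Pará 1' and 'Eastern Para 1' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.split()).casefold()


def mismatched_pairs(site_rows, measurement_rows):
    """Return (site.id, site name) pairs from measurement_rows whose name
    does not match the name recorded for that site.id in site_rows."""
    expected = {row["site.id"]: normalize_name(row["site.sitename"]) for row in site_rows}
    bad = []
    for row in measurement_rows:
        sid = row["site.id"]
        if expected.get(sid) != normalize_name(row["site.sitename"]):
            bad.append((sid, row["site.sitename"]))
    return bad


# Rows as quoted in this thread for study.id 9020.
sites = [
    {"site.id": 46, "site.sitename": "Eastern Para 1"},
    {"site.id": 47, "site.sitename": "Eastern Para 2"},
    {"site.id": 48, "site.sitename": "Eastern Para 3"},
]
measurements = [
    {"site.id": sid, "site.sitename": f"Eastern Pará {n}"}
    for sid, n in [(46, 1), (46, 2), (46, 3), (47, 3), (47, 1), (47, 2), (48, 3), (48, 2), (48, 1)]
]
bad = mismatched_pairs(sites, measurements)
for sid, name in bad:
    print(f"site.id {sid} is paired with unexpected name {name!r}")
```

On the quoted rows, only the three pairs that agree with sitesf.csv pass; the other six are reported.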