ropensci / allodb

An R package for biomass estimation at extratropical forest plots.
https://docs.ropensci.org/allodb/
GNU General Public License v3.0

Deal with missing species-codes #43

Closed maurolepore closed 6 years ago

maurolepore commented 6 years ago

Relates to https://github.com/forestgeo/allodb/issues/36.

For a few sites whose species lists I got from the ForestGEO website, I don't have species codes.

Here are some ways I think we may deal with this issue:

  1. Throw a warning indicating the species that match the user's data for which allodb has no species-code. The match may be by species name -- if the user provides species data -- or by site or region, in which case we can list all species in that site or region with unknown species-code.

  2. Provide some way of filling the gap, e.g. ask the user to provide a table with species-names and species-codes.

Could the opposite problem happen? That is, can the user provide codes that don't match any code in allodb? We could deal with this in a way similar to that described above.
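The two checks and the gap-filling idea above could look roughly like this. It is a minimal sketch in Python, not allodb code: the function names `find_code_gaps` and `fill_codes`, the dictionary layout (species name mapped to a code, or `None` when the code is missing), and the example species and codes are all assumptions made for illustration.

```python
import warnings


def find_code_gaps(user_species, user_codes, known_codes):
    """Report gaps in both directions (hypothetical helper).

    known_codes maps species name -> species-code, or None if no code is stored.
    """
    # Species in the user's data that are absent from the table or have no code.
    missing = sorted(sp for sp in set(user_species) if known_codes.get(sp) is None)

    # The opposite problem: user codes that match no stored code.
    valid = {code for code in known_codes.values() if code is not None}
    unknown = sorted(set(user_codes) - valid)

    if missing:
        warnings.warn("No species-code for: " + ", ".join(missing))
    if unknown:
        warnings.warn("Codes not found in the stored table: " + ", ".join(unknown))
    return missing, unknown


def fill_codes(known_codes, user_lookup):
    """Fill missing codes from a user-supplied table without overriding stored ones."""
    filled = dict(known_codes)
    for species, code in user_lookup.items():
        if filled.get(species) is None:
            filled[species] = code
    return filled


# Illustrative data only.
stored = {"Acer rubrum": "ACRU", "Quercus alba": None}
missing, unknown = find_code_gaps(["Acer rubrum", "Quercus alba"], ["ACRU", "XXXX"], stored)
stored = fill_codes(stored, {"Quercus alba": "QUAL"})
```

Warning rather than failing keeps the user's workflow running while still naming the species or codes that need attention.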

gonzalezeb commented 6 years ago

To remember: we need the species names from the view_taxonomy table, because sites may use only R tables to run allodb (the package could store the view_taxonomy tables of all sites; request them from Suzanne).

maurolepore commented 6 years ago

@laosuz,

(cc @gonzalezeb)

Is it possible to get the ViewTaxonomy tables from all sites? We are working on a project that needs users to provide species names and DBH. We could ask them for a stem or tree table AND a ViewTaxonomy table, but we could make their lives easier if we already had the ViewFullTables from everywhere and asked users to provide only census data.

laosuz commented 6 years ago

Hi Mauro,

I can provide them to you only after every principal investigator authorizes it, or if Stuart says it is OK. Please ask Stuart first.

Best, Suzanne

gonzalezeb commented 6 years ago

@maurolepore We actually have spcode for multiple sites (a list provided by Suzanne back in April), so we may not need the ViewTaxonomy tables. For sites for which I have no sp codes, I will contact the PIs directly; I think there are very few (maybe ~5).

@laosuz, is that OK with you, Suzanne? I won't publish the list; I will extract the spcodes for the sites that have confirmed participation in our paper (33 sites to date) or for those whose species lists are public on the ForestGEO website...

laosuz commented 6 years ago

Hi Erika,

I don’t recall what species list I gave you, but yes, please do not publish it unless the PIs from those sites have authorized you to use the list. With respect to those on the ForestGEO website, I’m not completely sure if they are all public, but as long as they are participating in your research, it should be OK.

Unfortunately, we have had complaints from PIs that we give out data without the PI’s knowledge, even if the data has been published. We have had complaints too about publishing species lists on the website without their consent. So do make sure that they know that you have their species list.

Thanks! Suzanne

tylerlittlefield commented 6 years ago

@maurolepore I'm not sure if this helps, but point number 2 sounded very familiar to me. i-Tree provides a list of species along with their species code and region. I use it to calculate eco benefits, and it requires guessing the species codes for user data that can't immediately be joined to i-Tree's list. So if you need data that can help fill in the gap, it might be worthwhile to check out i-Tree.

maurolepore commented 6 years ago

Thanks @tyluRp for pointing us in that direction. @gonzalezeb, you may have more background to assess if i-Tree tools would be useful, although I'm afraid ForestGEO codes may be too idiosyncratic?

gonzalezeb commented 6 years ago

Thanks @tyluRp for pointing us to this tool, I didn't know about it. I still don't know how to deal with this issue, as species codes are specific to ForestGEO sites (a code is not a unique identifier for a species in our dataset).

But maybe Mauro's point 1 is the way to go: the user provides species names and the match is done by species name, or by site or region, in which case we can list all species in that site or region with unknown species-code.
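The fallback described here (match by species name when names are available, otherwise by site) might be sketched as follows. This is illustrative Python under assumed names: `unknown_code_species`, the `site`/`species`/`code` keys, and the example rows are hypothetical and do not reflect how allodb stores its tables.

```python
def unknown_code_species(table, species=None, site=None):
    """List species with no stored code, matched by name or, failing that, by site.

    table is a list of dicts with the assumed keys "site", "species" and "code".
    """
    if species is not None:
        # Preferred path: the user supplied species names.
        wanted = set(species)
        rows = [row for row in table if row["species"] in wanted]
    elif site is not None:
        # Fallback: list every species recorded for the user's site.
        rows = [row for row in table if row["site"] == site]
    else:
        raise ValueError("Provide species names or a site.")
    return sorted({row["species"] for row in rows if row["code"] is None})


# Illustrative rows only.
table = [
    {"site": "site_a", "species": "Acer rubrum", "code": "ACRU"},
    {"site": "site_a", "species": "Quercus alba", "code": None},
    {"site": "site_b", "species": "Fagus grandifolia", "code": None},
]
by_name = unknown_code_species(table, species=["Acer rubrum", "Quercus alba"])
by_site = unknown_code_species(table, site="site_b")
```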

maurolepore commented 6 years ago

Closing. This issue seems to have already captured all we need to inform our decisions.

In summary, @gonzalezeb will request species-codes from PIs, and I'll write the code to be as flexible as possible. Here is a draft of the logic:

User must provide a census table (with dbh).

1. User provides a codes-lookup table?
    1.1. Yes: Use it.
    1.2. No: Do we have a codes-lookup table stored in allodb?
        1.2.1. Yes: Use it.
        1.2.2. No: Throw error: "Can't lookup codes. Provide a codes-lookup table".
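The draft logic above can be sketched as a single function. This is a minimal Python illustration rather than the package's implementation: the name `choose_codes_lookup`, its parameters, and the representation of tables as lists of dicts are assumptions; only the order of the checks and the error message come from the draft.

```python
def choose_codes_lookup(census, user_lookup=None, stored_lookup=None):
    """Pick the codes-lookup table following the drafted decision order."""
    # The user must provide a census table (with dbh).
    if not census or any("dbh" not in row for row in census):
        raise ValueError("Provide a census table with a dbh column.")

    # 1.1. The user provided a codes-lookup table: use it.
    if user_lookup is not None:
        return user_lookup

    # 1.2.1. Otherwise use a stored codes-lookup table, if one exists.
    if stored_lookup is not None:
        return stored_lookup

    # 1.2.2. Nothing to look codes up in.
    raise LookupError("Can't lookup codes. Provide a codes-lookup table.")


# Illustrative data only.
census = [{"species": "Acer rubrum", "dbh": 12.5}]
stored = {"Acer rubrum": "ACRU"}
lookup = choose_codes_lookup(census, stored_lookup=stored)
```

Giving the user's table precedence over the stored one matches the draft's order and lets sites with idiosyncratic codes override whatever is stored.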