SCBI-ForestGEO / 2023census

Repository for the 2023 recensus of the SCBI ForestGEO plot
Creative Commons Attribution 4.0 International

Define tests to include in ArcGIS app vs GitHub CI #3

Closed teixeirak closed 1 year ago

teixeirak commented 1 year ago

We need to come up with lists of tests that are/can be included in the ArcGIS app vs those that should be programmed in GitHub CI system.

teixeirak commented 1 year ago

I've started a tentative list in this doc.

teixeirak commented 1 year ago

Note: There are presumably some checks that could be done in either the ArcGIS app or GitHub CI. It's preferable to have as many as feasible in the field, but the CI system would be useful for ones that are hard to code. This is something we'll want to discuss with Stuart, Sean, and others.

ValentineHerr commented 1 year ago

@jess-shue, if you need help with this, I'd be happy to look at the ArcGIS app with you and help you figure out how to code the tests. (Full disclosure: I have never seen the interface you are working with, and I don't even know the coding language needed, but I am confident I can provide some help if you don't get it elsewhere)

jess-shue commented 1 year ago

@ValentineHerr Thank you! Right now, my main issue is that I can build the app, but not with the relate that Milton has created for BCI between trees and stems - I'm missing a detail on how he did it. I will definitely reach out once I cross a few other hurdles; thanks again!

teixeirak commented 1 year ago

Here are Suzy Sine's comments about errors found using the app on BCI:

Since we were the first ones to test the app, our experience with BCI does not mean you will encounter the same problems. Following are some observations.

  1. Make sure that you include ALL the stems that you want recensused, and make sure that all of them get uploaded into the app. I am not really sure why some plants were not uploaded (Milton and David can better explain this to you). It may have to do with the coordinates not coinciding with the quadrat they were supposed to be in, or that some dbhs did not fall within the dbh range set up in the app. I believe that this "bug" may have been fixed, but I still check.
  2. The main stem appears in the main form, while the multiple stems of multiple-stemmed trees appear in a secondary form. I find that some secondary stems do not get measured. The app lets you know which trees have not been recensused yet by the color, but it does not let you know if a multiple stem was not measured.
  3. Your R script should check for diameters that increase too much (growth rate too high) or decrease too much without an explanation. The app does not check for this.
  4. At BCI, we do not include all the trees in the database in the census form. We only include trees that are alive and those that died the previous census. So we check that the tags used by new recruits have not already been used in the database by a tree that had died.
  5. Check for duplicate stem tags. The app does not check for this.
  6. I check for dbhs that are 0 (not measured), they should have a code explaining why it was not measured (dead, stem broke off, stem does not reach 1.3 m in case of multiple-stemmed trees, etc.).

@ValentineHerr , we'll want to be sure to include all of these in our GitHub CI.
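
As an illustration of items 4 and 5 above, here is a minimal base R sketch of what two of these checks could look like. The column names (`tag`, `StemTag`) and the toy values are assumptions made for the example, not the actual structure of the census data or of the CI script.

```r
# Hypothetical census records; column names and values are made up.
census <- data.frame(
  tag     = c(101, 101, 102, 103, 103),
  StemTag = c(1,   2,   1,   1,   1),
  dbh     = c(120, 45,  300, 88,  90)
)

# Item 5: each tag/StemTag combination should appear only once.
stem_id <- paste(census$tag, census$StemTag)
duplicated_stems <- census[stem_id %in% stem_id[duplicated(stem_id)], ]

# Item 4: tags given to new recruits must not already exist in the
# database, including tags of trees that died in an earlier census.
database_tags <- c(101, 102, 103, 250)  # every tag ever used (hypothetical)
recruit_tags  <- c(250, 400)            # tags assigned to new recruits
reused_tags   <- recruit_tags[recruit_tags %in% database_tags]

print(duplicated_stems)
print(reused_tags)
```

Both rows of a duplicated pair are returned so that whoever reviews the report can see the conflicting records side by side.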

teixeirak commented 1 year ago

@jess-shue , are you at the point where you could provide some dummy output data so that @ValentineHerr can start preparing the GitHub CI system?

jess-shue commented 1 year ago

@teixeirak Yes, but I think this would be a good time to touch base and go over a few things for our workflow. Would you have time next week for a demonstration? I could use your opinion/thoughts on things - and really David's as well since he'll be leading things. There is still a lot I don't know/understand regarding the back-end of the data and how they're dealing with field corrections. I wasn't able to meet with Milton this week because he is under the weather.

teixeirak commented 1 year ago

Hi Jess, Valentine, and David,

I’m moving this conversation to email since David’s not on GitHub yet. If we want to do this demo before David arrives (March 9?), next week is probably the best time, as I’ll be traveling the following Wed-Sun (March 1-5). Tuesday, Wed, and Friday are pretty open for me.

K


jess-shue commented 1 year ago

Thanks Krista, Yes, I think it would be great to meet now, and go over what I have so far. Having input from you and especially David will be important to finalizing things - and will help me put a list together of things to ask Milton about.

I have Tuesday before 1 pm open, Wednesday before 10 am, or Friday any time. Thanks, Jess



ValentineHerr commented 1 year ago

Coming back to Suzanne's email:

  1. Make sure that you include ALL the stems that you want recensused, and make sure that all of them get uploaded into the app. I am not really sure why some plants were not uploaded (Milton and David can better explain this to you). It may have to do with the coordinates not coinciding with the quadrat they were supposed to be in, or that some dbhs did not fall within the dbh range set up in the app. I believe that this "bug" may have been fixed, but I still check.

@jess-shue, what are the filters that you are applying to the 3rd census data to populate the 2023 form? I just want to make sure that it is a minimum filtering, so all trees get uploaded. (I understand that filtering out trees that were "DN" makes sense.)

  2. The main stem appears in the main form, while the multiple stems of multiple-stemmed trees appear in a secondary form. I find that some secondary stems do not get measured. The app lets you know which trees have not been recensused yet by the color, but it does not let you know if a multiple stem was not measured.

@jess-shue, I am not sure you can do much about this... My script will detect stems that were missed in a quadrat, regardless of whether other stems of the same tree were measured. I suspect it will be trickiest when a stem is gone and the crew won't even know they are missing it.

3. Your R script should check for diameters that increase too much (growth rate too high) or decrease too much without an explanation. The app does not check for this.

Can't remember... did you manage to get this in your app @jess-shue? I'll plan to code for it if not.

I think items 4-6 are all for me to code.
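
The missed-stem check described in this comment could be sketched roughly as follows. This is only an illustration: the column names, the quadrat labels, and the rule that only quadrats with at least one recorded stem are checked are all assumptions, not taken from the actual script.

```r
# Stems expected from the previous census (hypothetical columns and values).
expected <- data.frame(
  quadrat = c("0101", "0101", "0101", "0102"),
  tag     = c(101, 101, 102, 201),
  StemTag = c(1, 2, 1, 1)
)

# Stems recorded so far in the current census.
recorded <- data.frame(
  quadrat = c("0101", "0101"),
  tag     = c(101, 102),
  StemTag = c(1, 1)
)

# Assumption: a quadrat is only checked once at least one stem is recorded in it.
started <- unique(recorded$quadrat)

# Build a per-stem key so each stem is compared individually,
# whether or not other stems of the same tree were measured.
key <- function(d) paste(d$quadrat, d$tag, d$StemTag)
missed <- expected[expected$quadrat %in% started &
                     !(key(expected) %in% key(recorded)), ]

print(missed)
```

Comparing at the stem level rather than the tree level is what lets a missing secondary stem show up even when the main stem of the same tree was measured.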

jess-shue commented 1 year ago

Hi @ValentineHerr:

  1. The only 'filtering' I applied was to create the mortality category for each stem. Otherwise, all stems should be included. I didn't remove DN stems either in case of resprouting, and to avoid something not being in the app.
  2. I've arranged things slightly differently for SCBI so that there is a 'tree' or main stem form, but all measurements will be recorded using the 'stem' form (even the main stem) so that all mortality data are also collected in one place. I hope the workflow will allow folks to see the list of stems for a multi-stemmed tree along with its census_status (finished, not started, etc.). Then, on the main page they will select the 'tree' census_status (finished, not started, etc.). I'm not sure this is the best way to do this, but these are my thoughts on it for now.
  3. I have included an error message when the DBH decreases by 0.5 cm or increases by 4 cm - the 4 cm was a standard used for excess growth at Lilly Dickey Woods last summer and I believe it will be a good benchmark at SCBI. My main concern with this is that I'd like the error to distinguish between dead standing and living trees, but right now that status isn't taken into account. It is only an error message, so it won't hinder folks from collecting data. Since dead standing DBH will be collected I think this is our best option, but will still need to be double checked on the back-end.
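
For the back-end double check mentioned in item 3, a minimal R sketch of the same absolute thresholds might look like this. The column names and status labels are hypothetical, and reading the rule as "more than 0.5 cm decrease or more than 4 cm increase" is an assumption. Since the thread does not say how dead standing stems should be treated, the sketch only reports the status next to each flag.

```r
# Hypothetical stem records, DBH in cm as entered by the field crew.
stems <- data.frame(
  tag             = c(1, 2, 3, 4),
  status          = c("alive", "alive", "dead standing", "alive"),
  dbh_previous_cm = c(10.0, 20.0, 15.0, 30.0),
  dbh_current_cm  = c(10.8, 25.0, 14.0, 29.8)
)

max_decrease_cm <- 0.5  # threshold quoted in the comment above
max_increase_cm <- 4    # threshold quoted in the comment above

change <- stems$dbh_current_cm - stems$dbh_previous_cm
stems$flag <- change < -max_decrease_cm | change > max_increase_cm

# Keep the status next to each flag so living and dead standing
# stems can be reviewed separately.
flagged <- stems[stems$flag, c("tag", "status")]
print(flagged)
```
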

ValentineHerr commented 1 year ago

I've started a tentative list in this doc.

@teixeirak, I don't see the file you are referring to. Also, I only see a 2018 protocol. Did you mean to update that?

teixeirak commented 1 year ago

That was intended to point here: https://github.com/SCBI-ForestGEO/2023census/blob/main/doc/methods.md. Apparently I never actually started the list!

It would be great if you could update that doc (or some other) to list the tests you code.

ValentineHerr commented 1 year ago

@jess-shue what units will the dbh be measured with? If not mm, can they be in mm in the back end, to be consistent with previous censuses? (I can code it if necessary, but I need to know what they are recorded in.)

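@teixeirak, I am going to flag the following cases:

* `previous_dbh * 0.972 > current_dbh` (so allowing a decrease of roughly 3%)
* `previous_dbh * 1.125 < current_dbh` (so flagging when growth is greater than 12.5%)

I got the numbers by looking at the median of all positive (resp. negative) "naive" growths from census 1 to census 2, and census 2 to census 3: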

x = (current - previous)/previous
> 1 + median(x[x>0])
[1] 1.125
> 1 + median(x[x<0])
[1] 0.9722222

let me know if that sounds good to you.

ValentineHerr commented 1 year ago

6. I check for dbhs that are 0 (not measured), they should have a code explaining why it was not measured (dead, stem broke off, stem does not reach 1.3 m in case of multiple-stemmed trees, etc.).

@teixeirak and @jess-shue, can you double check me here?

I understand I need to flag cases where current dbh is 0 but the following codes are not listed in the codes column:

* C: dead above 1.3 m
* X: stem broken below 1.3 m

The complete list of codes is:

| code | def |
|------|-----|
| CL | clear list |
| A | alternate HOM |
| B | broken above 1.3 m |
| C | dead above 1.3 m |
| F | within deer exclosure |
| G | ID to Genus uncertain |
| I | stem irregular where measured |
| J | stem bent |
| L | stem leaning |
| main | main stem |
| M | multiple stems |
| S | secondary stem |
| P | stem prostrate |
| RT | replace tag |
| NN | new nail or wire |
| R | resprout |
| X | stem broken below 1.3 m |
| TR | tag removed |
| WR | wire removed |
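
A minimal sketch of the zero-dbh check discussed in this comment, assuming (hypothetically) that the codes column stores several codes separated by semicolons and that C and X are the only codes that explain an unmeasured stem. Both points were still open questions in the thread.

```r
# Hypothetical records; the codes column format is an assumption.
stems <- data.frame(
  tag   = c(1, 2, 3, 4),
  dbh   = c(0, 0, 0, 152),
  codes = c("C", "M;X", "L", "M"),
  stringsAsFactors = FALSE
)

explaining_codes <- c("C", "X")

# Split the codes and match them exactly, so that a code such as "CL"
# is not mistaken for "C".
has_explanation <- vapply(
  strsplit(stems$codes, ";", fixed = TRUE),
  function(codes) any(codes %in% explaining_codes),
  logical(1)
)

unexplained_zero <- stems[stems$dbh == 0 & !has_explanation, ]
print(unexplained_zero)
```

Exact matching on the split codes is used instead of a pattern search because several codes in the list share letters.
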

jess-shue commented 1 year ago

@jess-shue what units will the dbh be measured with? If not mm, can they be in mm in the back end, to be consistent with previous censuses? (I can code it if necessary, but I need to know what they are recorded in.)


I can do mm - I had originally used mm because of the data. However, the 2018 protocol said to collect in cm. @teixeirak What would you prefer?

teixeirak commented 1 year ago

@teixeirak, I am going to flag the following cases:

* `previous_dbh * 0.972 > current_dbh` (so allowing a decrease of roughly 3%)

* `previous_dbh * 1.125 < current_dbh` (so flagging when growth is greater than 12.5%).

I got the numbers by looking at median of all positive (resp. negative) "naive" growths from census 1 to census 2, and census 2 to census 3:

x = (current - previous)/previous
> 1 + median(x[x>0])
[1] 1.125
> 1 + median(x[x<0])
[1] 0.9722222

let me know if that sounds good to you.

Sorry for the slow response. You're saying we'd flag as an error anything above/below the median % growth/shrinkage? Wouldn't that be flagging about 1/2 the measurements? I think there's something I'm not understanding.
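
The concern raised here can be checked on a small made-up example. The sketch below uses invented DBH values (not SCBI data) and the same formula as the quoted snippet. Because the cut-offs are the medians of the positive and negative changes, about half of the stems on each side fall beyond them.

```r
# Invented previous/current DBH values, for illustration only.
previous <- c(100, 120, 150, 200, 250, 300, 80, 90)
current  <- c(105, 150, 156, 230, 245, 290, 79, 60)

x <- (current - previous) / previous
upper <- 1 + median(x[x > 0])  # median relative growth
lower <- 1 + median(x[x < 0])  # median relative shrinkage

ratio   <- current / previous
flagged <- ratio > upper | ratio < lower

# Share of stems flagged with median-based cut-offs.
print(mean(flagged))
```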

teixeirak commented 1 year ago

I can do mm - I had originally used mm because of the data. However, the 2018 protocol said to collect in cm. @teixeirak What would you prefer?

I don't have a strong preference. (As you say, conversion is easy.) @jess-shue , do you find that one or the other works better for the census crew?

teixeirak commented 1 year ago

  6. I check for dbhs that are 0 (not measured), they should have a code explaining why it was not measured (dead, stem broke off, stem does not reach 1.3 m in case of multiple-stemmed trees, etc.).

@teixeirak and @jess-shue, can you double check me here?

I understand I need to flag cases where current dbh is 0 but the following codes are not listed in the codes column:

* C: dead above 1.3m

* X: stem broken below 1.3 m.

The complete list of codes is:

| code | def |
|------|-----|
| CL | clear list |
| A | alternate HOM |
| B | broken above 1.3 m |
| C | dead above 1.3 m |
| F | within deer exclosure |
| G | ID to Genus uncertain |
| I | stem irregular where measured |
| J | stem bent |
| L | stem leaning |
| main | main stem |
| M | multiple stems |
| S | secondary stem |
| P | stem prostrate |
| RT | replace tag |
| NN | new nail or wire |
| R | resprout |
| X | stem broken below 1.3 m |
| TR | tag removed |
| WR | wire removed |

Sorry, I'm not sure what you're asking here. (Maybe it's just too early in the AM and my brain is fried from conference!)

ValentineHerr commented 1 year ago

Sorry for the slow response. You're saying we'd flag as an error anything above/below the median % growth/shrinkage? Wouldn't that be flagging about 1/2 the measurements? I think there's something I'm not understanding.

Oh, right sorry, I had meant to change median to an extreme quantile.....

Thinking of using this...

> 1 + quantile(x[x>0], 0.97)
 97%
1.92
> 1 + quantile(x[x<0], 0.03)
      3%
0.757355

We can adjust the numbers when we get a sense of how many measurements it will flag.
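
One way to get that sense ahead of time is to compute the cut-offs and count how many stems they would flag. The sketch below does this on simulated data (the growth distribution is invented, not fitted to SCBI censuses), using `quantile()` with the same 0.97 and 0.03 probabilities as above.

```r
set.seed(1)

# Simulated previous/current DBH values; purely illustrative.
previous <- runif(1000, min = 10, max = 500)
current  <- previous * exp(rnorm(1000, mean = 0.03, sd = 0.1))

x <- (current - previous) / previous
upper <- 1 + quantile(x[x > 0], 0.97, names = FALSE)
lower <- 1 + quantile(x[x < 0], 0.03, names = FALSE)

ratio   <- current / previous
flagged <- ratio > upper | ratio < lower

# Share of stems that would need to be revisited with these cut-offs.
share_flagged <- mean(flagged)
print(c(lower = lower, upper = upper, share_flagged = share_flagged))
```

Rerunning this with other probabilities shows how quickly the number of records to revisit changes.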

ValentineHerr commented 1 year ago

I can do mm - I had originally used mm because of the data. However, the 2018 protocol said to collect in cm. @teixeirak What would you prefer?

I don't have a strong preference. (As you say, conversion is easy.) @jess-shue , do you find that one or the other works better for the census crew?

Maybe David can decide. But it would be great if the data was in mm in the backend regardless.

jess-shue commented 1 year ago

I can do mm - I had originally used mm because of the data. However, the 2018 protocol said to collect in cm. @teixeirak What would you prefer?

I don't have a strong preference. (As you say, conversion is easy.) @jess-shue , do you find that one or the other works better for the census crew?

Maybe David can decide. But it would be great if the data was in mm in the backend regardless.

This is something built into the app and I really don't want to have to change this next week while the field crew is gearing up. I spoke to Krista via text - I have it recorded in cm by the field crew with a hidden calculation field to convert it to mm - so you'd have both fields available.

OK?

ValentineHerr commented 1 year ago

ok perfect!

teixeirak commented 1 year ago

We can adjust the numbers when we get a sense of how many measurements it will flag.

That's still flagging ~6% of measurements as potentially erroneous, which would be a lot of records to revisit!

@jess-shue , is there a control in the ArcGIS app?

ValentineHerr commented 1 year ago

The list of checks is located in this file. The file is used by the script (there is R code in it).

All the reports are generated in [this folder](https://github.com/SCBI-ForestGEO/2023census/tree/main/QAQC_reports).

Now, if the GitHub action is not passing, it means that the action itself or the script is broken, not that there are errors in the data. I think this makes more sense, because it separates when we need to fix the script (the person who pushed something receives an email saying the action failed) from when we need to fix the data (the dashboard indicates if there are errors and warnings).
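
A rough sketch of this design, under stated assumptions: the function name `run_qaqc`, the example check, and the `errors.csv` file name are hypothetical and are not taken from the repository. The point is only that data problems are written to a report and the script still finishes normally, while an uncaught R error makes `Rscript` exit with a non-zero status and so fails the action step.

```r
# Hypothetical QA/QC runner: writes data errors to a report instead of stopping.
run_qaqc <- function(census, report_dir) {
  dir.create(report_dir, showWarnings = FALSE, recursive = TRUE)

  # Example check (made up for the sketch): dbh must not be negative.
  errors <- census[census$dbh < 0, ]
  write.csv(errors, file.path(report_dir, "errors.csv"), row.names = FALSE)

  # Return the number of data errors; do not call stop() for them.
  invisible(nrow(errors))
}

census     <- data.frame(tag = c(1, 2), dbh = c(120, -5))
report_dir <- file.path(tempdir(), "QAQC_reports")
n_errors   <- run_qaqc(census, report_dir)

message(n_errors, " data error(s) written to the report; script finished normally")
```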

jess-shue commented 1 year ago

  3. I have included an error message when the DBH decreases by 0.5 cm or increases by 4 cm - the 4 cm was a standard used for excess growth at Lilly Dickey Woods last summer in FFFs, using that as a reference. My main concern with this is that I'd like the error to distinguish between dead standing and living trees, but right now that status isn't taken into account. It is only an error message, so it won't hinder folks from collecting data. Since dead standing DBH will be collected I think this is our best option, but will still need to be double checked on the back-end.