seanmcm / RDendrom

R package for analysis, diagnostics, and presentation of intra-annual tree growth time series

Questions about the package #1

Closed mcgregorian1 closed 5 years ago

mcgregorian1 commented 5 years ago

Hi @seanmcm

I figured putting something here would be easier than sending you a long email. These are just some notes I'm finding as I'm going through your vignette / trying to use your package.

General

  1. When you’re describing the columns needed for the functions, you reference “BAND_NO” a couple times despite your script data having the column “BAND_NUM.”

get.optimized.dendro

smaller notes

seanmcm commented 5 years ago

Thanks. I'm going to work through this. The most important thing is that the package won't load functions without help files. I found that out yesterday and finished the help files last night. Let me go through your notes, re-load, test, and then ping you that it's ready. Thanks a lot!!

seanmcm commented 5 years ago

OK! I have it working now (just code from vignette). Let me know how it goes.

Also... I can't find the typo. Could you give a bit more detail about the "that"? Yes, the links are not complete. The no.neg.growth option kind of has to be on now. It does not eliminate measurements that are smaller than earlier measurements; instead, it skips the curve fitting when the tree shows a negative growth slope through the year. If, for example, the tree has died and is just shrinking gradually, it will detect this with a linear model and not try to fit an LG5 curve with a positive slope (where the asymptote K has to be larger than L).
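To illustrate the no.neg.growth idea described above, here is a minimal sketch. It is not the package's actual code: the function name `has_negative_growth` and the vectors `doy` and `dbh` are hypothetical. It fits a straight line through one year of measurements and reports whether the slope is negative, which is the case where the LG5 fit would be skipped.

```r
# Hypothetical helper: TRUE when a linear fit through the year's
# measurements has a negative slope (e.g. a dead, slowly shrinking tree).
has_negative_growth <- function(doy, dbh) {
  fit <- lm(dbh ~ doy)            # rows with NA are dropped by lm()
  coef(fit)[["doy"]] < 0
}

doy       <- c(100, 150, 200, 250, 300)
shrinking <- c(30.0, 29.9, 29.85, 29.8, 29.7)
growing   <- c(30.0, 30.1, 30.3, 30.4, 30.45)

has_negative_growth(doy, shrinking)
has_negative_growth(doy, growing)
```

A slope test like this only looks at the overall trend, so an individual measurement that is smaller than the one before it does not trigger the skip by itself.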

Yes, running code fills in the DBH and makes a DBH_TRUE. DBH is made from GAP_WIDTH and ORG_DBH. DBH_TRUE is made from correcting for the chord (which is what the calipers measure, and not the arc of the tree bole). If you need this changed, let me know. I can customize easily.
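The chord-versus-arc correction can be sketched as follows. This is an illustration under stated assumptions, not the package's gap2dbh() implementation: it assumes a band of fixed length, a circular bole, and gap widths and diameters in the same units. The helper names `arc_from_chord` and `dbh_from_gap` are hypothetical, and `uniroot()` is used here for the root search. The upper search bound mirrors the `dbh1 + 2 * gw2` interval that appears in the error message later in this thread.

```r
# Arc length spanned by a straight chord on a circle of the given diameter.
arc_from_chord <- function(chord, diameter) {
  diameter * asin(chord / diameter)
}

# Hypothetical helper: diameter at which a band of fixed length, installed at
# diameter `dbh1` with caliper gap `gap1`, would show caliper gap `gap2`.
dbh_from_gap <- function(dbh1, gap1, gap2) {
  # Length of bole circumference covered by the band at installation.
  band_length <- pi * dbh1 - arc_from_chord(gap1, dbh1)
  # Zero when the band plus the new gap arc exactly wraps diameter d.
  mismatch <- function(d) pi * d - arc_from_chord(gap2, d) - band_length
  uniroot(mismatch, lower = max(gap1, gap2), upper = dbh1 + 2 * gap2,
          tol = 1e-10)$root
}

corrected <- dbh_from_gap(dbh1 = 30, gap1 = 1, gap2 = 1.5)
naive     <- 30 + (1.5 - 1) / pi   # treats the caliper chord as if it were arc
c(corrected = corrected, naive = naive)
```

For a gap that is small relative to the diameter, the two estimates are very close; the correction matters more as the gap grows relative to the bole.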

mcgregorian1 commented 5 years ago

Ok, sounds good. I'll try running the code.

Regarding the no.neg.growth, ok that makes sense.

The DBH vs DBH_TRUE is fine! I asked because I was getting a bit confused from the vignette, as you mention both "TRUE_DBH" and "DBH_TRUE," and I wasn't sure that DBH_TRUE was actually a separate column made.

My bad on that. The typo is here: image

mcgregorian1 commented 5 years ago

Hi Sean, Apologies for the length of this. Let me know if you'd like me to elaborate on anything.

1. general

2. running code to get output files

  1. In your vignette, you say that the only necessary columns for running the code are: " What must be included is the ‘TREE_ID’, ‘BAND_NO’, ‘YEAR’, ‘ORG_DBH’, ‘GAP_WIDTH’, and the four STATUS columns." I did this and ran the code, and I got the following error:

    > get.optimized.dendro(test_intra, OUTPUT.folder = "C:/Users/mcgregori/Dropbox (Smithsonian)/Github_Ian/Dendrobands/results/McMahon_code_output")
    |=============================================================================================| 100%
    Error in data.frame(SITE = ts.data$SITE[1], YEAR = ts.data$YEAR[1], TREE_ID = ts.data$TREE_ID[1],  : 
    arguments imply differing number of rows: 0, 1, 9

    So, I tried adding a SITE column, and then the code worked. I'm not sure if this was intended, but I wanted to flag it so you can decide if you want to add this to the necessary columns list.

  2. This is minor, but reading your vignette the first time, I was under the impression that my original output files should have the names "Dendro.split", "Dendro.complete", and "Dendro.tree". I didn't realize for a bit that when you introduce these names you're referring to their names once they're loaded back into R. image

To follow on that, compare your vignette code with how I ran it. I think mine might be simpler? I tried running the code with the INPUT.dendro$DATA_SET[1] aspect (both with my data and with INPUT.dendro) but it returned an empty character vector. I removed it and ran the code, which then worked. image

> param.table.name = "Param_table.csv"
> Dendro.data.name = "Dendro_data.Rdata"
> Dendro.split.name = "Dendro_data_split.Rdata"
> OUTPUT <- "C:/Users/mcgregori/Dropbox (Smithsonian)/Github_Ian/Dendrobands/results/McMahon_code_output"
> 
> param.table <- read.csv(file = paste(OUTPUT.folder, param.table.name, sep = "/"))
> load(file = paste(OUTPUT.folder, Dendro.data.name, sep = "/")) # loads Dendro.complete
> load(file = paste(OUTPUT.folder, Dendro.split.name, sep = "/")) #loads Dendro.split
> load(file = paste(OUTPUT.folder, "Dendro_Tree.Rdata", sep = "/")) #loads Dendro.tree
> 
> get.extra.metrics(param.table, Dendro.split, OUTPUT.folder = OUTPUT)
  |=============================================================================================| 100%
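On the naming confusion between output files and the objects they restore: `load()` invisibly returns the names of the objects it creates, so capturing that value shows which object a given .Rdata file holds. The sketch below is self-contained and uses a stand-in data frame saved to a temporary folder; only the file name and object name are taken from the comments above.

```r
# Save a stand-in object under the name used in this thread, then load it
# back and capture the names that load() restores.
Dendro.complete <- data.frame(TREE_ID = 1, YEAR = 2015)
rdata_path <- file.path(tempdir(), "Dendro_data.Rdata")
save(Dendro.complete, file = rdata_path)
rm(Dendro.complete)

loaded <- load(rdata_path)   # load() invisibly returns the restored names
print(loaded)
```

Using `file.path()` instead of `paste(..., sep = "/")` is a small convenience; both build the same path here.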

3. results

I checked the results for my first tree, and it says after 10 years based on band measurements the DBH grew from 611.7743 to 615.0820. I compared this to the raw data we have for that tree, and when I do a basic plot of each side-by-side, I get the following discrepancy in measurements:

image

This is possibly due to SCBI-specific discrepancy since our "ORG_DBH" was taken in 2008 but we didn't get dendrobands until 2010 (and new dbh measurements were not taken at that time). That being said, I tried your code again with another tree, and I found the same.

This time, I noticed that in my original ORG_DBH column from the raw data, the dbh measurement does change when we put in the 2013 census dbh (hence the jump in my graph from 2013-2014). Yet when I looked at ORG_DBH from the output df "Dendro.complete," I noticed this new dbh measurement was replaced by the smaller measurement until much later in the timeseries when NEW_BAND =1. Our problem at SCBI is that often from previous years we have NEW_BAND=1 when there is no new ORG_DBH.

4. plotting

  1. I'm confused by both the plot.dendro.ts and plot.dendro.tree functions in that, according to the help file, I don't actually write "plot.dendro.ts" but instead say "plot" (this might just be my unfamiliarity with R). I did try running the function, but I kept getting errors.

    > plot(test_intra) #test_intra is my ts.data argument
    Error in plot.new() : figure margins too large
    > plot(Dendro.split)
    Error in xy.coords(x, y, xlabel, ylabel, log) : 
    'x' is a list, but does not have components 'x' and 'y'
    > plot(Dendro.complete)
    Error in plot.new() : figure margins too large
  2. I guess my main question for here is, how did you plot the images you showed us in the email we looked at last week? Did you specify "params" as something other than NULL?
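Some background that may explain the help-file wording: in R, a function named `generic.class` is treated as an S3 method, so `plot.dendro.ts` is documented as `plot(...)` and is only reached when the object carries the matching class. A plain data frame goes to the data frame method of `plot()`, and a plain list falls through to the default method, which would be consistent with the two errors above. The toy generic below (all names hypothetical) shows the dispatch rule without drawing anything.

```r
# A toy S3 generic with a default method and a class-specific method.
describe <- function(x, ...) UseMethod("describe")
describe.default   <- function(x, ...) "default method"
describe.dendro_ts <- function(x, ...) "dendro_ts method"

plain  <- data.frame(DOY = 1:3, DBH = c(30, 30.1, 30.2))
tagged <- structure(plain, class = c("dendro_ts", "data.frame"))

describe(plain)    # no "dendro_ts" class, so the default method runs
describe(tagged)   # class matches, so describe.dendro_ts runs
```

Whether the package's data objects are meant to carry such a class is not stated in this thread, so treat this only as an explanation of why `plot(test_intra)` did not reach the package's plotting code.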

seanmcm commented 5 years ago

Oops! Units in mm vs cm! I added a units argument (that defaults to cm) in the get.optimized.dendro() function and the gap2dbh() function. Updated help files for that. Please check and see if this fixed it.
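As a generic illustration of a units argument with a cm default, here is a small sketch. The helper `to_cm` is hypothetical and is not taken from the package; which measurements the real `units` argument applies to is not spelled out in this thread.

```r
# Hypothetical helper: express a measurement in cm given its stated units.
to_cm <- function(x, units = c("cm", "mm")) {
  units <- match.arg(units)          # first choice ("cm") is the default
  switch(units, cm = x, mm = x / 10)
}

to_cm(6.5)                 # already in cm, returned unchanged
to_cm(6.5, units = "mm")   # converted from mm to cm
```

`match.arg()` rejects any value outside the listed choices, which catches typos such as `units = "m"` early.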

mcgregorian1 commented 5 years ago

Hmm. That seemed to do the trick for the first tree: image

And mostly for the second one! image

To be extra robust, I tried testing for a tree we have that not only had multiple NA measurements but also had a band changed at a couple points. Our graph looks like the one below. However, I end up getting an error I'm not sure how to fix:

> get.optimized.dendro(test_intra, OUTPUT.folder = "C:/Users/mcgregori/Dropbox (Smithsonian)/Github_Ian/Dendrobands/results/McMahon_code_output")
  |===============================                                                              |  33%Error in optimize(.difdendro, interval = c(0, dbh1 + 2 * gw2), gw2 = gw2,  : 
  invalid 'xmin' value

image
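One way an `optimize()` call can fail like this is when a bound of the search interval is not finite, for example because an NA measurement has propagated into it. The sketch below reproduces that situation with made-up numbers; it is an assumption that this is what happened here, although it is consistent with the NA measurements mentioned above. The objective function and values are purely illustrative.

```r
gap_width <- c(4.2, NA, 5.1)        # hypothetical gap measurements
upper <- 30 + 2 * gap_width[2]      # the NA propagates into the bound

# optimize() needs finite interval endpoints; capture the error message
# instead of stopping.
result <- tryCatch(
  optimize(function(d) (d - 30)^2, interval = c(0, upper)),
  error = function(e) conditionMessage(e)
)
print(result)
```

Checking `anyNA()` on the gap widths before optimizing is a cheap way to rule this cause in or out.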

seanmcm commented 5 years ago

Ok. In field. Will look at this afternoon

(Emailed reply to Ian McGregor's comment of Feb 27, 2019, 11:49.)

mcgregorian1 commented 5 years ago

Hi @seanmcm

Can you send me the updated vignette as an html file please? I can see your updated script on github but I don't have access to the repo

seanmcm commented 5 years ago

Should be an html in repo online

(Emailed reply to Ian McGregor's comment of Mar 8, 2019, 14:16.)

mcgregorian1 commented 5 years ago

Yep, my bad I got it

mcgregorian1 commented 5 years ago

Hi @seanmcm

It seems I'm getting a different error in running get.optimized.dendro. I have the necessary column names but I'm getting the following message:

> get.optimized.dendro(test_intra, OUTPUT.folder = "C:/Users/mcgregori/Dropbox (Smithsonian)/Github_Ian/Dendrobands/results/McMahon_code_output")
  |============================================================================| 100%
Error in optimize(.difdendro, interval = c(0, dbh1 + 2 * gw2), gw2 = gw2,  : 
  invalid 'xmin' value

I tried a different tree (with full data, no breaks) and I keep getting the following error. I tried changing the columns to be exactly the same in the same order, made sure that nothing was accidentally a string, and even tried removing two columns from INPUT.data to see if that affected it (it still ran). For some reason, though, my data won't run.

> get.optimized.dendro(test_intra, units="cm", OUTPUT.folder = "C:/Users/mcgregori/Dropbox (Smithsonian)/Github_Ian/Dendrobands/results/McMahon_code_output")
  |=======================================================================================| 100%
Error in ind.data.band[[b]] : subscript out of bounds

seanmcm commented 5 years ago

Does the sample data from the vignette work? If so can you send me those two trees so that I can debug. Thanks much Sean

(Emailed reply to Ian McGregor's comment of Mar 8, 2019, 15:31.)

mcgregorian1 commented 5 years ago

Yep the sample data works.

Here are the files: tree_7444 and tree_10045

seanmcm commented 5 years ago

Hi Ian... success, I think. Note I recompiled the Vignette, and also purled the R from it, so you can just run that. There was an issue where I assumed that a new band overlapped with the previous band in a year, because that was the case in the sample tree I used to design the band handoff!

Also, however, I identified two requirements for running the optimize code: 1) BAND_NUM must be an integer sequence from 1 to no.bands (i.e., 1, 1, 1, ..., 1, 2, 2, ..., 2, 2, 3, ...). 2) No NAs in the GAP_WIDTH.

I included three lines of code in the Vignette and the R file purled from it that fix these before sending the data for optimization. Please check that this works.

Do you think I should put them in the optimize code, behind the scenes?

Or, should I write some basic 'check' code to id these problems?

Thanks again for all your help. Please check that this is working now and get back to me.
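For readers without the vignette at hand, the two requirements can be met with something like the sketch below. This is not the vignette's actual code; the toy data frame and its values are made up, and it assumes a single tree with rows already in time order.

```r
# Toy data: band IDs are arbitrary labels and one gap width is missing.
dendro <- data.frame(
  TREE_ID   = 7444,
  BAND_NUM  = c(12, 12, 12, 57, 57),
  GAP_WIDTH = c(3.1, NA, 3.6, 1.0, 1.4)
)

# Requirement 2: drop rows without a gap measurement.
dendro <- dendro[!is.na(dendro$GAP_WIDTH), ]

# Requirement 1: re-index bands as 1, 2, ... in order of first appearance.
dendro$BAND_NUM <- match(dendro$BAND_NUM, unique(dendro$BAND_NUM))
dendro
```

With several trees in one data frame, the re-indexing would need to be done within each TREE_ID rather than across the whole table.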

mcgregorian1 commented 5 years ago

Hi Sean,

Code works for both trees I linked above! I see what you did and that makes sense. The only thing I did to preserve our own data was to have the band.index be done on our dendroID column (unique identifiers for the bands) and then use that index when creating the BAND_NUM column.

As for where to put those lines of code...I'm not sure. The way my mind works, I'd rather be able to see this separately and fix it if I need to (like I did above where I specify the column to take a band index from). Or, at least, be able to read a full description of what's happened (as you've explained it) if the code is run automatically.

The check code may help in that regard, but having the fix code ready (e.g. your band.index and subset by complete.cases) is very helpful because then people can just instantly plug it in and run. I have seen other packages where the error message is like "Did you mean to include these numbers?" and then I'm left trying to figure out how to fix this on my own.

All that being said, I think including the code as you have it in the vignette is good, though you may inevitably get messages from people asking why this isn't automatic.
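A check function along the lines discussed above might look like the sketch below: it reports each problem together with a suggested fix, rather than silently changing the data. The name `check_dendro_input` is hypothetical, the column list includes SITE because of the earlier error in this thread, and the four STATUS columns are left out because their names are not given here. It assumes the data frame holds a single tree.

```r
# Hypothetical checker: returns a character vector of problems (empty if none).
check_dendro_input <- function(df) {
  problems <- character(0)
  needed <- c("SITE", "TREE_ID", "BAND_NUM", "YEAR", "ORG_DBH", "GAP_WIDTH")
  missing_cols <- setdiff(needed, names(df))
  if (length(missing_cols) > 0) {
    problems <- c(problems,
                  paste("Missing columns:", paste(missing_cols, collapse = ", ")))
  }
  if ("GAP_WIDTH" %in% names(df) && anyNA(df$GAP_WIDTH)) {
    problems <- c(problems,
                  "GAP_WIDTH has NAs; try df[!is.na(df$GAP_WIDTH), ]")
  }
  if ("BAND_NUM" %in% names(df)) {
    expected <- match(df$BAND_NUM, unique(df$BAND_NUM))
    if (anyNA(df$BAND_NUM) || !all(df$BAND_NUM == expected)) {
      problems <- c(problems,
                    "BAND_NUM is not 1, 2, ...; try match(x, unique(x))")
    }
  }
  problems
}

bad <- data.frame(TREE_ID = 1, BAND_NUM = c(12, 12, 57), YEAR = 2015,
                  ORG_DBH = 30, GAP_WIDTH = c(3, NA, 1))
good <- data.frame(SITE = "SCBI", TREE_ID = 1, BAND_NUM = c(1, 1, 2),
                   YEAR = 2015, ORG_DBH = 30, GAP_WIDTH = c(3, 3.5, 1))

check_dendro_input(bad)
check_dendro_input(good)
```

Returning the messages instead of calling `stop()` lets the user see every problem at once and decide which fix to apply.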

Other questions for code

  1. I tried running the make.dendro.plot.ts code below (using output from Tree_7444) and I received an error. Upon checking what was Dendro.split[[3]], I noticed that Dendro.split is a list of 7, and the [[3]] was NULL; it appears it coerced the years 2015 and 2016 to NULL, but unsure why.
    make.dendro.plot.ts(ts.data = Dendro.split[[3]], params = param.table[3, ], day = seq(365))
    Error in plot.window(...) : need finite 'ylim' values
    • this is interesting because I ran the same code with the output for Tree_10045. In this case, Dendro.split is a list of 9, and [[3]] is a full dataframe for the correct year. [[1]] is NULL here, because for our 2010 data we only had one measurement. Regardless, Tree_10045 gives me a graph image
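Until the NULL entries are explained, one way to avoid the plotting error is to skip them while keeping the original positions, so that each position can still be paired with the same row of param.table. The sketch below uses a toy list in place of Dendro.split; it assumes, as the calls above suggest, that list positions and param.table rows line up.

```r
# Toy stand-in for Dendro.split: one entry per tree-year, some NULL.
split_list <- list(
  data.frame(DOY = c(120, 200), DBH = c(30.0, 30.2)),
  NULL,
  data.frame(DOY = c(110, 210), DBH = c(30.2, 30.5))
)

# Keep the original positions so they can still index rows of param.table.
usable <- which(!vapply(split_list, is.null, logical(1)))
for (i in usable) {
  message("would plot entry ", i, " with ", nrow(split_list[[i]]), " rows")
}
```

Dropping the NULL entries outright (for example with `Filter()`) would shift the positions and break that pairing, which is why the indices are kept instead.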

most helpful

  1. I also tried your other graphing and it worked, but something was off with the curves for Tree_7444 (code below and graph). However, Tree_10045 worked out well (you can definitely see in this graph how the measurements decline from the fall to the spring, like we talked about before).
    make.dendro.plot.tree(Dendro.ind = Dendro.tree[[1]], param.tab = subset(param.table, TREE_ID == Dendro.tree[[1]]$TREE_ID[1]))

    image image

Other things

  1. In the vignette when you describe the get.optimized.dendro function and its arguments, you leave out the "units=cm" argument (but rest assured it does appear in the help file in R).

seanmcm commented 5 years ago

First, thanks so much. I'm going to close this, but please open a new issue next ... well, issue, as your input is very important, but I'm trying to partition the progress of this a bit.

These comments are so helpful, from context to details.

There is a bug in make.dendro.plot.tree(): when there are insufficient data or negative growth to fit a curve, the curve of the previous year is drawn instead. What I'm pleased with is that the bug does not affect the continued plotting of the actual (estimated) diameter values. I will remove those extraneous fits.

I need to work on help files in a major way. No time now, but soon will do. I essentially made the minimum legitimate package structure (except for the Vignette). Going through and making examples, giving details, and output will be a nice addition, which I see doing in about a month (grant writing needs will not make that a good use of time). We can revisit later.

Finally, I would like to make this a broader package for dendrometer bands by including ways to handle: 1) dendrometer bands with few measurements (i.e., biannual growth estimates); 2) tropical data that moves with different periodicities and may need splines at times and not the LG5 function; 3) automated dendrometer data; 4) uncertainty in measurements, errors, spring changes, etc.; 5) incorporating climate data into the package.

mcgregorian1 commented 5 years ago

Ok! Glad to help. I'll look out for an email from you when you next have updates