USFS-PNW / Fia_Biosum_Scripts

Contains scripts to support FIA Biosum application including R and KCP scripts

Fix ChipTrees equation to have correct conversion from percent to TPA/volume and fix other chipping equations #5

Open carlinstarrs opened 6 years ago

carlinstarrs commented 6 years ago

Note this issue is a combination of two separate issues related to chipping:

A. Calculation of the chipTrees variable
B. Chip equations in opcost_ref used to calculate machine hours per acre for chipping

https://github.com/USFS-PNW/Fia_Biosum_Scripts/blob/b06df0618945a1357088f209a8a9a7221b2f8581/OPCOST/Opcost_10_0.R#L54

carlinstarrs commented 6 years ago

chipTrees is currently calculated by taking ChipPct_Cat values in percent and summing them with Chip.tree.per.acre values. The ChipPct_Cat values need to be converted from percent to TPA before they can be summed.
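A minimal sketch of the difference, assuming that "converted from percent to TPA" means scaling each size class's trees per acre by its chip percent. That reading is an assumption, the input values are made up, and only the column names come from the OpCost inputs quoted later in this thread.

```r
# Hypothetical inputs for two stands; values are illustrative only.
m <- data.frame(
  Chip.tree.per.acre               = c(55, 120),
  Small.log.trees.per.acre         = c(80, 40),
  Small.log.trees.ChipPct_Cat1_3   = c(45, 50),   # percent
  Large.log.trees.per.acre         = c(20, 10),
  Large.log.trees.ChipPct_Cat1_3_4 = c(30, 40)    # percent
)

# Pattern described in this issue: percents summed directly with TPA,
# so the result has no consistent unit.
chip_trees_mixed <- with(m,
  Small.log.trees.ChipPct_Cat1_3 +
  Large.log.trees.ChipPct_Cat1_3_4 +
  Chip.tree.per.acre)

# One possible fix (an assumption, not the agreed resolution):
# convert each percent to a TPA-equivalent before summing.
chip_trees_tpa <- with(m,
  Small.log.trees.per.acre * Small.log.trees.ChipPct_Cat1_3 / 100 +
  Large.log.trees.per.acre * Large.log.trees.ChipPct_Cat1_3_4 / 100 +
  Chip.tree.per.acre)

print(data.frame(chip_trees_mixed, chip_trees_tpa))
```

Whether a TPA-equivalent of partially chipped trees is meaningful for diameter-based equations is a separate question.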

jsfried commented 6 years ago

Below is an exchange between Carlin and Jeremy, captured here now.

Carlin said:

result is the result of the equation from opcost_ref. So for hansillChip, the trail would be as follows:

2.32 + (-0.42 * 1.79) + (1.83 * dbh.cm) = {result} in seconds/chip

opcost_units translates this to:

(chipTrees * {result})/60/60, aka (chipTrees * (2.32 + (-0.42 * 1.79) + (1.83 * dbh.cm)))/60/60

Here's exactly what R comes up with for hansillChip (in case I got my parenthesis wrong above):

expression(with(data.treated,(chipTrees * (2.32 + (-0.42 * 1.79)+(1.83 * dbh.cm)))/60/60))

The expression(with(data.treated, part just points R to the original opcost_input dataset limited to stands that will be treated.

The column is called TPA_Conversion because that's what the OpCost coder called the section where these equations were (line 457 of 9_1.R, I believe). It needs to be changed (See issue https://github.com/USFS-PNW/Fia_Biosum_Scripts/issues/21)

The dbh.cm parameter is calculated from the twitchVol, which is the average ft3/tree for all size classes converted to centimeters:

2.54 * (sqrt((m$twitchVol + 8.4166)/.2679))
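To make the chain from twitchVol to machine hours concrete, here is a small sketch using the expressions quoted above. The function names (`dbh_cm`, `hansill_chip_sec`, `hansill_chip_hours`) and the input values are hypothetical; only the formulas themselves come from this thread.

```r
# Hypothetical stand-level inputs; values are illustrative, not project data.
m <- data.frame(
  twitchVol = c(5, 20, 60),   # average ft3 per tree across all size classes
  chipTrees = c(40, 55, 30)   # chip trees per acre (assumed already in TPA)
)

# Average diameter in cm derived from average tree volume, as quoted above.
dbh_cm <- function(m) 2.54 * sqrt((m$twitchVol + 8.4166) / 0.2679)

# hansillChip as quoted: the result is treated as seconds per chip tree.
hansill_chip_sec <- function(m) 2.32 + (-0.42 * 1.79) + (1.83 * dbh_cm(m))

# Scale by trees per acre, then seconds -> minutes -> hours.
hansill_chip_hours <- function(m) (m$chipTrees * hansill_chip_sec(m)) / 60 / 60

m$dbh.cm <- dbh_cm(m)
m$hansillChipHours <- hansill_chip_hours(m)
print(m)
```

Because dbh.cm is derived from the average volume of all cut trees, every stand gets a single diameter regardless of what is actually chipped.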

We can easily turn off equations even now by just not including them in that "Analysis" column. I have an issue for changing the functionality of that to a clearer on-off switch as we discussed but I put it at medium priority since it's not technically needed to run the tethered systems analysis.

It may be best to put more of this kind of dialogue in the relevant issue in the future, if possible, since that way we can easily see going back the logic of how these things were addressed.

WHICH WAS A REPLY TO:

On Mon, Apr 9, 2018 at 4:46 PM, Fried, Jeremy - FS jsfried@fs.fed.us wrote:

Chipping is complicated, but important to account for. Lacking the hansill publication is problematic as I can’t read it to gain any insights. Thinking about what is going on, we would want to account for whole tree chipping of chip trees. The Hansill eqn seems to be set up for that, using average diameter (of ALL non-brush cut trees) and calculating seconds per tree. BUT NOT ALL TREES ARE CHIPPED! So the DBH calculated off of twitchVol is going to be a big overestimate of chipping costs. Large trees generate only tops and limbs. All are smaller than the diameter of the tree they come from. But there are many of them. Perhaps this has something to do with all the fudge factors in the chipping section. My preliminary hunch is that we are going to be best off using only the Bolding eqn, which keys off of volume independent of diameter. We need to make sure that the volume sent to that equation is the sum of

Chip tree TPA * Chip Tree Avg Vol and
Small log TPA * Small log Avg Vol * the harvest system appropriate Small.log.trees.ChipPctCatXXX and
Large log TPA * Large log Avg Vol * the harvest system appropriate Large.log.trees.ChipPctCatXXX

I posted the harvest_methods table in the OpCost Github space and it contains the wrong harvest category for tethered—it should be category 1.

Can you make it so (implement Bolding only)? Anything that makes use of a dbh dependent equation is going to be a big fudge. At least for the CEC analysis, this is the way to go. Longer term we may peruse the lit for other studies and insights about chipping cost that are potentially more refined, but for now, let’s just use the volume based Bolding eqn. Sorry to have not put this in GitHub, but you can paste it in there.

jsfried commented 6 years ago

Thanks for the explanation. Yes, being able to turn off all but the Bolding chipping equation for now will be a big plus. If one is going to use dbh based equations for chipping cost, it is important to understand what is being chipped--something that differs by stand and harvest method. With log length systems (anything other than whole tree), diameter based equations are likely useful because only boles are getting chipped (albeit having been first separated into logs) and these are either trees too small to make into merchantable logs (so have at most one "log" to chip) or they are larger trees of noncommercial species such as evergreen hardwoods, which can be quite large. You will see in the input data that the Chip percent variables for log length systems tend to be zero or really small most of the time (since most stands in the regions we have been working on don't have much if any hardwood harvest). Where whole tree harvest occurs, the chip percents are much greater because of course all the tops and limbs are being brought to the landing (attached to their boles) and chipped (after delimbing), along with the too small trees and hardwoods. Any pretense at understanding the diameter distribution of these materials seems to me to be just that-- pretense-- unless there is some literature accessible to us that proves otherwise.

jsfried commented 6 years ago

At the risk of burdening one thread with too many thoughts (well, I've already done that), here's an attempt to distill the questions. Accounting for chipping cost is pretty important as it is at the nexus of decisions about what harvest system is most economical and the tradeoffs between treatment cost (and recovery of energy wood and its carbon implications) and amounts of wood left in the forest to decay (and emit C and potentially elevate fuel loading significantly, perhaps leading to more intense fires with even greater C emissions). So, it is important to develop documented and defensible logic around the algorithms and assumptions concerning this issue.

  1. The code Carlin points to at the start of this thread is almost certainly incorrect as it is not logical, but before we can devise a fix, we need to better understand how it is intended to be used given the complexity of different chipping arrangements (e.g., whole tree vs log length systems delivering different materials to be chipped at the landing) and the way diameter is calculated (from all twitch volume from all size classes of trees). This will likely require a call to dissect and resolve.
  2. The number of chipping equations supported in OpCost is small- two of the three are by the same author and only slightly different but hugely different when considering the conversion to time per hour, possibly due to a coding error. Getting to a correct Hansill formulation will be helpful, but we will even then have only two equations.
  3. FRCS had several chipping equations; none of those appear to be used in OpCost. Why? FRCS had formulations for log length (using Morbark chippers) and whole tree chipping (using flail chippers) which could prove helpful to understanding how to account better for chipping cost. Did Rob or the OpCost coder invest the time to understand that logic and would it be useful to do so? Why were the Hansill and Bolding equations selected over those in FRCS? Bolding is circa 2005 and I don't know Hansill's vintage because we have no reference for it.
  4. In the interim, we have only a single volume dependent equation (Bolding) that we may be able to use reliably for all kinds of harvests. I would be far more comfortable if there were more than one, though.

jsfried commented 6 years ago

This is from an email response to Rob, recorded here to promote transparency. I have some questions as to how to interpret the Harrill article (https://www.hindawi.com/journals/ijfr/2012/893079/) in relation to the representation of the model(s) in the OpCost code:

Chip  | 2.32 + (-0.42 * 1.79) + (1.83 * dbh.cm) | sec/chip | dbh.cm
Chip2 | 2.4 + (-0.32 * 1.4) + (1.3 * dbh.cm)    | min/chip | dbh.cm

[image: harrill]

The article describes “Elemental time-motion data were recorded by a stop watch for each machine’s cycle used in the harvesting system [12].” I am unclear as to what is meant by a machine’s cycle. And it is therefore not obvious to me how the regression model equation should be used. What is meant by number of trees? Is that trees per acre or trees processed per machine cycle (whatever that is)?

The range for number of trees is 1-6. Is that per acre? If so, it seems rather small. Or is it per minute?

Evidently, some assumptions are made in the equation, but they are not transparent to me.

The equation labeled Chip above (and hansillChip below), as coded in OpCost, is converted to hours of machine time by dividing by 60 twice:

ifelse(dbh.cm(m)<76, ((chipTrees(m)*hansillChip(m))/60/60), NA)

yet the table caption above indicates that the times are in minutes (which suggests it should be divided by 60 only once). The equation labeled chip2 above (and hansillChip2 below) divides by 60 once:

ifelse(dbh.cm(m)<76, ((chipTrees(m)*hansillChip2(m))/60), NA)

but later in the code, as the average chip time is calculated, hansillChip2 is raised to the 0.8 power:

chipDF<-function(m){
  data.frame(
    hansillChipTime(m),
    hansillChip2Time(m)^.8,
    boldingChipTime(m))
}

so it is hard to know what the intent was. Moreover, both hansillChip equations seem to feed the calculated average. On a sample dataset, they give very different estimates (see attached). Does this suggest to you that perhaps the table caption in the publication was incorrect (that the units should be seconds, not minutes)? Or that the problem lies in the other component of chip calculations, chipTrees(m), given that this is apparently calculated as:

The mean percent of volume in small log trees that needs to be chipped (which would be 100% for non-commercial species and something on the order of 30-40% for conifers, representing the tops and limbs under a whole tree system, so might average to something like 45-50)
Plus
The mean percent of volume in large log trees that needs to be chipped (which would be 100% for non-commercial species and something on the order of 0-40% for conifers, depending on exact system and slope, representing the tops and limbs under a whole tree system that might or might not bring in limbs of manually felled trees, so might average to something like 30-40)
Plus
The chip trees per acre.

The number will of course differ by system because the percents differ by system, but percents should never be added to tpa, we think. In our data, we average (over 1400 stands) 55 tpa of chip trees (if we exclude the 15 highest outliers which exceed 300 tpa removed).

It is not clear to me how to judge what assumptions and equations are most defensible here.

I will be adding this response to the Github space to record the conversation, and you can respond there as soon as you make an account and I add you as a team member.

This chipping stuff is confusing but important.

jsfried commented 6 years ago

Meant to paste this eqn compare graph that Carlin made to the previous post:

[image: chip]

carlinstarrs commented 6 years ago

@jsfried Just an FYI, I updated this issue a bit to reflect the two separate chipping issues instead of splitting it into two separate issues. Both will need to be resolved simultaneously since chipTrees is used for unit conversion for the machine equations.

RobKeefe commented 6 years ago

I assume that in the earlier code dividing by 60 twice served to scale from seconds to minutes, and then to hours, but there may have been a mistake made in the other instance. I glanced at the publication and don't have any reason to believe that the table was labeled incorrectly. So, more likely the developer at the time made a mistake when editing. My recollection is that the chipping piece has been revised numerous times over the many OpCost iterations. For a useful strategy moving forward, I recommend using the Bolding equation if you feel confident in that, and then adding others only if they make sense to you. It's most important that you two understand and feel confident in the details (hence the remodeling! :) )

Jeremy wrote: "I am unclear as to what is meant by a machine’s cycle. And it is therefore not obvious to me how the regression model equation should be used. What is meant by number of trees? Is that trees per acre or trees processed per machine cycle (whatever that is)?"

Many of the regression equations you will encounter on this journey predict the productive cycle element time for different kinds of logging equipment. The most commonly used research method in forest operations / forest engineering historically is the time-and-motion study (aka elemental time analysis), similar to the one described in this paper. The productive cycle is a series of elements that make up one unit of (non-delay) work. For example, for a grapple skidder, the productive cycle might include driving to the woods unloaded, backing up to a bunched pile of trees, closing the grapple on that bunch, driving back to the landing loaded, and releasing the trees at the landing. Each of those steps is an element of the cycle. We typically have timed those elements (and the complete cycle) using stopwatches (or used video, GPS, etc.). We then can say something about which elements take the most time. If we measure some characteristics of the wood being harvested (e.g. stem diameter, etc.) and the site (e.g. yarding distance), we can develop a regression equation to describe what factors (e.g. stem diameter, yarding distance, as independent variables) may affect the productive cycle time (the dependent variable). In this example, the productive cycle time is the total time for the skidder to drive out, get the wood, and come back (once). The number of trees in this example would be the number of trees per cycle. So, did the skidder have one large tree? Six small trees? Etc.

jsfried commented 6 years ago

Thanks for the explication of cycle time. That is helpful.

Please look again at the article table and the equation implementation in OpCost. The table does say minutes, but in the equation it is divided by 60 twice (as if it were seconds) and generates output that comports with Bolding. When the OpCost coder uses a slightly different equation (Chip2) for Hansill and divides by just 60 (consistent with the table caption of minutes), he produces output that is averaging 80 hours per acre of chipping-- just not passing the sniff test.
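A quick arithmetic sketch of why the divisor matters, using the two equations as quoted earlier in this thread. The 55 chip trees per acre is the average mentioned above; the 20 cm diameter is a hypothetical value chosen only for illustration.

```r
chip_trees <- 55   # chip trees per acre, the average reported in this thread
dbh_cm     <- 20   # hypothetical average diameter in cm

# Equation results before any unit conversion.
chip_eq  <- 2.32 + (-0.42 * 1.79) + (1.83 * dbh_cm)  # "Chip", treated as seconds
chip2_eq <- 2.4 + (-0.32 * 1.4) + (1.3 * dbh_cm)     # "Chip2", treated as minutes

# Conversions as coded in OpCost: /60/60 for Chip, /60 for Chip2.
hours_chip  <- chip_trees * chip_eq / 60 / 60
hours_chip2 <- chip_trees * chip2_eq / 60

ratio <- hours_chip2 / hours_chip
print(c(hours_chip = hours_chip, hours_chip2 = hours_chip2, ratio = ratio))
```

With these inputs the two coefficient sets differ only modestly, so nearly all of the gap (more than an order of magnitude) comes from dividing by 60 once instead of twice.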

jsfried commented 6 years ago

@carlinstarrs @RobKeefe Rob implies that we may want to go with the Bolding equation alone “if we are comfortable with it”. I have no basis for evaluating comfort; however, Bolding’s dependence on volume as the independent variable is attractive given the complexity of chipping arrangements and how these differ by harvest system. For example, some chip tops and limbs of some trees but not others (based on tree size and species), and these decision points are all folded into our pre-calculated (in BioSum) ChipPct categories (which differ by system and tree size). It could/would be mind-numbingly challenging to craft a system deploying diameter based equations for chipping when in some cases, whole trees are being chipped and in others, only limbs. Not a muck I am eager to descend into. As of now (for our interim, tethered system version), the inputs to Bolding are calculated in OpCost as:

m$ChipFeedstockWeight <-
  # chip tree portion
  (m$Chip.tree.per.acre * (m$Chip.trees.MerchAsPctOfTotal/100) * m$Chip.trees.average.volume.ft3. * m$CHIPS.Average.Density..lbs.ft3.) +
  # small log tree portion
  (m$Small.log.trees.per.acre * (m$Small.log.trees.ChipPct_Cat1_3/100) * m$Small.log.trees.average.volume.ft3. * m$Small.log.trees.average.density.lbs.ft3.) +
  # large log tree portion
  (m$Large.log.trees.per.acre * (m$Large.log.trees.ChipPct_Cat1_3_4/100) * m$Large.log.trees.average.vol.ft3. * m$Large.log.trees.average.density.lbs.ft3.)

This is correct for tethered systems and any other category 3 system (looking at you, Cable Manual Log and Helicopter Manual WT), but not for the others. What varies per system in terms of the weight of material to be chipped is:

1) whether the factor /100 should be applied to the chip tree volume (as it is above for tethered systems, and should be also for Ground-Based CTL, Ground-Based Manual Log, Cable CTL, Helicopter CTL, Helicopter Manual WT and Cable Manual Log) or a factor of 1 (when the entirety of chip trees get chipped, as in all other systems);
2) whether to use Small log trees ChipPct_Cat1_3, Small log trees ChipPct_Cat2_4 or Small log trees ChipPct_Cat5 as the factor in the small log tree portion of the above equation;
3) whether to use Large log trees ChipPct_Cat1_3_4, Large log trees ChipPct_Cat2 or Large log trees ChipPct_Cat5 as the factor in the large log tree portion of the above equation.

If we reference these three factors as CT_ChipWtFactor, SLT_ChipWtFactor and LLT_ChipWtFactor, we can populate them from the following table and recode this as follows:

m$ChipFeedstockWeight <-
  # chip tree portion
  (m$Chip.tree.per.acre * (CT_ChipWtFactor /100) * m$Chip.trees.average.volume.ft3. * m$CHIPS.Average.Density..lbs.ft3.) +
  # small log tree portion
  (m$Small.log.trees.per.acre * (SLT_ChipWtFactor /100) * m$Small.log.trees.average.volume.ft3. * m$Small.log.trees.average.density.lbs.ft3.) +
  # large log tree portion
  (m$Large.log.trees.per.acre * (LLT_ChipWtFactor /100) * m$Large.log.trees.average.vol.ft3. * m$Large.log.trees.average.density.lbs.ft3.)

[image: table_graphic]

[attachment: processor_permutations_20180411.xlsx]

I am also attaching this info as a supplement to the harvest_methods table (first tab in this workbook) to aid in automation. I think that OpCost may only have a harvest system name to work with—it would certainly be easier to program to harvest system MethodID. Perhaps we should add this table to the reference database?
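One way the lookup could be wired up, as a rough sketch only. The `chip_factors` table, the `chip_feedstock_weight` function and the "Example Whole Tree" row are hypothetical: the actual mapping lives in the attached workbook, which is not reproduced here. The "Cable Manual Log" row follows the category 3 description above, and the input values are made up.

```r
# Hypothetical lookup: which input column supplies each factor, per harvest system.
# The second row is invented purely to show the mechanism.
# NA in ct_col means the entirety of chip trees is chipped (factor of 100 percent).
chip_factors <- data.frame(
  system  = c("Cable Manual Log", "Example Whole Tree"),
  ct_col  = c("Chip.trees.MerchAsPctOfTotal", NA),
  slt_col = c("Small.log.trees.ChipPct_Cat1_3", "Small.log.trees.ChipPct_Cat2_4"),
  llt_col = c("Large.log.trees.ChipPct_Cat1_3_4", "Large.log.trees.ChipPct_Cat2"),
  stringsAsFactors = FALSE
)

chip_feedstock_weight <- function(m, system) {
  f <- chip_factors[chip_factors$system == system, ]
  stopifnot(nrow(f) == 1)  # fail loudly on unknown or duplicated systems
  CT_ChipWtFactor  <- if (is.na(f$ct_col)) 100 else m[[f$ct_col]]
  SLT_ChipWtFactor <- m[[f$slt_col]]
  LLT_ChipWtFactor <- m[[f$llt_col]]
  (m$Chip.tree.per.acre * (CT_ChipWtFactor / 100) *
     m$Chip.trees.average.volume.ft3. * m$CHIPS.Average.Density..lbs.ft3.) +
  (m$Small.log.trees.per.acre * (SLT_ChipWtFactor / 100) *
     m$Small.log.trees.average.volume.ft3. * m$Small.log.trees.average.density.lbs.ft3.) +
  (m$Large.log.trees.per.acre * (LLT_ChipWtFactor / 100) *
     m$Large.log.trees.average.vol.ft3. * m$Large.log.trees.average.density.lbs.ft3.)
}

# One made-up stand, with all densities set to 30 lbs/ft3 for easy arithmetic.
m <- data.frame(
  Chip.tree.per.acre = 50, Chip.trees.MerchAsPctOfTotal = 80,
  Chip.trees.average.volume.ft3. = 2, CHIPS.Average.Density..lbs.ft3. = 30,
  Small.log.trees.per.acre = 40, Small.log.trees.ChipPct_Cat1_3 = 10,
  Small.log.trees.ChipPct_Cat2_4 = 40, Small.log.trees.average.volume.ft3. = 10,
  Small.log.trees.average.density.lbs.ft3. = 30,
  Large.log.trees.per.acre = 10, Large.log.trees.ChipPct_Cat1_3_4 = 5,
  Large.log.trees.ChipPct_Cat2 = 30, Large.log.trees.average.vol.ft3. = 50,
  Large.log.trees.average.density.lbs.ft3. = 30
)

w_cml <- chip_feedstock_weight(m, "Cable Manual Log")
w_wt  <- chip_feedstock_weight(m, "Example Whole Tree")
print(c(cable_manual_log = w_cml, example_whole_tree = w_wt))
```

Keying the table on MethodID rather than system name, as suggested above, would only change the column used in the row filter.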

jsfried commented 6 years ago

@RobKeefe - Bolding implies a processing rate for the chipper of 20.24 tons per PMH. Are the per hour machine costs that were developed for OpCost in $ per PMH or SMH? If the latter, what adjustments are needed to get from the PMH calculated via the Bolding equation to SMH?
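For reference, the usual relationship is SMH = PMH / utilization, where utilization is the fraction of scheduled time that is productive. The sketch below applies that to the 20.24 tons per PMH rate. The utilization rate, feedstock weight and machine rate are hypothetical placeholders, and treating a ton as 2000 lbs is an assumption.

```r
bolding_tons_per_pmh <- 20.24  # processing rate quoted above
utilization          <- 0.75   # hypothetical PMH/SMH ratio, not an OpCost value

chip_feedstock_lbs <- 4350     # hypothetical ChipFeedstockWeight, lbs per acre
tons_per_acre      <- chip_feedstock_lbs / 2000   # assumes short tons

# Productive machine hours per acre implied by the Bolding rate.
pmh_per_acre <- tons_per_acre / bolding_tons_per_pmh

# Scheduled machine hours include delays, so SMH is always >= PMH.
smh_per_acre <- pmh_per_acre / utilization

# If machine rates are in $ per SMH, cost per acre follows directly.
cost_per_smh  <- 100           # hypothetical machine rate
cost_per_acre <- smh_per_acre * cost_per_smh

print(c(pmh = pmh_per_acre, smh = smh_per_acre, cost = cost_per_acre))
```

If the OpCost machine rates turn out to be per PMH, the utilization step drops out and cost per acre is simply PMH times the rate.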