I run (and often re-run) Distance on data A LOT. So for me the detailed output, while useful for understanding one analysis, slows down comparisons across years. I am working with data from 2-4 parks per year, where we estimate density trends using spotlighting. We use the same methods on the same routes every 2-3 years. I am working my way back through 20 years of this data. I find errors or decide to look more closely at truncation or binning, etc., and I invariably need to run the model again. Then I have to extract the same 20 numbers that allow me to look at density estimates, abundance, and group size across years.
This year, I finally broke down and wrote a function to extract these numbers. As you will see, if you are interested enough to read my clunky function, the values I need are not easy to find in the output. This is one of the things that kept me from writing the function sooner. Don't get me wrong, the Distance package is great. It does a ton of work and has really useful data in its output, serving a lot of different needs. However, writing the function made me think that there may be demand out there for something that pulls out the kind of data I have been gathering.
I thought I would ask people whether they could use something like this.
How can this be done? There are already summary and print methods and a summarize_ds_models function that do some of this, but these are mostly formatted text outputs. I think that currently, one way to do this would be to create broom::tidy and glance methods for R Distance. If you are not familiar with them, tidy methods grab the parameters that are commonly recorded or reported for an analysis. glance methods return a one-row summary of the critical outputs, perhaps ones that tell you whether your model fits or not, so for us, the GOF p-value and maybe the CV of the D, A, S, and p estimates. broom also typically implements an augment method to add per-observation columns (such as fitted values) to the original data table. I cannot see a use for this, but maybe someone else can.
What I need is "tidy" output with one row for the overall model result for each year and one row for each route within a year. My function does what I need for now.
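To make the tidy/glance idea concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the `mock_dsfit` class, its fields (`year`, `estimates`, `gof_p`), and the numbers are made up and do not reflect how a real Distance fit is laid out. The `tidy` and `glance` generics are defined locally so the sketch runs without broom installed. A real implementation would pull the same values out of the actual model object instead.

```r
# Hypothetical stand-in for one fitted model. A real Distance fit is
# structured differently; these fields exist only for this sketch.
mock_fit <- function(year, routes, D, cv, gof_p) {
  structure(
    list(
      year = year,
      estimates = data.frame(route = routes, D = D, cv = cv,
                             stringsAsFactors = FALSE),
      gof_p = gof_p
    ),
    class = "mock_dsfit"
  )
}

# Local generics so the example does not depend on broom.
tidy <- function(x, ...) UseMethod("tidy")
glance <- function(x, ...) UseMethod("glance")

# tidy: one row per route (plus the "Total" row) within a year.
tidy.mock_dsfit <- function(x, ...) {
  data.frame(year = x$year, x$estimates, stringsAsFactors = FALSE)
}

# glance: one row per model, with overall density, its CV, and the GOF p-value.
glance.mock_dsfit <- function(x, ...) {
  total <- x$estimates[x$estimates$route == "Total", ]
  data.frame(year = x$year, D = total$D, cv_D = total$cv, gof_p = x$gof_p)
}

# Made-up values for two survey years of the same two routes.
fits <- list(
  mock_fit(2018, c("R1", "R2", "Total"), c(1.2, 0.8, 1.0), c(0.30, 0.35, 0.22), 0.41),
  mock_fit(2021, c("R1", "R2", "Total"), c(1.5, 0.9, 1.2), c(0.28, 0.33, 0.20), 0.67)
)

# Stack the per-model results into two tables for cross-year comparison.
by_route <- do.call(rbind, lapply(fits, tidy))
by_year <- do.call(rbind, lapply(fits, glance))
```

With output shaped like this, re-running a model after changing truncation or binning only means rebuilding `fits`; the two comparison tables follow from the same two lines at the end.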
Let me know what you think about all this. If there is interest, maybe we can turn this into a set of "tidy" summaries.
From Pat Lorch to the list on 11 Apr