ctmm-initiative / ctmmweb

Web app for analyzing animal tracking data, built upon ctmm R package
http://biology.umd.edu/movement.html
GNU General Public License v3.0

Choose the best unit for measurements of time, distance, etc. #2

Closed xhdong-umd closed 7 years ago

xhdong-umd commented 7 years ago

The data summary and plots should use the best unit, adjusted according to the data range.

chfleming commented 7 years ago
  1. Right now everything is in SI units internally.
  2. I throw these numbers into the internal unit() function with a dimension specification and it chooses the most parsimonious units and gives its labeling and scaling information.
  3. I then have unit.class() functions that can convert the units on all of my class objects. (This naming is bad: unit() should be pick_units() and unit.class() should be convert_units.class().)
  4. So before plotting or printing, I run my objects/numbers through these functions and then on the plot axes or pasted to the print I include the labeling character string from unit().
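As a rough illustration of steps 1, 2 and 4, here is a minimal base R sketch. It is not ctmm's internal unit() function: `pick_unit`, its candidate unit table, and the "largest unit whose converted value is still at least `thresh`" rule are all assumptions made for this example.

```r
# Hypothetical helper, NOT ctmm's internal unit(): pick a display unit for SI values.
pick_unit <- function(values, dimension = c("length", "time"), thresh = 1) {
  dimension <- match.arg(dimension)
  # Candidate units and their size in SI base units (metres or seconds), smallest first.
  candidates <- switch(dimension,
    length = c(meters = 1, kilometers = 1000),
    time   = c(seconds = 1, minutes = 60, hours = 3600, days = 86400)
  )
  typical <- max(abs(values), na.rm = TRUE)
  # Keep the largest unit in which the typical value is still >= thresh.
  usable <- which(typical / candidates >= thresh)
  idx <- if (length(usable) > 0) max(usable) else 1
  list(name = names(candidates)[idx], scale = unname(candidates[idx]))
}

distances_m <- c(120, 4500, 23000)      # stored internally in SI (metres)
u <- pick_unit(distances_m, "length")   # labeling and scaling information
display <- distances_m / u$scale        # convert only right before display
paste(format(display), u$name)
```

The values stay in SI until the last line; only the displayed copy is rescaled and labeled.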

Using unit structures seems much more cumbersome than what we need, because we will always be either (1) converting SI units to parsimonious units right before display or (2) sticking with SI units.

The only step that I think could be simplified would be to plot objects in their internal SI units and then do the unit conversion on the axes. This would trade the task of converting multiple plotted objects for one axis conversion. I could not figure out how to do this in base plot without disrupting the axis formatting, but it looks possible in ggplot2 with the scales package.

xhdong-umd commented 7 years ago

Based on my limited understanding so far, my ggplot2 plots are built from data frames rather than objects. If I change the units before plotting, I change the values in the data frame, and then I need to record the current unit for those values somewhere. If I cannot use a telemetry object in plotting, I may need to keep the unit label in a separate place, which makes me uncomfortable. It may still be possible to use an object that is a data.frame with some extra attributes in ggplot2; I'll look into it.

With scales it should be possible to change only the axis labels without changing the values, but I also need to change units in other places, such as the time range summary on the time subsetting page. If I change the internal values beforehand, I need to get the unit label from somewhere. If I use scales without changing the internal values, I need to convert the values again with a different function.

I don't think using units is ideal for my needs either. All I really want is a way to associate a value with a unit label, so that general methods can be used in plots and summaries. This can be done in an object with attributes. I could also add a unit-label column to the data frame, but that would add too much redundancy.
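The attribute idea could look roughly like the base R sketch below. The helper names `with_unit` and `format_with_unit` and the attribute names `unit_name` and `unit_scale` are hypothetical, chosen only for this illustration.

```r
# Hypothetical helpers: keep SI values and carry the display unit as attributes.
with_unit <- function(values, unit_name, unit_scale) {
  # unit_scale is the size of one display unit in SI units (e.g. 3600 s per hour).
  structure(values, unit_name = unit_name, unit_scale = unit_scale)
}

format_with_unit <- function(x, digits = 3) {
  # as.numeric() drops the attributes, leaving the plain SI values to convert.
  converted <- as.numeric(x) / attr(x, "unit_scale")
  paste(signif(converted, digits), attr(x, "unit_name"))
}

durations <- with_unit(c(5400, 86400), "hours", 3600)  # values stay in seconds
format_with_unit(durations)
# "1.5 hours" "24 hours"
```

One caveat of this design is that attributes on a plain vector are dropped by ordinary subsetting such as `durations[1]`, so the label would have to be reattached after filtering.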

I'm not working on the units problem right now. I will update here once I have put more thought and experiments into it.

xhdong-umd commented 7 years ago

After reviewing the units package and our needs in the app, I decided to just use existing methods. It turned out that scales for ggplot2 is easy to work with. I pick the best unit, then feed the formatting function to ggplot2, and the axes get the correct labels without changing the internal values.

```r
library(scales)  # provides unit_format()

# Pick the best display unit for `data` (SI values) of the given dimension.
by_best_unit <- function(data, dimension, thresh = 1, concise = FALSE) {
  test <- ctmm:::unit(data, dimension, thresh = thresh, concise = concise)
  # ctmm's scale divides SI values; the `scales` package multiplies, so pass the reciprocal.
  return(list(value = data / test$scale, unit = test$name, scale = 1 / test$scale))
}

# Build an axis label formatter for ggplot2 from a representative value.
format_best_unit <- function(test_value, dimension) {
  best_unit <- by_best_unit(test_value, dimension, concise = TRUE)
  unit_format(unit = best_unit$unit, scale = best_unit$scale, digits = 2)
}
```
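To make the "reversed" scale concrete, here is a small base R stand-in for the kind of label function the formatter produces. It is only a sketch: `make_unit_labeller` is a hypothetical name, and the real formatter comes from the scales package rather than this closure.

```r
# Hypothetical stand-in for a scales-style label function (base R only).
make_unit_labeller <- function(unit_name, multiplier, digits = 2) {
  # The returned function multiplies the axis breaks, as a scales formatter would.
  function(breaks) paste(round(breaks * multiplier, digits), unit_name)
}

si_per_unit <- 1000                     # a "divide by" scale: 1000 m per km
label_km <- make_unit_labeller("km", multiplier = 1 / si_per_unit)

breaks_m <- c(0, 2500, 10000)           # axis breaks stay in SI (metres)
label_km(breaks_m)
# "0 km" "2.5 km" "10 km"
```

The plotted data are never modified; only the text drawn at each break is converted, which is why the reciprocal of the divide-style scale is passed in.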

Then I used a similar method on the data summary table and got the result I wanted.

I think Justin wanted a switch to turn the unit conversion on and off. So do you want a checkbox in all plots, data summary tables, and anywhere else with unit conversion?

One thing to note is that raw values for time can be quite large; for example, a time range of years will be expressed in seconds.

chfleming commented 7 years ago

In the plots it is not important to have the option for raw units. I would not implement that feature.

In the model and home-range summaries/reports this option becomes important. Raw units make it easier for people working with many individuals (possibly of different species) to make comparisons.

xhdong-umd commented 7 years ago

I implemented unit conversion for distance and time in plots and all summaries, and added a switch between seconds and normalized units in the data summary table. I will add similar switches to summaries/reports when needed.