Quick review of functionality - time vs. space-scaling input

stschiff commented 4 years ago

So, having reviewed this, it looks all very nice! The one thing that I would like to briefly suggest a possible improvement for is how the tesselate function interprets its input. Currently, since you're pushing this directly to voro++ you don't assume any different units in the input. You implicitly assume that the user provides three dimensional data with similar units. That makes sense if you perform tessellation agnostic of specific applications. However, your cut function, the README Quickstart-example, the vignette and the paper are clearly geared towards spatio-temporal analysis. But in that case, the user needs to make a critical decision on how to compare space- with time-units. You happen to use 1km=1year in both the paper and the README example, but I think it would be worth considering requiring this to be more explicitly set by the user.

At the minimum, I think the documentation should make that implicit choice of 1-1 mapping explicit, and that's what I have tried in my edits in #2.

One idea I had was to give an additional input parameter for tesselate, which could be unit_scaling or so, and should be a vector of conversion factors. For example, in the case of the Quickstart example, because you have spatial coordinates in meters and time in years, you might want to use unit_scaling=c(0.001, 0.001, 1) as parameter, guiding your function to first multiply x and y by 1/1000 before comparing them to the z-axis. Alternatively you could have set unit_scaling=c(1,1,1000). You can even make it terser and say time_per_length=0.001 which means that here the time axis (z) is given in units of 1/1000 of the units of the length scales, or so.

I don't want to make it too complicated, and leaving things as they are is fine, as long as you make it more explicit in the docs. Generally I think explicit is better than implicit in user interfaces.

stschiff commented 4 years ago

Thinking about this more, I think one argument for allowing an additional scaling parameter input is that you may not be sure about the best scaling and may want to try different correspondence factors. Say, 1km=2years, or so. In the current implementation, the only way to achieve that is by dividing years by 2.

A complication of my suggestion of additional parameter input is that this will also affect output functions in subtle ways. For example, I guess the cut function would be sensitive to any scaling you do inside tesselate. So perhaps this is too much of a change and it's better to leave things as they are. But in any case, it's worth having a bit more explicit documentation about this, I think. I just submitted another commit to improve on the tesselate help, probably not sufficient or clear enough yet.

nevrome commented 4 years ago

I added an option for automatic rescaling in tessellate and we added documentation of this crucial issue at multiple points (function documentation, README, JOSS Paper).

nevrome / bleiglas

Quick review of functionality - time vs. space-scaling input #3