MoBiodiv / mobr

Tools for analyzing changes in diversity across scales
Other
23 stars 18 forks source link

improve make_mob_in() function when spatial coordinates not available #228

Closed dmcglinn closed 5 years ago

dmcglinn commented 5 years ago

If spatial coordinates are not explicitly specified the function make_mob_in looks for two columns in the plot_attr object called x and y as the spatial coordinates. This is not good practice and can lead to bugs easily. We need to code this so that no spatial coordinates can be provided.

caroliver commented 5 years ago
dmcglinn commented 5 years ago

This is awesome thanks @caroliver !

caroliver commented 5 years ago

@dmcglinn I am working on finishing up this task and was wondering if it would be better to initially set the coordinate parameter to true meaning that coordinates are provided, or false (they are not provided).

Also, coordinate names are currently statically set as 'x' and 'y'. Would it be helpful to allow a user to specify the names of their coordinate columns? A potential solution to this would be the addition of two more parameters, one for the 'x' column and one for the 'y' column. If so, what would be the best names for these parameters? (i.e.: a 'long' value parameter initially set to 'x' and a 'lat' parameter initially set to 'y' to inherently keep with the current naming convention)

Thanks!

dmcglinn commented 5 years ago

I do think allowing the user to specify the spatial column ids is the best practice. So the argument coord_names allows the user to specify their own column names so I don't think it is necessarily correct to call them static. Probably the best defensive programming solution here to keep bugs from popping up in the downstream analysis is just to set coord_names = NULL as the default argument. If its NULL then don't perform the aggregation test (i.e., set out$tests$agg = FALSE). If names are provided go and look for them. Here let's allow the user to specify either one or two column names for the coordinates. For example if they want to specify change through time they would only provide a single column name.

caroliver commented 5 years ago

@dmcglinn Ok I can get to work on that! And my apologies, by static I more meant that in the documentation for plot_attr it states:

"If spatial coordinates are provided, the column(s) has to have names "x" and/or "y"."

Would it be best to change the parameter documentation to state that column name(s) may be provided within the coord_names variable? And then potentially update the coord_names parameter documentation with an example of correct formatting since the parameter will now default to NULL, i.e. no provided example?

caroliver commented 5 years ago

@dmcglinn Is there a sample dataset for where there is only one column for the coordinates? Or is there a way to use the current datasets to test when there is only one column name provided in the parameters?

dmcglinn commented 5 years ago

If you only supply one named coordinate column would that do it?

On Mar 17, 2019, at 10:01 AM, Caroline Oliver notifications@github.com wrote:

@dmcglinn Is there a sample dataset for where there is only one column for the coordinates? Or is there a way to use the current datasets to test when there is only one column name provided in the parameters?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

dmcglinn commented 5 years ago

Just to be clear, supplied to the make_mob_in function

caroliver commented 5 years ago

@dmcglinn Yes for the make_mob_in function! And which coordination would be best? For each of the sample data sets there is an x and y column. Should I supply column x or y when testing a single column entry or does it matter?

dmcglinn commented 5 years ago

Ideally the user would be able to specify either the the x or y coordinate, but if that isn't feasible or its too difficult then just stick with coding just x provided and make sure to add documentation that this is how it must be specified. You'll also want to implement a stop() command if only the y argument is provided and not the x.