BiologicalRecordsCentre / sparta

Species Presence/Absence R Trends Analyses
http://biologicalrecordscentre.github.io/sparta/index.html
MIT License

Weights file creation #5

Closed BiologicalRecordsCentre closed 11 years ago

BiologicalRecordsCentre commented 11 years ago

Create a function that generates a weights file from data

drnickisaac commented 11 years ago

We should really have a function to do this, based on 1 or 2 data layers (coordinates and/or landcover and/or other training dataset).

Agnieszka-Doroszuk commented 11 years ago

I've just learned that Rienk-Jan Bijlsma from Alterra (Wageningen, NL) used frescalo last year for a data set on mosses. He used abiotic factors and distances to create weights. I think I would still prefer to use landcover, as it seems to me a better proxy for habitat type.

Agnieszka-Doroszuk commented 11 years ago

Weights files have been simplified recently (3 columns instead of 6). Is this related to giving up the selection of the 200 most proximate hectads and the 100 most similar hectads? Is a neighbourhood still composed of 100 hectads?

AugustT commented 11 years ago

Yes, in the original weights files, as created by Mark Hill's .exe files, there are 6 or so columns, but only three are used by frescalo. I therefore simplified the requirements in sparta to only the three needed columns.

AugustT commented 11 years ago

Neighbourhoods are still 100 hectads, but when I write the function to generate a weights file I will add this as a variable so that people have the flexibility to change this number if needed.
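To make the two-stage selection discussed above concrete, here is a minimal sketch of choosing a neighbourhood for a single focal site. It is not the sparta implementation: `pick_neighbourhood()`, its arguments `n_dist` and `n_sim`, and the toy data are all hypothetical, and the defaults of 200 and 100 simply mirror the numbers mentioned in this thread.

```r
# Hypothetical sketch, not sparta code: choose a neighbourhood for one
# focal site from two named vectors (distance and attribute dissimilarity
# of every other site relative to the focal site).
pick_neighbourhood <- function(distance, dissimilarity,
                               n_dist = 200, n_sim = 100) {
  stopifnot(length(distance) == length(dissimilarity),
            identical(names(distance), names(dissimilarity)))
  # Step 1: keep the n_dist nearest sites (or all sites if there are fewer)
  nearest <- names(sort(distance))[seq_len(min(n_dist, length(distance)))]
  # Step 2: of those, keep the n_sim sites with the smallest dissimilarity
  ranked <- sort(dissimilarity[nearest])
  names(ranked)[seq_len(min(n_sim, length(ranked)))]
}

# Toy data: five candidate sites around one focal site
distance      <- c(s1 = 10, s2 = 20, s3 = 30, s4 = 40, s5 = 50)
dissimilarity <- c(s1 = 0.9, s2 = 0.1, s3 = 0.5, s4 = 0.2, s5 = 0.0)

hood <- pick_neighbourhood(distance, dissimilarity, n_dist = 4, n_sim = 2)
print(hood)
```

In the toy data, s5 is the most similar site but is dropped in the first step because it is not among the four nearest, which is the point of filtering on distance before similarity.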

AugustT commented 11 years ago

I have written a function that will generate a weights file dataframe for use in frescalo() or sparta(). It is called create_weights() and I have just pushed this to GitHub. Install the latest version of sparta from GitHub (using the instructions on the front page), and then use ?create_weights to read the help file. I have included an example that (hopefully!) makes clear what it does and how it works. Any questions/bugs, please let me know. If you are interested in the inner workings of the function, navigate to the R folder on the front page and click on 'create_weights.r' [this goes for any function in the package].

Agnieszka-Doroszuk commented 11 years ago

It is a valuable addition to the frescalo package. I hope I can use it next week, when I have the landcover data for NL. Would it be better to use percentages of landcover type per site or their areas as the “landcover attribute”?

In the long run, it would be great to have a built-in function to calculate distances from the spatial coordinates of sites (instead of calculating them beforehand to include in the dist dataframe).

AugustT commented 11 years ago

I had considered the option of doing the distance calculation within the function as you suggest, with the user supplying the spatial co-ordinates. The reason I did not do that is that there are a number of ways one can calculate the distance between two points on the Earth's surface, and many different co-ordinate systems, which makes writing a function that works for everyone difficult. Also, doing the calculation oneself is not too difficult. Perhaps it would be a good idea if I wrote a simple function that works with X-Y data, giving the Euclidean distance [effectively a wrapper for dist()?].
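As a rough illustration of what such a wrapper might look like, here is a minimal sketch built on base R's dist(). The function name `xy_to_dist()` and the output column names `site1`, `site2` and `distance` are assumptions made for this example, not part of sparta; check ?create_weights for the layout the dist dataframe actually needs. It is only meaningful for projected X-Y coordinates (e.g. eastings and northings), not for latitude/longitude.

```r
# Hypothetical helper, not part of sparta: pairwise Euclidean distances
# from X-Y coordinates, returned in long format (one row per ordered pair).
xy_to_dist <- function(sites, x, y) {
  stopifnot(length(sites) == length(x), length(x) == length(y))
  # dist() returns the lower triangle only; as.matrix() expands it to a
  # full symmetric matrix with zeros on the diagonal
  d <- as.matrix(dist(cbind(x, y), method = "euclidean"))
  dimnames(d) <- list(sites, sites)
  # Unroll the matrix into one row per (site1, site2) combination
  out <- as.data.frame(as.table(d), stringsAsFactors = FALSE)
  names(out) <- c("site1", "site2", "distance")  # assumed column names
  # Drop each site's zero distance to itself
  out <- out[out$site1 != out$site2, ]
  rownames(out) <- NULL
  out
}

# Toy coordinates for three sites
coords <- data.frame(site = c("A", "B", "C"),
                     x = c(0, 3, 0),
                     y = c(0, 4, 1),
                     stringsAsFactors = FALSE)

pairs <- xy_to_dist(coords$site, coords$x, coords$y)
print(pairs)
```

Both directions of each pair (A to B and B to A) are kept, so three sites give six rows; filter on `site1 < site2` if only one direction is wanted.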

The function will work with either proportions or areas; however, you should use proportions. Imagine two identical cells next to one another, one of which is half in the sea. If you used areas these would look very different, whereas using proportions they are the same (which is what we want).
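The half-in-the-sea case can be checked with a few lines of R. The land cover values below are made up, and Euclidean distance is used only as a stand-in for whatever similarity measure create_weights() applies internally.

```r
# Made-up land cover areas (km^2) for two cells with the same habitat mix.
# cell_B is half in the sea, so it has only half as much land.
areas <- rbind(cell_A = c(woodland = 40, grassland = 60),
               cell_B = c(woodland = 20, grassland = 30))

# Convert each row to proportions of that cell's total land area
props <- prop.table(areas, margin = 1)

# Distance between the two cells in attribute space (stand-in measure)
dist_area <- as.numeric(dist(areas))
dist_prop <- as.numeric(dist(props))

print(dist_area)  # clearly non-zero: the cells look different by area
print(dist_prop)  # zero: the cells are identical by proportion
```

If your land cover data are areas, `prop.table(x, margin = 1)` on a sites-by-classes matrix is one simple way to convert them before building the attribute table.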