GotelliLab / EcoSimR

Repository for EcoSimR, by Gotelli, N.J., Hart, E.M. and A.M. Ellison. 2014. EcoSimR 0.1.0
http://ecosimr.org

Error in `reproduce_model` examples documentation #65

Closed ngotelli closed 9 years ago

ngotelli commented 9 years ago

Hi @emhart

There is a small error in the documentation for the reproduce_model examples code. We have:

## Not run: 
finchMod <- cooc_null_model(dataWiFinches, algo="sim1",saveSeed=T)
## Check model output
mean(finchMod$Sim)

reproduce_model(finchMod$Sim)

finchMod <- cooc_null_model(dataWiFinches, algo="sim1")
## Check model output is the same as before
mean(finchMod$Sim)
reproduce_model(finchMod$Sim)

## End(Not run)

However, in two of the lines, the correct function call should be:

reproduce_model(finchMod)

Since you have already submitted to CRAN, I wasn't sure if it was safe to push any changes to the repo now.

emhart commented 9 years ago

I need to upload it again because I included a '.' where I shouldn't have in the DESCRIPTION file, so you can push the change.

ngotelli commented 9 years ago

OK, I will push it up. Meanwhile, I am having trouble creating a simple example of a user-defined null model that calls null_model_engine. Here is the code:

##################
# Create your own null model
# Simple test for fit of data to a Poisson

# vector set up as a data frame for null_model_engine
MyData <- data.frame(c(0,0,0,1,2,50))
names(MyData) <- "N"

# Calculate the variance to mean ratio of the data
# For a true Poisson, this should be ~ 1.0
MyMetric <- function(x=runif(10)){
  VarMeanRatio <- var(x)/mean(x)
  return(VarMeanRatio)
}

# Take a data vector
# Calculate its mean
# Treat that as lambda
# Simulate a data set of the same size
MyAlgo <- function(x=runif(10)){
  lambda <- mean(x)
  sim <- rpois(length(x),lambda)
  return(sim)
}

# functions work on vectors, but this code throws an error:
MyModel <- null_model_engine(speciesData="MyData",algo="MyAlgo", metric="MyMetric", type=NULL)

Is this a problem because I am using a vector rather than a matrix? When this is straightened out, please add this code to the Examples section of the documentation for null_model_engine. We don't have any coding examples there, and it will be important to have a simple example like this one. Thanks!

emhart commented 9 years ago

@ngotelli There are actually two problems. 1) When you write your own functions, they need to follow the EcoSimR convention for the first argument: speciesData for the algorithm and m for the metric. 2) The speciesData parameter has to be passed as an object, not as a string, because it's a data frame, vector, etc. The algo and metric are functions, however, so the only way to pass them in is as strings. So this works fine:

MyData <- c(0,0,0,1,2,50)
# Calculate the variance to mean ratio of the data
# For a true Poisson, this should be ~ 1.0
MyMetric <- function(m=runif(10)){
  VarMeanRatio <- var(m)/mean(m)
  return(VarMeanRatio)
}

# Take a data vector
# Calculate its mean
# Treat that as lambda
# Simulate a data set of the same size
MyAlgo <- function(speciesData=runif(10)){
  lambda <- mean(speciesData)
  sim <- rpois(length(speciesData),lambda)
  return(sim)
}

# Functions are passed as strings; the data are passed as an object
MyModel <- null_model_engine(speciesData=MyData, algo="MyAlgo", metric="MyMetric", nReps=1000)

summary(MyModel)

plot(MyModel)

Inputting your data as a data frame will work too:

MyData <- data.frame(c(0,0,0,1,2,50))
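
For anyone following along without EcoSimR installed, the logic of this example can be approximated in base R. This is only a rough sketch, not what null_model_engine does internally: the replicate() loop, the seed, and the names obs, sim and p_upper are assumptions made for illustration.

```r
# Base-R sketch of the variance-to-mean null model (not EcoSimR code)
set.seed(42)  # arbitrary seed so the sketch is repeatable

MyData <- c(0, 0, 0, 1, 2, 50)

# Metric: variance-to-mean ratio (about 1 for Poisson data)
MyMetric <- function(m) var(m) / mean(m)

# Algorithm: simulate a Poisson sample of the same size,
# using the observed mean as lambda
MyAlgo <- function(speciesData) rpois(length(speciesData), mean(speciesData))

obs <- MyMetric(MyData)                           # observed metric, roughly 46 here
sim <- replicate(1000, MyMetric(MyAlgo(MyData)))  # metric for 1000 null data sets

# Upper-tail proportion: how often the null data are at least as
# overdispersed as the observed data
p_upper <- mean(sim >= obs)

obs
p_upper
```

The observed ratio sits far above the simulated values, which is what one would expect for data this clumped.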

ngotelli commented 9 years ago

Thanks; I had tried it before with no quotes on the data frame and still had troubles. I am sure with hard work and study I will eventually master this program! :sheep:

emhart commented 9 years ago

Do you think there's a way we could make it easier and more intuitive?

ngotelli commented 9 years ago

Hi @emhart The null_model_engine function itself seems fine. I think the way to make it usable is to have a couple of examples. Specifically, let's add:

1) The first example that I worked up, which has minimal code for a metric and an algorithm.
2) A second example showing how you would use a list to add in more parameters for a function. I will illustrate this with a model in which you draw individuals from a source pool. The source pool has an additional parameter of weights for each species to pass to the sample function.
3) One more example in which the user sets type=cooc and then uses a metric from another package. I am sure there is some kind of diversity index in vegan that would work. I think this third option could be very popular.

I will try to work more on this soon.

N.

ngotelli commented 9 years ago

Hi @emhart . Sorry to bother you again, but I am still having trouble getting null_model_engine to work for me. Here is a simple example that is calling in a list of algorithm options:

# Example #2
# Construct a source pool and a parameter for species weights
# Draw randomly from the source pool and count the number of species present
# This is just a poor man's rarefaction program

# Create the sourcepool of 26 alphabet species
MySourcePool <- paste("Species",LETTERS,sep="")

# Create an island assemblage with 64 individuals and 6 species:
MyData <- paste("Species",c(rep("A",50),rep("B",10),"C","D","E","F"),sep="")

# Create a vector of relative species colonization weights
MyWeights <- sort(rbeta(n=length(MySourcePool),shape1=0.5,shape2=0.5),decreasing=TRUE)

# "algo" function for null model algorithm
# Draw a random sample from the source pool, sampling with replacement and species weights 

MyAlgo <- function(speciesData=runif(10),weights=runif(100),sourcepool=runif(100)){
  NullAssemblage <- sample(x=sourcepool,size=length(speciesData),replace=TRUE,prob=weights)
  return(NullAssemblage)
}

# "metric" function for null model metric
# give the species count for the random sample
MyMetric <- function(m=LETTERS){
  SpeciesCount <- length(unique(m))
  return(SpeciesCount)
}

MyModel <- null_model_engine(speciesData=MyData,algo="MyAlgo",metric="MyMetric",algoOpts=list(weights=MyWeights,sourcepool=MySourcePool))
summary(MyModel)
plot(MyModel)

I have confirmed the behavior of the two functions (MyMetric and MyAlgo) and the proper structure and content of the 3 data vectors (MyData, MyWeights and MySourcePool). null_model_engine gives a complete run, but the simulated and observed data are always zero.

Thanks for your help!

Nick

emhart commented 9 years ago

@ngotelli I tracked this down to some error handling. If a data frame's first column is text, the software strips out the first column and then reclasses the data as numeric. This is to handle the case where a user inputs a matrix with species names in the first column and the remaining numeric values happen to be stored as text. In your case, all your data is text. It gets replaced with NAs, so you get 0.

This behaviour is meant to make data input seamless for the user. However, if we simply allowed text inputs, we would lose error handling in the case of a text matrix being entered. So I'm not sure of a way around this.
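
The effect can be reproduced in base R. This is only an illustration of text-to-numeric coercion, not the package's actual input-handling code; the variable names, the na.rm step, and the is_numeric_input helper are all made up for the sketch.

```r
# Character data like the island assemblage in the example above
text_data <- paste("Species", c(rep("A", 3), "B", "C"), sep = "")

# Forcing text to numeric yields NA for every element
# (R also emits a coercion warning, suppressed here)
coerced <- suppressWarnings(as.numeric(text_data))
all(is.na(coerced))

# One way NAs can end up as a count of zero: any tally that drops NAs
n_present <- sum(coerced > 0, na.rm = TRUE)
n_present

# Hypothetical guard a user could run before building a null model
is_numeric_input <- function(x) is.numeric(x) && !anyNA(x)
is_numeric_input(text_data)
is_numeric_input(c(50, 10, 1, 1, 1, 1))
```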

ngotelli commented 9 years ago

@emhart Ah, that's insidious, and it is the same error that tripped me up when I was testing co-occurrence a while back. To patch this, I will try adding row names to all of my vectors so that when they are stripped out I should still be left with my character vectors.

One possible solution (which does not have to be implemented now) would be to create a new function, custom_null_model, which is like null_model_engine but has none of the error trapping or other convenience functions. Advanced users could then create their own null models using this custom function (and be responsible for their own error checks). However, I don't know where all the error checks are located within EcoSimR, or whether it is possible to do this easily.

A related issue is that it seems counter-intuitive that new functions for algo and metric are required to take inputs of speciesData and m. With this restriction, I don't think it would be possible to use any built-in functions from other packages such as vegan. I don't know if it will be possible in the future to fix this, but it would be a good change to have.
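
One possible workaround, sketched here under the assumption that the engine only cares about the argument names speciesData and m, is a thin wrapper around the outside function. Note that external_richness below is a hypothetical stand-in for a function from another package, not a real vegan call.

```r
# Hypothetical stand-in for a metric from another package;
# its first argument is named x, not m
external_richness <- function(x) sum(x > 0)

# Thin wrapper that exposes the argument name the convention expects
MyMetric <- function(m) external_richness(x = m)

# The wrapper can now be called by name with an argument called m
richness <- do.call("MyMetric", list(m = c(50, 10, 0, 1, 0)))
richness
```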

For now, I see you are getting close to having EcoSimR up on CRAN, which is very exciting! Even without the bells and whistles for customized null models, the current package is an excellent contribution and a solid base to build from.

Enjoy your weekend,

N.

emhart commented 9 years ago

@ngotelli actually, it will probably still not work because the data you're passing in is text. No matter what it will try and convert it to a number because it's expecting numeric inputs, so you'll probably get a new error. I didn't realize we would accept text inputs.

Here I try to adapt what you're doing, but with counts. I'm not sure if I capture the spirit of what your example was trying to do...


MyData <- table(paste("Species",c(rep("A",50),rep("B",10),"C","D","E","F"),sep=""))

MyWeights <- sort(rbeta(n=length(MyData),shape1=0.5,shape2=0.5),decreasing=TRUE)

MyAlgo <- function(speciesData,weights) {
  NullAssemblage <- rmultinom(1,size=sum(speciesData),prob=weights)
  return(NullAssemblage)
}

MyMetric <- function(m){
  return(sum(m > 0))
}

MyModel <- null_model_engine(speciesData=MyData,algo="MyAlgo",metric="MyMetric",algoOpts=list(weights=MyWeights))
summary(MyModel)
plot(MyModel)

Regarding the parameter names, the reason I do this is because the null model engine uses do.call() to handle options. I needed to standardize those options for all the functions and the way I did this was just through a parameter naming convention. I've opened an issue on this, #66. I need to do more testing, but we could release an upgrade in a few months that is more flexible and has more intuitive error handling.
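
To make the do.call() point concrete, here is a minimal sketch of the pattern. It is not the engine's real code: call_algo, BadAlgo, and the weights are invented for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Algorithm following the naming convention: first argument is speciesData
MyAlgo <- function(speciesData, weights) {
  as.vector(rmultinom(1, size = sum(speciesData), prob = weights))
}

# Hypothetical helper: look the function up by name and pass the data
# plus any extra options as named arguments
call_algo <- function(algo, speciesData, algoOpts = list()) {
  do.call(algo, c(list(speciesData = speciesData), algoOpts))
}

counts <- c(A = 50, B = 10, C = 1, D = 1, E = 1, F = 1)
sim <- call_algo("MyAlgo", counts,
                 list(weights = c(0.5, 0.2, 0.1, 0.1, 0.05, 0.05)))
sum(sim)  # same number of individuals as the input counts

# A function whose first argument has another name fails, because
# speciesData is matched by name rather than by position
BadAlgo <- function(x) rev(x)
res <- tryCatch(call_algo("BadAlgo", counts), error = function(e) e)
inherits(res, "error")
```

Because the arguments are matched by name, every algorithm has to agree on what the data argument is called, which is where the convention comes from.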

ngotelli commented 9 years ago

Hi @emhart Yes, your code does the same thing with numeric variables. I think it is fine to restrict the EcoSimR format to numeric to avoid these kinds of problems; I just wasn't thinking about it when I started coding. Eventually, I'd like to get both of these (corrected) examples included with the documentation for null_model_engine. But you are in the midst of uploading to CRAN, so I won't make any edits for now. I will certainly use these examples for the course in Switzerland, though. Thanks!

We can revisit the general structure of the user-defined null models some time later.

Best,

Nick