Closed agnes-duhamet closed 1 year ago
Hi Agnès,
Generally, if you scale the variables for prediction in the same way (i.e. with the same values) as the variables you used for fitting, you should have no problem. One way to do this is to save the sd() and mean() values of the training data, scale by hand, and apply the same values to the test data. Alternatively, the mean and sd that were applied are stored as attributes ("scaled:center" and "scaled:scale") on the output of the scale() function in R.
It is not clear to me whether you do this; it sounds a bit as if you are scaling the training and prediction data separately.
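As an illustration of the two options described above, here is a minimal base R sketch. The data frames `train` and `grid` and their `depth` values are made up for the example; they are not the data from this issue.

```r
# Hypothetical toy data: depth in the training data and in the prediction grid.
train <- data.frame(depth = c(5, 10, 20, 40, 80))
grid  <- data.frame(depth = c(10, 30, 60))

# Option 1: store the training mean and sd, then scale both data sets by hand.
depth_mean <- mean(train$depth)
depth_sd   <- sd(train$depth)
train_scaled_manual <- (train$depth - depth_mean) / depth_sd
grid_scaled_manual  <- (grid$depth  - depth_mean) / depth_sd

# Option 2: let scale() do the work on the training data and reuse the
# centering and scaling values it stores as attributes.
sc <- scale(train$depth)
center <- attr(sc, "scaled:center")
spread <- attr(sc, "scaled:scale")
grid_scaled_attr <- as.numeric(scale(grid$depth, center = center, scale = spread))

# Both options should give the same scaled grid values.
same_result <- isTRUE(all.equal(grid_scaled_manual, grid_scaled_attr))
```

In both cases the grid is transformed with the training statistics, so a given raw depth maps to the same scaled value in both data sets.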
That, however, doesn't explain to me why you get this error - maybe Max can help there!
Thanks. I tried merging the data frame containing the training data with the data frame containing the data for predictions. I then applied the scale function to the environmental variables and spatial coordinates, so everything was scaled at the same time. I then separated the two data frames (training and prediction) and tried to run the model, but I get this error: Error : torch._C._LinAlgError: linalg.inv: The diagonal element 3 is zero, the inversion could not be completed because the input matrix is singular
Is scaling all the variables at the same time a correct approach?
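The thread does not establish what caused the singular-matrix error. One thing that can be checked in such a situation is whether a design matrix is rank-deficient, for example because a factor keeps a level that no longer occurs after the merged data frame is split. The sketch below is a generic base R diagnostic under that assumption; the toy data and the unused "reef" level are hypothetical and it does not reproduce the sjSDM internals.

```r
# Hypothetical training subset: the factor still carries a level ("reef")
# that only occurred in the prediction rows of the merged data frame.
train <- data.frame(
  depth   = c(5, 10, 20, 40, 80),
  habitat = factor(c("sand", "rock", "sand", "rock", "sand"),
                   levels = c("sand", "rock", "reef"))
)

# Build the design matrix; the unused level yields an all-zero column.
X <- model.matrix(~ depth + habitat, data = train)

# Columns without any variation (the intercept is expected to be listed too).
zero_var <- colnames(X)[apply(X, 2, var) == 0]

# A rank below the number of columns means the matrix is rank-deficient.
rank_X <- qr(X)$rank

# After dropping unused levels the design matrix has full column rank.
X_clean    <- model.matrix(~ depth + habitat, data = droplevels(train))
rank_clean <- qr(X_clean)$rank
```

If `rank_X` is smaller than `ncol(X)`, inspecting `zero_var` shows which columns carry no information in the training subset.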
I saved the sd() and mean() values from the training data, scaled by hand, and applied the same values to the test data as you recommended, and it works. Thanks. Agnès
Hi,
OK, odd, it sounds to me as if you were doing it right in the first place, so there was probably some syntax error. Anyway, if it works, it works, so I will close this.
Best F
Hi Maximilian, I would like to model marine fish species distribution as a function of distance to the coast, depth, and marine habitat, taking into account spatial autocorrelation and species interactions. I wrote the following code:

    env_var_scaled = env_var %>%
      mutate(dist_land = scale(dist_land),
             depth = scale(depth),
             habitat_principal = droplevels(habitat_principal))

    model <- sjSDM(Y = Occ,
                   env = linear(data = env_var_scaled, formula = ~dist_land+depth+habitat_principal),
                   spatial = linear(data = SP %>% scale, formula = ~0+longitude_start_DD:latitude_start_DD),
                   se = TRUE, family=binomial("probit"), sampling = 100L)

I would now like to predict fish species occurrence as a function of the environmental variables. I have a spatial grid in which each cell has a value for every environmental variable, and I would like to know the occurrence probability of each species in each cell. How can I do this?
p<- predict(model, newdata = env_var_scaled_grid, SP = SP_grid)
The problem is that I scaled the variables for the model and then scaled the variables for the grid cells separately. I think this is a problem because the same raw value, for example a depth of 10 meters, will not map to the same scaled value in env_var_scaled and env_var_scaled_grid. I tried to scale all the variables at the same time (those of the data I use to build the model and those of the grid), but in that case I get this error: Error : torch._C._LinAlgError: linalg.inv: The diagonal element 3 is zero, the inversion could not be completed because the input matrix is singular.
How can I predict species occurrence without running into problems with scaling? Thanks in advance, Agnès
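The concern raised in this question can be made concrete with a small base R sketch. The depth values are invented for illustration; the point is only that scaling two data sets separately maps the same raw depth (10 m here) to different scaled values, while reusing the training mean and sd does not.

```r
# Hypothetical depths (in meters) for the training data and the prediction grid.
train_depth <- c(5, 10, 20, 40, 80)
grid_depth  <- c(10, 100, 200, 400)

# Separate scaling: each data set is centered and scaled with its own statistics.
train_sep <- as.numeric(scale(train_depth))
grid_sep  <- as.numeric(scale(grid_depth))

# 10 m is the 2nd training value and the 1st grid value.
ten_m_train <- train_sep[2]
ten_m_grid  <- grid_sep[1]

# Consistent scaling: the grid is transformed with the training mean and sd.
grid_ok  <- (grid_depth - mean(train_depth)) / sd(train_depth)
ten_m_ok <- grid_ok[1]
```

Here `ten_m_train` and `ten_m_grid` differ, whereas `ten_m_ok` equals `ten_m_train`, which is what a fitted model expects when it is given new data.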