Open nikosGeography opened 1 year ago
Dear Nikolaos,
I appreciate your interest in the package.
The function rf_spatial() creates spatial predictors, namely Moran's Eigenvector Maps, from the distance matrix between your training cases. These spatial predictors represent the spatial structure of the data. When included in a model, they help you understand to what extent the spatial distribution of your training cases is influencing your model outcomes. As such, they are mainly useful in explanatory models, and much less so in predictive ones.
Suppose you need to make predictions from a spatial model. In that case, you need to generate the same Moran's Eigenvector Maps for the training and prediction cases simultaneously (you can do that with the function mem_multithreshold() or with mem()), then separate your training and prediction cases in different data frames, train your model with the training cases, and finally predict over the prediction cases. This will work well if your data is regularly spaced in a grid, but it might not work at all if the data is irregular, because the spatial structures of the training and prediction data can be too different.
In your case, this is even harder because the training and the prediction data have different resolutions, and the spatial structures of both datasets will be rather different.
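The joint workflow described above (compute the eigenvectors once over training and prediction cases together, then split the rows) can be sketched as follows. This is a minimal illustration in plain Python, not the spatialRF implementation: `jacobi_eigh` and `joint_mem` are hypothetical helpers, the inverse-distance weighting is a simplifying assumption, and the two small point sets are made up for the example.

```python
import math

def jacobi_eigh(a, sweeps=100, tol=1e-30):
    """Eigen-decomposition of a small symmetric matrix via cyclic Jacobi rotations."""
    n = len(a)
    a = [row[:] for row in a]                                   # work on a copy
    v = [[float(i == j) for j in range(n)] for i in range(n)]   # accumulates eigenvectors
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):                              # rotate columns p, q of A and V
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):                              # rotate rows p, q of A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v                       # eigenvalues, eigenvectors (columns)

def joint_mem(train_xy, pred_xy):
    """MEM-style predictors computed once over training + prediction points, then split."""
    xy = train_xy + pred_xy                                     # one joint point set
    n = len(xy)
    # Assumption: inverse-distance weights, zero diagonal, no duplicated coordinates.
    w = [[0.0 if i == j else 1.0 / math.dist(xy[i], xy[j]) for j in range(n)] for i in range(n)]
    # Double-centre the weights: subtract row and column means, add the grand mean.
    row = [sum(r) / n for r in w]
    grand = sum(row) / n
    centred = [[w[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]
    values, vectors = jacobi_eigh(centred)
    # Keep eigenvectors with positive eigenvalues, largest eigenvalue first.
    keep = sorted((k for k in range(n) if values[k] > 1e-6), key=lambda k: -values[k])
    mem = [[vectors[i][k] for k in keep] for i in range(n)]
    return mem[:len(train_xy)], mem[len(train_xy):]

train_xy = [(x, y) for x in (0, 2, 4) for y in (0, 2, 4)]       # coarse grid (training cases)
pred_xy = [(x, y) for x in (1, 3) for y in (1, 3)]              # offset points (prediction cases)
train_mem, pred_mem = joint_mem(train_xy, pred_xy)
print(len(train_mem), len(pred_mem), len(train_mem[0]))         # rows per set, shared column count
```

Because both sets of rows come from the same decomposition, the training and prediction data end up with identical spatial predictor columns, which is what the prediction step needs.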
From my perspective, you have two options:
These things will be easier to do with the next version of the package, but I don't have a release date yet, so you are on your own in the meantime.
I'll be happy to answer any questions though!
Good luck with your analysis,
Blas
Dear @BlasBenito, thank you for your comment. My data are indeed satellite images, but I'm working with their centroids (i.e., points). This means that they are regularly spaced. Of course, as you said, their spatial structure is different, because I'm creating a model at a coarse spatial scale and I want to predict at a finer spatial scale.
If you don't mind, I would like to keep this issue open until I manage to predict at a finer spatial scale following the (helpful) steps you wrote.
I don't mind at all. It might be useful for other people anyway; it's not the first time this question has come up.
@BlasBenito I was wondering: if it's not computationally tractable to calculate a distance matrix for the entire predictive domain, do you think there is any utility in:
1) aggregating the data to a manageable resolution, calculating the Moran's Eigenvector Maps for this coarse-grain data with mem() or mem_multithreshold(), and converting them to raster;
2) using terra::extract() at the training locations and training a non-spatial model with the Moran's Eigenvector Maps as added covariates;
3) disaggregating the coarse-grain Eigenvector Maps to a finer resolution and adding them as covariates to the finer-grain data;
4) making predictions on this finer-resolution data.
Obviously, in this scenario the Moran's Eigenvector values aren't strictly accurate for each prediction location but represent a general value that applies to the aggregated cell.
Can provide a reprex if that would help.
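Step 3 of the proposal above can be illustrated with a small sketch. It assumes nearest-neighbour disaggregation, uses nested Python lists as a stand-in for raster layers, and `disaggregate` is a hypothetical helper; the eigenvector values are made up purely for illustration.

```python
def disaggregate(coarse, factor):
    """Nearest-neighbour disaggregation: each fine cell copies the coarse cell that contains it."""
    n_rows, n_cols = len(coarse) * factor, len(coarse[0]) * factor
    return [[coarse[r // factor][c // factor] for c in range(n_cols)] for r in range(n_rows)]

# Hypothetical values of one eigenvector map on a 2 x 2 coarse grid.
coarse_mem = [[0.5, -0.5],
              [-0.25, 0.25]]

# Each coarse cell becomes a 2 x 2 block of identical values on the fine grid.
fine_mem = disaggregate(coarse_mem, 2)
for fine_row in fine_mem:
    print(fine_row)
```

Every fine cell inside a coarse cell receives the same value, so the covariate carries no spatial structure finer than the aggregated resolution, which is the caveat noted above.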
I have one response variable and 4 predictors and I am performing a spatial random forest regression at a coarse spatial scale. My goal is to take the model parameters and apply them to a finer spatial resolution in order to predict the response variable at the finer spatial scale.
When I run
I am getting this error:
Error in predict.ranger.forest(forest, data, predict.all, num.trees, type, : Error: One or more independent variables not found in data.
This error occurs because, when I run the spatial version of RF, two extra (spatial) predictors are created, and they are not present in the data I am predicting on.
Given these two extra predictors, how can I use the spatial model I created at the coarse scale to predict the response variable at a finer spatial scale?
Here is the code:
You can download a small sample of the data from here.