Closed cwberry77 closed 8 months ago
Hi @cwberry77, no problem.
I can't say much without seeing your cross-validation code!
Check whether `length(testSet)` and `nrow(test_df_subset)` are the same. If they are not, you are probably making a mistake when filtering the original dataframe down to the test dataframe.
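A minimal sketch of that check, using made-up stand-in data rather than the code from this issue: `test_df`, `fake_predictions` and the values in them are hypothetical, and `rep(0.5, ...)` only stands in for the output of `predict()`. It shows how dropping NA rows after subsetting leaves fewer rows than test indices, and how assigning the shorter result back triggers the "not a multiple of replacement length" warning.

```r
# Hypothetical stand-ins: 5 test indices, but one row has an NA covariate.
testSet <- c(1, 2, 3, 4, 5)
test_df <- data.frame(x = c(0.2, NA, 0.5, 0.9, 0.4),
                      y = c(1, 0, 1, 0, 1))

# Removing NAs *after* subsetting drops a row, so indices and rows disagree.
test_df_subset <- na.omit(test_df[testSet, ])

length(testSet)        # 5
nrow(test_df_subset)   # 4

same_length <- length(testSet) == nrow(test_df_subset)
if (!same_length) {
  message("Mismatch: ", length(testSet), " indices vs ",
          nrow(test_df_subset), " rows")
}

# Assigning 4 values into 5 slots is what raises the warning.
preds <- rep(NA_real_, nrow(test_df))
fake_predictions <- rep(0.5, nrow(test_df_subset))  # stands in for predict()
warning_text <- tryCatch(
  {
    preds[testSet] <- fake_predictions
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
print(warning_text)
```

If the two numbers differ in your own loop, the rows were most likely filtered after the fold indices were created.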
Hi @rvalavi,
Thanks very much for the prompt reply!
The lengths of the testSet and test_df_subset are different following the removal of the NA values which pop up in the subset. I've added my code below.
I have also attached a subset of my data, if that helps.
Thanks again @rvalavi !
[Uploading BlockCV_Subset.zip…]()
It is not obvious how two of your objects are created:
1) `sf_cropped` is used for `cv_spatial`, but then `sf` from line 17 is used for `test_table`.
2) It is also not obvious where `train_df_NA` comes from.
I think the main issue is that you remove NAs during cross-validation, in lines 71 and 72. This should be done before creating the folds with `cv_spatial`.
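As a rough sketch of that ordering, here is a base R example on made-up data. The random fold assignment is only a stand-in for `cv_spatial`, and `lm` is a stand-in for the rf model. The names `dat`, `dat_clean`, `fold_id` and the train/test list layout of `folds_list` are assumptions for illustration, not taken from the code in this issue. The point is only that rows with NAs are dropped once, before any folds exist, and that the same cleaned table is used for both fold creation and prediction.

```r
set.seed(1)

# Hypothetical data: a covariate with a few NAs and a continuous response.
x <- c(rnorm(17), NA, NA, NA)
dat <- data.frame(x = x, y = 2 * x + rnorm(20, sd = 0.3))

# 1. Drop incomplete rows first and renumber, so row positions are 1..n.
dat_clean <- dat[complete.cases(dat), ]
rownames(dat_clean) <- NULL

# 2. Build folds on the cleaned table only.
#    Random assignment here is a stand-in for cv_spatial(); each element
#    holds train indices first and test indices second (an assumed layout).
k <- 4
fold_id <- sample(rep(seq_len(k), length.out = nrow(dat_clean)))
folds_list <- lapply(seq_len(k), function(i) {
  list(which(fold_id != i), which(fold_id == i))
})

# 3. Cross-validate using the SAME table the folds were built on.
preds <- rep(NA_real_, nrow(dat_clean))
for (i in seq_len(k)) {
  trainSet <- folds_list[[i]][[1]]
  testSet  <- folds_list[[i]][[2]]

  fit <- lm(y ~ x, data = dat_clean[trainSet, ])  # stand-in for the rf model
  test_df_subset <- dat_clean[testSet, ]

  # No rows are dropped inside the loop, so the lengths must agree.
  stopifnot(length(testSet) == nrow(test_df_subset))
  preds[testSet] <- predict(fit, newdata = test_df_subset)
}

anyNA(preds)  # FALSE: every row received exactly one prediction
```

With spatial data, the same idea would mean filtering the point object once and passing that filtered object both to `cv_spatial` and to the table that stores predictions.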
Hi @rvalavi,
I have added updated code for clarity.
I have tried removing NAs from the rasters before running `cv_spatial`, but I still seem to be encountering the same issue.
I'm closing this as the problem was solved over email and the issue was not about the blockCV package.
Hi there,
I'll preface this issue by saying that I'm relatively new to R, so I'm fairly sure I'm making an amateur mistake.
I'm trying to use `cv_spatial` to cross-validate my rf model. When the fold indices are applied to my data frame, I end up with fewer entries than in the initial trainSet or testSet. It seems a number of NA values are appearing in this subset of my original data frame. I'm not sure where these are coming from, as my data frame contains no NA values (these have previously been omitted). This gives me the following persistent warning:
Warning messages:
1: In test_table$preds[testSet] <- predict(rf_model, newdata = test_df_subset, :
  number of items to replace is not a multiple of replacement length
2: In test_table$preds[testSet] <- predict(rf_model, newdata = test_df_subset, :
  number of items to replace is not a multiple of replacement length
I would greatly appreciate any guidance you could offer on this issue. As I said I'm fairly sure I'm doing something fundamentally wrong.
Thanks,
Chris